Databricks Certified Associate Developer for Apache Spark 3.0 Exam

Last Update 1 day ago
Total Questions : 180

Question # 1

Which of the following code blocks returns about 150 randomly selected rows from the 1000-row DataFrame transactionsDf, assuming that any row can appear more than once in the returned DataFrame?

Options:

A.  

transactionsDf.resample(0.15, False, 3142)

B.  

transactionsDf.sample(0.15, False, 3142)

C.  

transactionsDf.sample(0.15)

D.  

transactionsDf.sample(0.85, 8429)

E.  

transactionsDf.sample(True, 0.15, 8261)
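
The reason a fraction of 0.15 over 1000 rows gives about 150 rows, rather than exactly 150, can be sketched in plain Python. This is an illustration only, not Spark's implementation: it assumes each row is emitted a Poisson-distributed number of times with a mean equal to the fraction, and the names poisson_draw and sample_with_replacement are hypothetical. In PySpark itself, the method signature is DataFrame.sample(withReplacement=None, fraction=None, seed=None).

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson-distributed integer using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def sample_with_replacement(rows, fraction, seed):
    """Repeat each row a Poisson(fraction) number of times (hypothetical helper)."""
    rng = random.Random(seed)
    sampled = []
    for row in rows:
        sampled.extend([row] * poisson_draw(rng, fraction))
    return sampled


rows = list(range(1000))  # stand-in for the 1000 rows of transactionsDf
sampled = sample_with_replacement(rows, 0.15, seed=8261)
print(len(sampled))  # close to 150; the exact count depends on the seed
```

Because each row is drawn independently, a row can show up zero, one, or several times, and the total only approximates fraction times the row count.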

Question # 2

The code block displayed below contains one or more errors. The code block should load parquet files at location filePath into a DataFrame, only loading those files that have been modified before 2029-03-20 05:44:46. Spark should enforce a schema according to the schema shown below. Find the error.

Schema:

root
 |-- itemId: integer (nullable = true)
 |-- attributes: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- supplier: string (nullable = true)

Code block:

schema = StructType([
    StructType("itemId", IntegerType(), True),
    StructType("attributes", ArrayType(StringType(), True), True),
    StructType("supplier", StringType(), True)
])

spark.read.options("modifiedBefore", "2029-03-20T05:44:46").schema(schema).load(filePath)

Options:

A.  

The attributes array is specified incorrectly, Spark cannot identify the file format, and the syntax of the call to Spark's DataFrameReader is incorrect.

B.  

Columns in the schema definition use the wrong object type and the syntax of the call to Spark's DataFrameReader is incorrect.

C.  

The data type of the schema is incompatible with the schema() operator and the modification date threshold is specified incorrectly.

D.  

Columns in the schema definition use the wrong object type, the modification date threshold is specified incorrectly, and Spark cannot identify the file format.

E.  

Columns in the schema are unable to handle empty values and the modification date threshold is specified incorrectly.

Question # 3

Which of the following code blocks reads in the JSON file stored at filePath as a DataFrame?

Options:

A.  

spark.read.json(filePath)

B.  

spark.read.path(filePath, source="json")

C.  

spark.read().path(filePath)

D.  

spark.read().json(filePath)

E.  

spark.read.path(filePath)

Question # 4

Which of the following code blocks returns a new DataFrame with the same columns as DataFrame transactionsDf, except for columns predError and value which should be removed?

Options:

A.  

transactionsDf.drop(["predError", "value"])

B.  

transactionsDf.drop("predError", "value")

C.  

transactionsDf.drop(col("predError"), col("value"))

D.  

transactionsDf.drop(predError, value)

E.  

transactionsDf.drop("predError & value")

Question # 5

Which of the following code blocks sorts DataFrame transactionsDf both by column storeId in ascending and by column productId in descending order, in this priority?

Options:

A.  

transactionsDf.sort("storeId", asc("productId"))

B.  

transactionsDf.sort(col(storeId)).desc(col(productId))

C.  

transactionsDf.order_by(col(storeId), desc(col(productId)))

D.  

transactionsDf.sort("storeId", desc("productId"))

E.  

transactionsDf.sort("storeId").sort(desc("productId"))

Question # 6

The code block shown below should convert up to 5 rows in DataFrame transactionsDf that have the value 25 in column storeId into a Python list. Choose the answer that correctly fills the blanks in the code block to accomplish this.

Code block:

transactionsDf.__1__(__2__).__3__(__4__)

Options:

A.  

1. filter

2. "storeId"==25

3. collect

4. 5

B.  

1. filter

2. col("storeId")==25

3. toLocalIterator

4. 5

C.  

1. select

2. storeId==25

3. head

4. 5

D.  

1. filter

2. col("storeId")==25

3. take

4. 5

E.  

1. filter

2. col("storeId")==25

3. collect

4. 5

Question # 7

The code block displayed below contains an error. The code block should read the csv file located at path data/transactions.csv into DataFrame transactionsDf, using the first row as column header and casting the columns in the most appropriate type. Find the error.

First 3 rows of transactions.csv:

transactionId;storeId;productId;name
1;23;12;green grass
2;35;31;yellow sun
3;23;12;green grass

Code block:

transactionsDf = spark.read.load("data/transactions.csv", sep=";", format="csv", header=True)

Options:

A.  

The DataFrameReader is not accessed correctly.

B.  

The transaction is evaluated lazily, so no file will be read.

C.  

Spark is unable to understand the file type.

D.  

The code block is unable to capture all columns.

E.  

The resulting DataFrame will not have the appropriate schema.
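
To see why the schema matters here, the sample rows can be parsed with Python's standard csv module as a rough analogue. Without any type inference every field comes back as a string; PySpark's CSV reader likewise reads all columns as strings unless inferSchema=True or an explicit schema is supplied. The helper infer_value below is a hypothetical, deliberately naive stand-in for schema inference.

```python
import csv
import io

raw = """transactionId;storeId;productId;name
1;23;12;green grass
2;35;31;yellow sun
3;23;12;green grass
"""


def infer_value(text):
    """Naive inference: treat all-digit fields as integers, keep the rest as strings."""
    return int(text) if text.isdigit() else text


# The first row supplies the column names; ";" is the field separator.
reader = csv.DictReader(io.StringIO(raw), delimiter=";")
untyped = list(reader)
typed = [{key: infer_value(value) for key, value in row.items()} for row in untyped]

print(untyped[0])  # every value is a string
print(typed[0])    # numeric-looking columns converted to int
```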

Question # 8

The code block shown below should add a column itemNameBetweenSeparators to DataFrame itemsDf. The column should contain arrays of at most 4 strings. The arrays should be composed of the values in column itemName, split at - or whitespace characters. Choose the answer that correctly fills the blanks in the code block to accomplish this.

Sample of DataFrame itemsDf:

+------+----------------------------------+-------------------+
|itemId|itemName                          |supplier           |
+------+----------------------------------+-------------------+
|1     |Thick Coat for Walking in the Snow|Sports Company Inc.|
|2     |Elegant Outdoors Summer Dress     |YetiX              |
|3     |Outdoors Backpack                 |Sports Company Inc.|
+------+----------------------------------+-------------------+

Code block:

itemsDf.__1__(__2__, __3__(__4__, "[\s\-]", __5__))

Options:

A.  

1. withColumn

2. "itemNameBetweenSeparators"

3. split

4. "itemName"

5. 4

(Correct)

B.  

1. withColumnRenamed

2. "itemNameBetweenSeparators"

3. split

4. "itemName"

5. 4

C.  

1. withColumnRenamed

2. "itemName"

3. split

4. "itemNameBetweenSeparators"

5. 4

D.  

1. withColumn

2. "itemNameBetweenSeparators"

3. split

4. "itemName"

5. 5

E.  

1. withColumn

2. itemNameBetweenSeparators

3. str_split

4. "itemName"

5. 5
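
The effect of a limit on the number of pieces can be approximated in plain Python with re.split. This is an analogue, not PySpark: Python's maxsplit counts splits rather than resulting pieces, so a limit of 4 pieces corresponds to maxsplit=3. The helper name split_with_limit is hypothetical.

```python
import re


def split_with_limit(text, pattern, limit):
    """Split text on pattern into at most `limit` pieces (limit must be positive)."""
    return re.split(pattern, text, maxsplit=limit - 1)


item_names = [
    "Thick Coat for Walking in the Snow",
    "Elegant Outdoors Summer Dress",
    "Outdoors Backpack",
]

for name in item_names:
    print(split_with_limit(name, r"[\s\-]", 4))
```

For the first item name, everything after the third separator stays together in the last element, so the array never grows beyond 4 strings.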

Question # 9

Which of the following describes Spark's way of managing memory?

Options:

A.  

Spark uses a subset of the reserved system memory.

B.  

Storage memory is used for caching partitions derived from DataFrames.

C.  

As a general rule for garbage collection, Spark performs better on many small objects than few big objects.

D.  

Disabling serialization potentially greatly reduces the memory footprint of a Spark application.

E.  

Spark's memory usage can be divided into three categories: Execution, transaction, and storage.

Question # 10

Which of the following statements about Spark's configuration properties is incorrect?

Options:

A.  

The maximum number of tasks that an executor can process at the same time is controlled by the spark.task.cpus property.

B.  

The maximum number of tasks that an executor can process at the same time is controlled by the spark.executor.cores property.

C.  

The default value for spark.sql.autoBroadcastJoinThreshold is 10MB.

D.  

The default number of partitions to use when shuffling data for joins or aggregations is 300.

E.  

The default number of partitions returned from certain transformations can be controlled by the spark.default.parallelism property.
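
How the two task-related properties interact can be sketched with a little arithmetic. The snippet assumes the commonly cited relationship that an executor runs at most spark.executor.cores divided by spark.task.cpus tasks at once; the function name max_concurrent_tasks and the example values are hypothetical, not read from a real cluster.

```python
def max_concurrent_tasks(executor_cores, task_cpus=1):
    """Task slots per executor: cores available divided by cores each task reserves."""
    if executor_cores < 1 or task_cpus < 1:
        raise ValueError("both settings must be positive integers")
    return executor_cores // task_cpus


print(max_concurrent_tasks(executor_cores=4))               # 4
print(max_concurrent_tasks(executor_cores=4, task_cpus=2))  # 2
```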
