HOTSPOT -
You develop a dataset named DBTBL1 by using Azure Databricks. DBTBL1 contains the following columns:
- SensorTypeID
- GeographyRegionID
- Year
- Month
- Day
- Hour
- Minute
- Temperature
- WindSpeed
- Other

You need to store the data to support daily incremental load pipelines that vary for each GeographyRegionID. The solution must minimize storage costs.

How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Suggested Answer:

Box 1: .partitionBy
partitionBy() writes the data into a separate directory per distinct value of the partition columns, so each daily, per-region load touches only its own partitions.
Incorrect options for Box 1:
- .format: format() selects the output data source ("parquet", "csv", "txt", "json", "jdbc", "orc", "avro", etc.); it does not control how the data is laid out for incremental loads.
- .bucketBy: bucketBy(numBuckets, col, col, ..., colN) takes the number of buckets and the names of the columns to bucket by, using Hive's bucketing scheme on a filesystem; it spreads rows across a fixed number of buckets rather than isolating daily, per-region slices.

Box 2: ("Year", "Month", "Day", "GeographyRegionID")
Specify the columns on which to partition: the date columns first, followed by the GeographyRegionID column. With this layout, each daily incremental load for a region writes only to that day's and region's partition directory, avoiding rewrites of unrelated data and minimizing storage costs.

Box 3: .saveAsTable("/DBTBL1")
saveAsTable() takes the name of the table to save to.

Reference:
https://www.oreilly.com/library/view/learning-spark-2nd/9781492050032/ch04.html
https://docs.microsoft.com/en-us/azure/databricks/delta/delta-batch

This question is from the DP-203 Data Engineering on Microsoft Azure exam, toward the Microsoft Certified: Azure Data Engineer Associate certification.
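For context, a minimal PySpark sketch of the completed write is shown below. The source path, the .format("delta") choice, and the .mode("append") call are assumptions for illustration (the question shows only the three answer boxes); note also that in Spark, saveAsTable() expects a table identifier such as "DBTBL1", even though the exam's answer area renders it as "/DBTBL1".

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Assumption: df holds the DBTBL1 data; the source path below is
# hypothetical and used only to make the sketch self-contained.
df = spark.read.format("parquet").load("/mnt/raw/dbtbl1")

(
    df.write
      .format("delta")                                        # assumed output format
      .partitionBy("Year", "Month", "Day", "GeographyRegionID")  # Boxes 1 and 2
      .mode("append")                                         # assumed: append each daily load
      .saveAsTable("DBTBL1")                                  # Box 3 (exam shows "/DBTBL1")
)
```

Partitioning by date first, then region, means a given day's load for one GeographyRegionID lands in a single partition directory such as Year=2024/Month=1/Day=15/GeographyRegionID=3, so the pipeline never rewrites other days or regions.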