You need to develop a pipeline for processing data. The pipeline must meet the following requirements: ✑ Scale up and down resources for cost reduction ✑ Use an in-memory data processing engine to speed up ETL and machine learning operations ✑ Use streaming capabilities ✑ Provide the ability to code in SQL, Python, Scala, and R ✑ Integrate workspace collaboration with Git. What should you use?

Admin Staff asked 4 months ago
You need to develop a pipeline for processing data. The pipeline must meet the following requirements:
✑ Scale up and down resources for cost reduction
✑ Use an in-memory data processing engine to speed up ETL and machine learning operations.
✑ Use streaming capabilities
✑ Provide the ability to code in SQL, Python, Scala, and R
✑ Integrate workspace collaboration with Git
What should you use?

A. HDInsight Spark Cluster

B. Azure Stream Analytics

C. HDInsight Hadoop Cluster

D. Azure SQL Data Warehouse

E. HDInsight Kafka Cluster

F. HDInsight Storm Cluster

Suggested Answer: A

Apache Spark is an open-source, parallel-processing framework that supports in-memory processing to boost the performance of big-data analysis applications. It also supports stream processing through Spark Streaming.
HDInsight is a managed Hadoop service. Use it to deploy and manage Hadoop clusters in Azure. For batch processing, you can use Spark, Hive, Hive LLAP, or MapReduce.
Languages: R, Python, Java, Scala, SQL
You can create an HDInsight Spark cluster by using an Azure Resource Manager template. The template is available on GitHub.
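The elimination behind the suggested answer can be sketched as a small set comparison. This is only an illustration: the capability flags below are a simplified, assumed reading of each option's primary role, not an authoritative feature matrix, and the names `REQUIRED`, `CAPABILITIES`, `matching_options`, and `missing_requirements` are hypothetical. Only the requirements the explanation addresses (in-memory engine, streaming, and the four languages) are modeled.

```python
# Requirements from the question that the explanation above addresses.
REQUIRED = {"in_memory", "streaming", "sql", "python", "scala", "r"}

# ASSUMPTION: simplified capability flags for each answer option, based on
# the option's primary role. Treat these as illustrative, not as a complete
# or official feature list.
CAPABILITIES = {
    "A. HDInsight Spark Cluster": {"in_memory", "streaming", "sql", "python", "scala", "r", "java"},
    "B. Azure Stream Analytics": {"streaming", "sql"},
    "C. HDInsight Hadoop Cluster": {"sql"},        # SQL through Hive
    "D. Azure SQL Data Warehouse": {"sql"},
    "E. HDInsight Kafka Cluster": {"streaming"},   # message streaming platform
    "F. HDInsight Storm Cluster": {"streaming"},   # stream processing
}


def matching_options(required, capabilities):
    """Return the options whose capability set covers every requirement."""
    return [name for name, caps in capabilities.items() if required <= caps]


def missing_requirements(required, capabilities):
    """Map each option to the sorted list of requirements it does not cover."""
    return {name: sorted(required - caps) for name, caps in capabilities.items()}


if __name__ == "__main__":
    print("Matches:", matching_options(REQUIRED, CAPABILITIES))
    for name, missing in missing_requirements(REQUIRED, CAPABILITIES).items():
        print(f"{name}: missing {missing}")
```

Under these assumed flags, only option A covers every modeled requirement; each other option misses at least the in-memory engine or the multi-language support.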
References:
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing

This question is from the DP-200 Microsoft Azure Data Engineer exam,
which counts toward the Microsoft Certified: Azure Data Engineer Associate certification.



Disclaimers:
The website is not related to, affiliated with, endorsed or authorized by Microsoft. 
The website does not contain actual questions and answers from Microsoft's Certification Exams.
Trademarks, certification & product names are used for reference only and belong to Microsoft.
