A customer has a machine learning workflow that consists of multiple quick cycles of reads-writes-reads on Amazon S3. The customer needs to run the workflow on EMR but is concerned that the reads in subsequent cycles will miss new data critical to the machine learning from the prior cycles. How should the customer accomplish this?

QuestionsCategory: BDS-C00A customer has a machine learning workflow that consists of multiple quick cycles of reads-writes-reads on Amazon S3. The customer needs to run the workflow on EMR but is concerned that the reads in subsequent cycles will miss new data critical to the machine learning from the prior cycles. How should the customer accomplish this?
Admin Staff asked 3 months ago
A customer has a machine learning workflow that consists of multiple quick cycles of reads-writes-reads on
Amazon S3. The customer needs to run the workflow on EMR but is concerned that the reads in subsequent cycles will miss new data critical to the machine learning from the prior cycles.
How should the customer accomplish this?

A. Turn on EMRFS consistent view when configuring the EMR cluster.

B. Use AWS Data Pipeline to orchestrate the data processing cycles.

C. Set hadoop.data.consistency = true in the core-site.xml file.

D. Set hadoop.s3.consistency = true in the core-site.xml file.








 

Suggested Answer: A






This question is in BDS-C00 AWS Certified Big Data – Specialty Exam
For getting AWS Certified Big Data – Specialty Certificate



Disclaimers:
The website is not related to, affiliated with, endorsed or authorized by Amazon.
Trademarks, certification & product names are used for reference only and belong to Amazon.
The website does not contain actual questions and answers from Amazon's Certification Exam.
Question Tags:

Recommended

Welcome Back!

Login to your account below

Create New Account!

Fill the forms below to register

Retrieve your password

Please enter your username or email address to reset your password.