Company A operates in Country X. Company A maintains a large dataset of historical purchase orders that contains personal data of their customers in the form of full names and telephone numbers. The dataset consists of 5 text files, 1TB each. Currently the dataset resides on-premises due to legal requirements of storing personal data in-country. The research and development department needs to run a clustering algorithm on the dataset and wants to use Elastic Map Reduce service in the closest AWS region. Due to geographic distance, the minimum latency between the on-premises system and the closet AWS region is 200 ms. Which option allows Company A to do clustering in the AWS Cloud and meet the legal requirement of maintaining personal data in-country? A. Anonymize the personal data portions of the dataset and transfer the data files into Amazon S3 in the AWS region. Have the EMR cluster read the dataset using EMRFS. B. Establish a Direct Connect link between the on-premises system and the AWS region to reduce latency. Have the EMR cluster read the data directly from the on-premises storage system over Direct Connect. C. Encrypt the data files according to encryption standards of Country X and store them on AWS region in Amazon S3. Have the EMR cluster read the dataset using EMRFS. D. Use AWS Import/Export Snowball device to securely transfer the data to the AWS region and copy the files onto an EBS volume. Have the EMR cluster read the dataset using EMRFS.  Suggested Answer: B This question is in BDS-C00 AWS Certified Big Data – Specialty Exam For getting AWS Certified Big Data – Specialty Certificate Disclaimers: The website is not related to, affiliated with, endorsed or authorized by Amazon. Trademarks, certification & product names are used for reference only and belong to Amazon. The website does not contain actual questions and answers from Amazon's Certification Exam.
Please login or Register to submit your answer