With Amazon Elastic MapReduce (Amazon EMR) you can analyze and process vast amounts of data. The cluster is managed using an open-source framework called Hadoop. You have set up an application to run Hadoop jobs. The application reads data from DynamoDB and generates a temporary file of 100 TBs. The whole process runs for 30 minutes and the output of the job is stored to S3. Which of the below…

QuestionsCategory: SAP-C01With Amazon Elastic MapReduce (Amazon EMR) you can analyze and process vast amounts of data. The cluster is managed using an open-source framework called Hadoop. You have set up an application to run Hadoop jobs. The application reads data from DynamoDB and generates a temporary file of 100 TBs. The whole process runs for 30 minutes and the output of the job is stored to S3. Which of the below…
Admin Staff asked 4 months ago
With Amazon Elastic MapReduce (Amazon EMR) you can analyze and process vast amounts of data. The cluster is managed using an open-source framework called Hadoop. You have set up an application to run Hadoop jobs. The application reads data from DynamoDB and generates a temporary file of 100 TBs.
The whole process runs for 30 minutes and the output of the job is stored to S3.
Which of the below mentioned options is the most cost effective solution in this case?

A. Use Spot Instances to run Hadoop jobs and configure them with EBS volumes for persistent data storage.

B. Use Spot Instances to run Hadoop jobs and configure them with ethereal storage for output file storage.

C. Use an on demand instance to run Hadoop jobs and configure them with EBS volumes for persistent storage.

D. Use an on demand instance to run Hadoop jobs and configure them with ephemeral storage for output file storage.








 

Suggested Answer: B



AWS EC2 Spot Instances allow the user to quote his own price for the EC2 computing capacity. The user can simply bid on the spare Amazon EC2 instances and run them whenever his bid exceeds the current Spot Price. The Spot Instance pricing model complements the On-Demand and Reserved Instance pricing models, providing potentially the most cost-effective option for obtaining compute capacity, depending on the application. The only challenge with a Spot Instance is data persistence as the instance can be terminated whenever the spot price exceeds the bid price. In the current scenario a Hadoop job is a temporary job and does not run for a longer period. It fetches data from a persistent DynamoDB. Thus, even if the instance gets terminated there will be no data loss and the job can be re- run. As the output files are large temporary files, it will be useful to store data on ethereal storage for cost savings.
Reference:
http://aws.amazon.com/ec2/purchasing-options/spot-instances/


This question is in SAP-C01 AWS Certified Solutions Architect – Professional Exam
For getting AWS Certified Solutions Architect – Professional Certificate



Disclaimers:
The website is not related to, affiliated with, endorsed or authorized by Amazon.
Trademarks, certification & product names are used for reference only and belong to Amazon.
The website does not contain actual questions and answers from Amazon's Certification Exam.


Question Tags:

Recommended

Welcome Back!

Login to your account below

Create New Account!

Fill the forms below to register

Retrieve your password

Please enter your username or email address to reset your password.