AWS BDS-C00 Certified Big Data Specialty Practice Test Set 2

Which of the following commands can be used to transfer the results of a query in Redshift to Amazon S3?


Options are :

  • COPY
  • EXPORT
  • UNLOAD (Correct)
  • DistCp

Answer : UNLOAD
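
As a rough illustration of the answer above, an UNLOAD statement wraps the query in single quotes and names an S3 prefix and an IAM role. The sketch below only builds the SQL text and does not connect to any cluster; the table name, bucket prefix, and role ARN are hypothetical placeholders.

```python
def build_unload(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift UNLOAD statement as text (nothing is executed)."""
    # UNLOAD takes the query as a quoted string, so single quotes
    # inside the query are doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


# Hypothetical names, for illustration only.
statement = build_unload(
    "SELECT * FROM sales WHERE region = 'EU'",
    "s3://example-bucket/unload/sales_",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```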

When estimating the cost of using EMR, which of the following parameters should you consider? Choose 3 answers from the options given below.


Options are :

  • The price of the underlying VPC
  • The price of the underlying EC2 Instances (Correct)
  • The price of the EMR service. (Correct)
  • The price of EBS storage if used. (Correct)

Answer : The price of the underlying EC2 Instances; The price of the EMR service; The price of EBS storage if used.
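
The three cost components in the answer can be combined into a simple estimate. The following is a minimal sketch, assuming per-instance-hour rates for EC2 and EMR and a per-GB-month rate for EBS; every rate and size used here is a made-up number for illustration, not actual AWS pricing.

```python
def estimate_emr_cost(nodes, hours, ec2_rate, emr_rate,
                      ebs_gb_per_node=0, ebs_gb_month_rate=0.0,
                      hours_per_month=730):
    """Estimate the cost of running an EMR cluster for a number of hours."""
    # Each node is billed for the EC2 instance plus the EMR service charge.
    compute = nodes * hours * (ec2_rate + emr_rate)
    # EBS is billed per GB-month, so prorate it by the hours actually used.
    ebs = nodes * ebs_gb_per_node * ebs_gb_month_rate * (hours / hours_per_month)
    return compute + ebs


# Hypothetical rates: 10 nodes for 73 hours with 100 GB of EBS per node.
total = estimate_emr_cost(nodes=10, hours=73, ec2_rate=0.20, emr_rate=0.05,
                          ebs_gb_per_node=100, ebs_gb_month_rate=0.10)
print(round(total, 2))
```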

Which of the below is a fully managed service that can be used to deliver real-time streaming data to S3? Please select:


Options are :

  • Kinesis Streams
  • EMR
  • Kinesis Firehose (Correct)
  • Redshift

Answer : Kinesis Firehose

There is a requirement for a vendor to have access to an S3 bucket in your account. The vendor already has an AWS account. How can you provide access to the vendor on this bucket?


Options are :

  • Create an S3 bucket policy that allows the vendor to read from the bucket from their AWS account. (Correct)
  • Create a new IAM user and grant the relevant access to the vendor on that bucket.
  • Create a new IAM group and grant the relevant access to the vendor on that bucket.
  • Create a cross-account role for the vendor account and grant that role access to the S3 bucket.

Answer : Create an S3 bucket policy that allows the vendor to read from the bucket from their AWS account.
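
A bucket policy of the kind described in the answer might look like the sketch below, which builds the policy document as a Python dictionary and prints it as JSON. The bucket name and vendor account ID are hypothetical, and granting access to the whole vendor account (the `:root` principal) is an assumption; a narrower principal may be preferable in practice.

```python
import json


def vendor_read_policy(bucket: str, vendor_account_id: str) -> dict:
    """Build a bucket policy granting another AWS account read access."""
    principal = {"AWS": f"arn:aws:iam::{vendor_account_id}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing applies to the bucket itself.
                "Sid": "VendorListBucket",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {   # Reading applies to the objects inside the bucket.
                "Sid": "VendorReadObjects",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# Hypothetical bucket name and account ID.
policy = vendor_read_policy("example-bucket", "111122223333")
print(json.dumps(policy, indent=2))
```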

Which of the following refers to a single action on multiple items in a Kinesis stream? Please select:


Options are :

  • Compression
  • Batching (Correct)
  • Collection
  • Aggregation

Answer : Batching
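
To make the idea of batching concrete, the sketch below groups records into chunks so that one call can carry many items. It assumes the Kinesis `PutRecords` limit of 500 records per request and performs no network calls; the record contents are placeholders.

```python
from typing import Iterator, List, Sequence

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts up to 500 records per request


def batches(records: Sequence[bytes],
            size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[bytes]]:
    """Yield consecutive chunks of at most `size` records."""
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


# 1200 placeholder records become three batches instead of 1200 single puts.
records = [f"event-{i}".encode("utf-8") for i in range(1200)]
batch_sizes = [len(b) for b in batches(records)]
print(batch_sizes)
```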

Which of the following file formats are supported in Athena by default? Please select:


Options are :

  • CSV
  • JSON
  • Apache Parquet (Correct)
  • Adobe Acrobat

Answer : Apache Parquet

You are an AWS big data engineer and have set up a testing Redshift cluster in your AWS development account in us-east-1. Now your company decides to move the cluster to the production account in us-west-1. Which of the below is the first step you would carry out in the entire process? Please select:


Options are :

  • Configure an IAM role for the transfer
  • Enable cross-region snapshot copy
  • Create a new Redshift cluster in the target region
  • Create a manual snapshot of the Redshift cluster (Correct)

Answer : Create a manual snapshot of the Redshift cluster

What are the 2 types of nodes in a Redshift Cluster? Choose 2 answers from the options given below.


Options are :

  • Leader Node (Correct)
  • Compute Node (Correct)
  • Task node
  • Master node

Answer : Leader Node Compute Node

You are trying to use a SQL client tool from an EC2 Instance, but you are not able to connect to the Redshift Cluster. What must you do to ensure that you are able to connect to the Redshift Cluster from the EC2 Instance?


Options are :

  • Modify the VPC Security Groups (Correct)
  • Modify the NACL on the subnet
  • Attach a proper IAM role for Redshift access to the EC2 instance
  • Use the AWS CLI instead of the Redshift client tools

Answer : Modify the VPC Security Groups

A company has created an e-commerce site using DynamoDB and is designing a product table that includes items purchased and the users who purchased the item. When creating a primary key on the table, which of the following would be the best attribute for the primary key? Select the BEST possible answer.


Options are :

  • User_id where there are many users to few products (Correct)
  • None of the above
  • Product_id where there are few products to many users
  • Category_id where there are few categories to many products

Answer : User_id where there are many users to few products

You need a centralized solution to search through the logs stored in the CloudWatch Logs service. Which of the following services can be used for this purpose?


Options are :

  • Elasticsearch (Correct)
  • CloudTrail
  • ElastiCache
  • Elastic MapReduce

Answer : Elasticsearch

You have applications hosted on a fleet of AWS EC2 Instances. New application features are released frequently, while high availability of the application must be maintained. To fulfil this requirement, the logs from the EC2 Instances need to be published and analyzed instantly. If there are any errors or any anomalous behavior, the newly deployed application is rolled back with minimum service interruption. Which of the following methods should you use for publishing and analyzing the logs?


Options are :

  • Send the logs from the EC2 Instances to AWS CloudWatch Logs
  • Send the logs from the EC2 Instances to AWS Kinesis
  • Use consumers to analyze the logs (Correct)
  • Use AWS EMR to analyze the logs

Answer : Use consumers to analyze the logs

You need to use the graphing tools available in Amazon QuickSight. Which of the following would you use for comparing measure values over time? Please select:


Options are :

  • PivotTable
  • Scatter Plot
  • Line charts (Correct)
  • Bar Charts

Answer : Line charts

An application is currently writing a large number of records to a DynamoDB table in one region. There is a requirement for a secondary application to take in the changes to the DynamoDB table every 2 hours and process the updates accordingly. Which of the following is an ideal way to ensure the secondary application can get the relevant changes from the DynamoDB table?


Options are :

  • Create another DynamoDB table with the records modified in the last 2 hours
  • Use DynamoDB Streams to monitor the changes in the DynamoDB table. (Correct)
  • Insert a timestamp for each record and then scan the entire table for the timestamp as per the last 2 hours.
  • Transfer the records which were modified in the last 2 hours to S3

Answer : Use DynamoDB Streams to monitor the changes in the DynamoDB table.

A company has a Redshift cluster for petabyte-scale data warehousing. The data within the cluster is easily reproducible from additional data stored on Amazon S3. The company wants to reduce the overall total cost of running this Redshift cluster. Which scenario would best meet the needs of the running cluster, while still reducing total overall ownership cost of the cluster?


Options are :

  • Instead of implementing automatic daily backups, write a CLI script that creates manual snapshots every f days. Copy the manual snapshot to a secondary AWS region for disaster recovery situations.
  • Disable automated and manual snapshots on the cluster (Correct)
  • Implement daily backups, but do not enable multi-region copy to save data transfer costs.
  • Enable automated snapshots but set the retention period to a lower number to reduce storage costs

Answer : Disable automated and manual snapshots on the cluster

You are working with a Kinesis Stream. What is used to group data by shard within a stream? Please select:


Options are :

  • Hash Key
  • Partition Key (Correct)
  • Record Id
  • Sequence Number

Answer : Partition Key
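
The way a partition key selects a shard can be sketched in a few lines. Kinesis maps each partition key to a 128-bit integer with MD5, and each shard owns a range of that hash space. The code below is a simplified model, assuming the shards split the hash space evenly (real streams can have uneven ranges after resharding); the key names are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer using MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Pick a shard, assuming shards split the hash space into equal ranges."""
    range_size = HASH_SPACE // shard_count
    return min(hash_key(partition_key) // range_size, shard_count - 1)


# Records sharing a partition key always land on the same shard.
for key in ("user-1", "user-2", "user-1"):
    print(key, "-> shard", shard_index(key, shard_count=4))
```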

You have created a DynamoDB table for an application that needs to support thousands of users. You need to ensure that each user can only access their own data in a particular table. Many users already have accounts with a third-party identity provider, such as Facebook, Google, or Login with Amazon. How would you implement this requirement? Choose 2 answers from the options given below. Please select:


Options are :

  • Create an IAM User for all users so that they can access the application
  • Use a third-party identity provider such as Google
  • Use Web identity federation and register your application with a third-party identity provider such as Google, Amazon, or Facebook. (Correct)
  • Create an IAM role which has specific access to the DynamoDB table.

Answer : Use Web identity federation and register your application with a third-party identity provider such as Google, Amazon, or Facebook.
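
With web identity federation, per-user access is typically enforced by an IAM policy that restricts items to those whose partition key equals the caller's federated user ID. The sketch below builds such a policy as a dictionary, assuming Login with Amazon as the provider and a table whose partition key holds the user ID; the table ARN and the list of allowed actions are hypothetical choices.

```python
import json


def per_user_table_policy(table_arn: str) -> dict:
    """Build a policy limiting each federated user to their own items."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                ],
                "Resource": table_arn,
                "Condition": {
                    # Only items whose partition key matches the caller's
                    # Login with Amazon user ID are accessible.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN.
user_policy = per_user_table_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleUserData"
)
print(json.dumps(user_policy, indent=2))
```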

When troubleshooting slowness on an EMR cluster, which of the following node types does not need to be investigated for issues? Please select:


Options are :

  • Task Nodes
  • Master Node
  • Leader Node (Correct)
  • Core Nodes

Answer : Leader Node

Which of the following is the default input data format for Amazon EMR?


Options are :

  • Text (Correct)
  • XML
  • HTML
  • JSON

Answer : Text

You are planning on loading a 500 GB file into a Redshift cluster which has 10 nodes. Which of the following is a preferable method to load the data?


Options are :

  • Split the file into 500 smaller files (Correct)
  • Split the file into 50 files of equal size
  • Compress the file using gz compression.
  • Split the file into 10 files of equal size

Answer : Split the file into 500 smaller files
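
One way to reason about the file count is that Redshift loads files in parallel across slices, so the number of files is usually chosen as a multiple of the slice count while keeping each file reasonably small. The sketch below is an illustration under stated assumptions: 2 slices per node and a target of at most 1 GB per file are assumed values, and the real slice count depends on the node type.

```python
import math


def file_count(total_gb: float, nodes: int, slices_per_node: int,
               max_file_gb: float = 1.0) -> int:
    """Smallest multiple of the slice count that keeps files under the cap."""
    slices = nodes * slices_per_node
    # Minimum number of files needed to respect the per-file size cap.
    min_files = math.ceil(total_gb / max_file_gb)
    # Round up to a multiple of the slice count so every slice gets equal work.
    return math.ceil(min_files / slices) * slices


# Assumed: 10 nodes with 2 slices each, files of at most 1 GB.
print(file_count(total_gb=500, nodes=10, slices_per_node=2))
```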

You are planning on loading a huge amount of data into a Redshift Cluster. You are not sure if the load will succeed or fail. Which of the below options can help see if an error would occur during the load process?


Options are :

  • Use the COPY Command
  • Use the IMPORT Command
  • Use the COPY command with the NOLOAD option (Correct)
  • Use the IMPORT command with the NOLOAD option

Answer : Use the COPY command with the NOLOAD option
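
For reference, a COPY statement with NOLOAD checks the validity of the input files without loading any rows. The sketch below only assembles the SQL text; the table name, S3 prefix, and role ARN are hypothetical placeholders.

```python
def build_copy_dry_run(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a COPY statement that validates the data without loading it."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "NOLOAD;"  # validate the files only; no rows are written
    )


# Hypothetical names, for illustration only.
dry_run = build_copy_dry_run(
    "sales",
    "s3://example-bucket/load/sales_",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(dry_run)
```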

You have two teams using Redshift to analyze data of a massive application. Each query issued by the first team takes approximately 1-2 hours to analyze the data, while the other team's queries take a very short time. You don't want the second team's queries to wait until the already running long queries are completed. How will you solve the problem in the most economical way? Choose an answer from the options below.


Options are :

  • Start another Redshift cluster from snapshot for the second team if the current Redshift cluster is busy processing long queries
  • Create two separate workload management groups and assign them to respective teams (Correct)
  • Create a read replica of the Redshift instance and run the second team's queries on the read replica
  • Pause long queries and resume the queries afterwards

Answer : Create two separate workload management groups and assign them to respective teams

Which of the following serverless services can be used to run ad-hoc queries for data in S3?


Options are :

  • Athena (Correct)
  • SQS
  • Redshift
  • AWS EMR

Answer : Athena

Which of the following statements about Redshift database encryption is false?


Options are :

  • Encryption is enabled by default for a cluster (Correct)
  • When encryption is enabled, it is also enabled for the snapshots created
  • When encryption is enabled, the system metadata is also encrypted
  • In Amazon Redshift you can enable database encryption for your clusters to help protect data at rest

Answer : Encryption is enabled by default for a cluster

You need a cost-effective solution to store a large collection of video files and have a data warehouse that can keep track of the data. Which of the following would form the solution required to fulfill the requirements? Choose 2 answers from the options given below. Each answer forms part of the solution. Please select:


Options are :

  • Store the collection of video files in S3 (Correct)
  • Store the collection of video files in AWS Redshift
  • Store the reference location for the video files in AWS Redshift
  • Store the reference location for the video files in AWS Kinesis

Answer : Store the collection of video files in S3

Your company has a website hosted in AWS. There is a requirement to analyze the clickstream data for the website, and this needs to be done in real time. Which of the below can be used to fulfill this requirement?


Options are :

  • Use the Amazon Kinesis service to process the data from a Kinesis agent (Correct)
  • Use the Redshift service to ingest the data so that it can be analyzed at a later point in time.
  • Use the EMR service to ingest the data so that it can be analyzed at a later point in time.
  • Publish the web clicks to Amazon SQS

Answer : Use the Amazon Kinesis service to process the data from a Kinesis agent

Which of the following can be used for storing log files generated from an EMR cluster? Please select:


Options are :

  • Amazon S3 (Correct)
  • Amazon Glacier
  • Amazon CloudTrail
  • Amazon DynamoDB

Answer : Amazon S3

Which of the following can be used along with Amazon EMR to perform SQL-like queries on the data stored in EMR?


Options are :

  • Redshift
  • Kinesis
  • SQS
  • Hive (Correct)

Answer : Hive

Your company has an Oracle database installed on an EC2 instance. TDE is enabled for the Oracle database. The company needs a solution to store the TDE master encryption key for additional security. Which of the following is a managed service that can fulfill this requirement?


Options are :

  • AWS CloudHSM (Correct)
  • AWS KMS
  • S3-KMS
  • On-premises HSM

Answer : AWS CloudHSM

Which of the following is a self-serviced tool which can be used to build real-time applications using streaming data?


Options are :

  • Kinesis Firehose
  • Kafka (Correct)
  • SQS
  • Kinesis

Answer : Kafka
