Big Data on AWS
6 Labs · 75 Credits · 6h 23m · Use Case (Experienced)
This quest is designed to teach you how to work with AWS services to perform big data analytics on the cloud.
[IMPORTANT: This lab requires you to use or create a Twitter account and application and use its credentials in the lab in order to pull data into Amazon DynamoDB. Please review the Lab Guide for Twitter account instructions before starting this lab.] This lab introduces Amazon DynamoDB and walks you through basic operations such as creating, updating, querying, and deleting tables. It also shows you how to change the provisioned throughput of a table and see how that change is reflected in the application. Note: This lab may take 10-11 minutes to set up and start. Please begin working on the first two exercises while the lab is building.
advanced 10 Credits 45 Minutes
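The table operations above can be sketched in Python. This is a minimal illustration, not the lab's solution code: the table name, key schema, and capacity values are assumptions, and the dicts mirror the boto3 `create_table`/`update_table` request shapes without making any AWS calls.

```python
# Sketch of two DynamoDB operations from the lab: creating a table,
# then changing its provisioned throughput. Names and capacity values
# are illustrative assumptions; no AWS call is made in this sketch.

def create_table_params(table_name="Tweets"):
    """Parameters for dynamodb.create_table with a simple hash key."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "TweetId", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "TweetId", "KeyType": "HASH"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    }

def scale_throughput(table_name, read, write):
    """Parameters for dynamodb.update_table that raise or lower throughput."""
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read,
            "WriteCapacityUnits": write,
        },
    }

# With a real client (assumes AWS credentials are configured):
#   import boto3
#   ddb = boto3.client("dynamodb")
#   ddb.create_table(**create_table_params())
#   ddb.update_table(**scale_throughput("Tweets", 25, 25))
```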
The lab demonstrates how to use Amazon Redshift to create a cluster, load data, run queries, and monitor performance. Note: Students will download a free SQL client as part of this lab.
advanced 10 Credits 45 Minutes
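A rough sketch of the load-and-query step, assuming data in S3 and a cluster IAM role (bucket, table, and role names below are illustrative, not from the lab). The strings follow standard Redshift SQL; they would be run through the SQL client the lab has you install.

```python
# Sketch of the Redshift workflow in the lab: load data from S3 with a
# COPY statement, then run a validation query. Bucket, table, and IAM
# role names are illustrative assumptions.

def copy_from_s3(table, bucket, prefix, iam_role):
    """Build a Redshift COPY statement that loads CSV data from S3."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"CSV;"
    )

stmt = copy_from_s3(
    "sales",
    "my-data-bucket",
    "sales/2015/",
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
)

# A simple query to confirm the load succeeded:
check = "SELECT COUNT(*) FROM sales;"
```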
This lab will provide an introduction to running a geospatial data server on AWS infrastructure. For this lab we will leverage the GeoServer product. GeoServer is a Java-based, open source software server that allows users to view and edit geospatial data. It leverages open standards from the Open Geospatial Consortium (OGC) to facilitate flexible sharing of geospatial information. The lab leads you through the steps to launch and configure an Ubuntu Linux virtual machine in the Amazon cloud. You will install GeoServer on this instance and load a dataset into the server. Prerequisites: To successfully complete this lab, you should be familiar with basic Linux server administration and comfortable using the Linux command-line tools. Some familiarity with database fundamentals and geospatial tools would be an advantage.
advanced 10 Credits 50 Minutes
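Beyond the web UI, GeoServer also exposes a REST API for administration. As a hedged sketch (host, credentials, and workspace name below are placeholder assumptions, not lab values), here is how a workspace-creation request could be assembled in Python; the request is built but not sent:

```python
# Sketch of driving GeoServer through its REST API rather than the web
# UI. Endpoint, credentials, and workspace name are illustrative
# assumptions; the request object is built but never sent here.
import base64
import urllib.request

def create_workspace_request(host, user, password, workspace):
    """Build an authenticated POST that creates a GeoServer workspace."""
    url = f"http://{host}/geoserver/rest/workspaces"
    body = f"<workspace><name>{workspace}</name></workspace>".encode()
    req = urllib.request.Request(url, data=body, method="POST")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "text/xml")
    return req

req = create_workspace_request("ec2-host:8080", "admin", "geoserver", "lab")
# To actually send it:  urllib.request.urlopen(req)
```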
This lab demonstrates how to launch an Amazon Elastic MapReduce (EMR) cluster for Big Data processing and use Hive with SQL-style queries to analyze data. You will create a Hadoop cluster using Amazon EMR, which will allow you to run interactive Hive queries against data stored in Amazon S3. You will use Hive to normalize the data into a more useful form, and you will run queries to analyze the data.
expert 15 Credits 1 Hour
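The Hive pattern described above can be sketched as a pair of statements: an external table defined over files in S3, then a derived table that normalizes the data. The schema, field names, and bucket path are illustrative assumptions, not the lab's dataset; the strings are standard HiveQL held in Python for inspection.

```python
# Sketch of the Hive workflow in the lab: point an external table at
# raw data in S3, then create a normalized table from it. Schema and
# S3 path are illustrative assumptions.

EXTERNAL_TABLE = """
CREATE EXTERNAL TABLE raw_logs (
    request_ts STRING,
    user_id    STRING,
    url        STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION 's3://my-data-bucket/logs/';
"""

NORMALIZE = """
CREATE TABLE requests_by_user AS
SELECT user_id, COUNT(*) AS request_count
FROM raw_logs
GROUP BY user_id;
"""
```

In the lab these statements would be entered at the interactive Hive prompt on the EMR master node.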
This lab demonstrates how to create a variety of analytic dashboards that are continuously updated via the Amazon Kinesis Aggregators framework. You will learn how to create Amazon Kinesis streams, build real-time aggregated datasets with Amazon Kinesis Aggregators, and interact with this data using Amazon CloudWatch and custom dashboarding tools.
expert 15 Credits 55 Minutes
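The input side of that pipeline is producers writing records to a stream. As a minimal sketch (stream name and event payload are assumptions), the arguments for a single `put_record` call can be assembled like this, without actually calling AWS:

```python
# Sketch of writing one event to an Amazon Kinesis stream, the input
# side of the aggregation pipeline in the lab. Stream name and payload
# are illustrative; the dict matches boto3's put_record arguments, but
# no AWS call is made here.
import json

def put_record_params(stream, event):
    """Arguments for kinesis.put_record for one JSON event."""
    return {
        "StreamName": stream,
        "Data": json.dumps(event).encode(),
        # Partition key determines shard placement; a per-user key
        # spreads load across shards.
        "PartitionKey": str(event["user_id"]),
    }

params = put_record_params("lab-stream", {"user_id": 42, "action": "click"})
# Real usage: boto3.client("kinesis").put_record(**params)
```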
In this lab, you will build a smart solution using Amazon Redshift and Amazon Machine Learning that predicts delays for flights originating at Chicago's O'Hare International Airport. You will learn how to analyze large amounts of data using Redshift. Then you will practice using Machine Learning to create a model that predicts flight delays. Prerequisites: To successfully complete this lab, you should be familiar with Redshift concepts by taking the introductory lab at qwiklabs.com. Some knowledge of SQL and Python programming is required, although full solution code is provided. You should be comfortable using RDP to connect to a Windows server and using SQL client software. You should have, at a minimum, taken the "Introduction to Amazon Redshift" and "Introduction to Machine Learning" labs at qwiklabs.com. Note: This lab currently must run in us-east-1 for the Machine Learning service. Be sure to check in the AWS console that you are running in us-east-1 (N. Virginia) and change regions if necessary.
expert 15 Credits 1 Hour 45 Minutes
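To give a feel for the shape of that workflow (this is not the lab's solution code; the table, column names, and 15-minute delay threshold are illustrative assumptions), the training data for a binary flight-delay model could be pulled from Redshift with a labeling query like the following:

```python
# Sketch of building a labeled training set for flight-delay
# prediction: select O'Hare departures from Redshift and label each
# flight delayed or not. Table, columns, and the 15-minute threshold
# are illustrative assumptions.

TRAINING_QUERY = """
SELECT carrier,
       dep_hour,
       day_of_week,
       CASE WHEN arr_delay > 15 THEN 1 ELSE 0 END AS delayed
FROM flights
WHERE origin = 'ORD';
"""

def is_delayed(arr_delay_minutes, threshold=15):
    """Binary label mirroring the CASE expression above."""
    return 1 if arr_delay_minutes > threshold else 0
```

Amazon Machine Learning would then train a binary classification model on the `delayed` column as the target.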