Hands-On Exercise Ideas - 15.8 | 15. Cloud Computing in Data Science (AWS, Azure, GCP) | Data Science Advance

Interactive Audio Lesson

Listen to a student-teacher conversation explaining the topic in a relatable way.

Creating a Jupyter Notebook in SageMaker

Teacher

Today, we will start by discussing how to create a Jupyter Notebook in SageMaker. Can anyone tell me what SageMaker is used for?

Student 1

I think SageMaker is used for building machine learning models.

Teacher

Exactly! SageMaker provides tools for building, training, and deploying models. Now, let's focus on our exercise. First, what do we need to do to create a Jupyter Notebook?

Student 2

We would need to access the AWS Management Console.

Teacher

Right! You’ll navigate to SageMaker from there. Can anyone remember the steps to train a regression model once our notebook is set up?

Student 3

We need to load our data, choose a model, and then fit the model with our training data.

Teacher

Perfect! Remember the acronym K-F-M: Load your **K**eys (data), **F**ind your model, and **M**ake predictions. Let's summarize what we just discussed.

Teacher

Today, we learned how to create a Jupyter Notebook in SageMaker and the steps needed to train a basic regression model through loading the dataset, selecting a model, and training it. Who feels ready to tackle this exercise?

Using BigQuery ML

Teacher

Next, let’s discuss how to utilize BigQuery ML for modeling. How do we start querying data?

Student 4

I think we use SQL commands to get started with our datasets in BigQuery.

Teacher

Correct! BigQuery allows us to run SQL queries directly against massive datasets. What is an example of a model we can create?

Student 1

We could build a linear regression model using a public dataset.

Teacher

Spot on! For our session, let's remember the mnemonic L-S-L: **L**inear regression, **S**QL commands, **L**arge datasets. Who can summarize how we can create a model in BigQuery?

Student 2

We have to write our SQL queries to train models and then evaluate the results on our selected dataset.

Teacher

Excellent summary! In today’s session, we explored BigQuery ML's capabilities, including starting with SQL queries and creating machine learning models.

Deploying Models with Azure ML Studio

Teacher

In this session, we're looking at Azure ML Studio and how we can deploy models as REST APIs. What does deploying a model entail?

Student 3

It means making our machine learning model available to be used by other applications or services.

Teacher

Exactly! Deployment is critical for applying our models in the real world. Can someone share how we might expose our model through Azure ML Studio?

Student 4

We need to publish our model as a web service and then configure REST API settings.

Teacher

Great! Remember the acronym P-W-C: **P**ublish to expose, **W**eb service to connect, and **C**onfigure settings. Let’s recap what we covered today.

Teacher

Today, we dived into Azure ML Studio, learning how to deploy a model as a REST API, making it accessible for other applications. Is everyone ready to go hands-on?

Building Data Pipelines with GCP Dataflow

Teacher

Our final exercise covers building a pipeline with GCP Dataflow. Who can explain what Dataflow is used for?

Student 2

Dataflow processes data streams and batch data to help with real-time analytics.

Teacher

Exactly! It’s crucial for handling data that flows continuously. What are some essential components we must consider when building a pipeline?

Student 1

We need to define our data processing logic and configure our sources and sinks.

Teacher

Good point! Remember the mnemonic S-L-C: **S**ources, **L**ogic, and **C**onfiguration. Who can summarize what we'll achieve with this exercise?

Student 4

We'll be able to set up a data pipeline that processes real-time data using Dataflow!

Teacher

Perfect! Today, we learned how to construct a data pipeline with GCP Dataflow, covering essential components like source definition and processing logic. Are we excited about this project?

Introduction & Overview

Read a summary of the section's main ideas. Choose from Quick Overview, Standard, or Detailed.

Quick Overview

This section outlines various hands-on exercises for implementing data science tasks on cloud platforms.

Standard

The section presents a range of practical exercises designed to help learners engage with cloud computing tools and develop data science models using AWS, Azure, and GCP. Each exercise aims to provide experience with cloud services in real-world applications.

Detailed

Hands-On Exercise Ideas

This section offers a curated list of hands-on exercises designed for data science learners to engage deeply with the cloud computing platforms AWS, Azure, and GCP. Each exercise is tailored to familiarize students with critical functionality within these platforms, strengthening both their practical skills and theoretical knowledge. The specific exercises include:

  1. Create a Jupyter Notebook in SageMaker to train a basic regression model: This exercise allows students to explore AWS’s SageMaker, where they will learn to set up notebooks and implement machine learning regression techniques.
  2. Use BigQuery ML to run SQL-based ML models on a public dataset: Engaging with BigQuery ML gives students practical experience in querying and analyzing large datasets with SQL while building machine learning models.
  3. Deploy a model using Azure ML Studio as a REST API: In this exercise, students learn how to deploy machine learning models effectively, converting them into scalable web services.
  4. Build a data pipeline using GCP Dataflow to process streaming data: Students will delve into real-time data processing through Dataflow, setting up a data pipeline that handles live data streams.

These exercises are constructed to be interactive and applicable, helping students to practice real-world data science challenges while leveraging cloud technologies.


Audio Book

Dive deep into the subject with an immersive audiobook experience.

Exercise 1: Create a Jupyter Notebook in SageMaker


  1. Create a Jupyter Notebook in SageMaker and train a basic regression model.

Detailed Explanation

This exercise involves using Amazon SageMaker, a cloud machine learning platform, to create a Jupyter Notebook. A Jupyter Notebook is an interactive environment where you can write and execute code. In this case, you will be training a basic regression model, which is a type of machine learning model that predicts a continuous output based on input features. This process includes loading data, exploring it, selecting a suitable algorithm for regression, and training the model on your dataset.
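
As a rough illustration, the training cell in such a notebook might look like the sketch below. It uses scikit-learn, which is preinstalled in the standard SageMaker notebook kernels; the CSV file name and column names are placeholder assumptions, not part of the exercise.

```python
# Minimal regression-training cell for a SageMaker Jupyter Notebook (sketch).
# Assumes a hypothetical "housing.csv" with a numeric "price" target column.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load the dataset uploaded to the notebook instance
df = pd.read_csv("housing.csv")
X = df.drop(columns=["price"])   # input features
y = df["price"]                  # continuous target

# Hold out 20% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a basic linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate predictions on the held-out data
preds = model.predict(X_test)
print("Test MSE:", mean_squared_error(y_test, preds))
```

The same notebook can also drive SageMaker's managed training jobs through the SageMaker Python SDK, but a plain scikit-learn cell like this is enough for the basic regression exercise.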

Examples & Analogies

Think of creating a Jupyter Notebook like setting up a kitchen to bake a cake. You gather your ingredients (data), follow a recipe (regression algorithm), and then bake it in the oven (train the model). Once it's finished, you can taste (evaluate) how well your cake turned out!

Exercise 2: Using BigQuery ML


  1. Use BigQuery ML to run SQL-based ML models on a public dataset.

Detailed Explanation

In this exercise, you'll utilize Google Cloud's BigQuery ML, which allows users to create and execute machine learning models using SQL queries. This means you don’t need to be proficient in programming languages like Python or R to apply machine learning techniques. You will select a public dataset available in BigQuery, write SQL queries to preprocess the data, and then build and evaluate a machine learning model directly in the database environment.
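
The sketch below shows one way this could look when driven from Python with the google-cloud-bigquery client; the model itself is defined and trained entirely in BigQuery ML SQL. The dataset name `my_dataset` and the choice of public table (the natality sample) are illustrative assumptions.

```python
# Hedged sketch: train and evaluate a BigQuery ML model with SQL from Python.
# Assumes default GCP credentials and an existing dataset named "my_dataset".
from google.cloud import bigquery

client = bigquery.Client()

# CREATE MODEL runs the training job inside BigQuery itself
train_sql = """
CREATE OR REPLACE MODEL `my_dataset.natality_model`
OPTIONS (model_type = 'linear_reg', input_label_cols = ['weight_pounds']) AS
SELECT
  weight_pounds,      -- label: birth weight to predict
  mother_age,
  gestation_weeks
FROM `bigquery-public-data.samples.natality`
WHERE weight_pounds IS NOT NULL AND gestation_weeks IS NOT NULL
LIMIT 10000
"""
client.query(train_sql).result()  # blocks until training finishes

# ML.EVALUATE returns standard regression metrics for the trained model
eval_sql = "SELECT * FROM ML.EVALUATE(MODEL `my_dataset.natality_model`)"
for row in client.query(eval_sql).result():
    print(dict(row))
```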

Examples & Analogies

Imagine using a recipe book to cook a meal. In this case, your recipe book is SQL, and it guides you to combine ingredients (data) in ways that result in a dish (model) that you can enjoy. Just like cooking, the right instructions lead to the best outcomes!

Exercise 3: Deploying a Model with Azure ML Studio


  1. Deploy a model using Azure ML Studio and expose it as a REST API.

Detailed Explanation

This exercise focuses on using Microsoft Azure's machine learning platform, Azure ML Studio, for deploying a machine learning model. After building your model, you will publish it as a REST API, which allows other applications to interact with it over the web. This process includes configuring the deployment settings and testing the API to ensure it's working correctly for inference (making predictions).
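
Once the endpoint is live, any application can call it over HTTP. The snippet below is a minimal sketch of such a call from Python; the scoring URI, key, and input payload shape are placeholders that depend entirely on your own deployment and scoring script.

```python
# Hedged sketch: call a model deployed from Azure ML Studio as a REST API.
# The URI, key, and feature order below are hypothetical placeholders.
import json
import requests

scoring_uri = "https://<your-endpoint>/score"  # shown on the endpoint's page
api_key = "<your-endpoint-key>"                # key-based authentication

# Payload in whatever shape your scoring script expects (assumed here)
payload = {"data": [[3, 1500, 2]]}  # e.g. bedrooms, square feet, bathrooms

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}

response = requests.post(scoring_uri, data=json.dumps(payload), headers=headers)
response.raise_for_status()       # surface HTTP errors instead of failing silently
print("Prediction:", response.json())
```

Testing the endpoint with a small script like this is also a quick way to confirm the deployment is serving predictions correctly before wiring it into a larger application.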

Examples & Analogies

Think of this like setting up a new coffee shop (the API) after creating a unique coffee blend (the model). Once everything is set up, people can come in and order your coffee (make predictions) from anywhere, thanks to your coffee shop's location in the city (the REST API).

Exercise 4: Building a Data Pipeline with GCP Dataflow


  1. Build a data pipeline using GCP Dataflow to process streaming data.

Detailed Explanation

In this exercise, you'll work with Google Cloud's Dataflow, a fully managed service for stream and batch data processing. You'll focus on building a data pipeline that ingests, processes, and outputs streaming data in real time. This includes defining data transformations, handling data integration, and ensuring that data is stored properly for later analysis. Managing continuously flowing data is an essential skill, as such data is common in today's real-time applications.
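
Dataflow pipelines are usually written with the Apache Beam SDK and then submitted to the Dataflow runner. The sketch below uses the Beam Python SDK to show the source, logic, and sink stages of a simple streaming pipeline; the project, Pub/Sub topic, table name, and message schema are all illustrative assumptions.

```python
# Hedged sketch of a streaming pipeline intended for GCP Dataflow.
# Run locally with the default DirectRunner, or pass --runner=DataflowRunner,
# --project, --region, and --temp_location to submit it to Dataflow.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        # Source: continuously read messages from a Pub/Sub topic (assumed name)
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/events"
        )
        # Logic: decode bytes and parse each message as JSON
        # (messages are assumed to carry user_id, event, and value fields)
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        # Sink: append each record to a BigQuery table for later analysis
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",
            schema="user_id:STRING,event:STRING,value:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

The stages mirror the lesson's S-L-C mnemonic: the Pub/Sub read is the source, the parsing step is the processing logic, and the BigQuery write together with the pipeline options is the configuration of the sink.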

Examples & Analogies

Consider this exercise as setting up an automated assembly line in a factory. Just like the assembly line processes items as they move through each stage, your data pipeline processes data streams as they flow, ensuring everything gets sorted and packaged by the end for distribution (analysis).

Definitions & Key Concepts

Learn essential terms and foundational ideas that form the basis of the topic.

Key Concepts

  • Hands-On Learning: Practical exercises enhance understanding of cloud computing tools.

  • AWS SageMaker: A powerful tool for creating and managing machine learning models.

  • BigQuery ML: Google Cloud’s SQL-based machine learning solution for data analytics.

  • Azure ML Studio: Platform for deploying machine learning models as REST APIs.

  • Dataflow: Google Cloud’s solution for real-time data processing.

Examples & Real-Life Applications

See how the concepts apply in real-world scenarios to understand their practical implications.

Examples

  • Creating a Jupyter Notebook in AWS SageMaker to train a linear regression model using a sample dataset.

  • Utilizing BigQuery ML to implement a machine learning model that predicts house prices based on SQL queries.

Memory Aids

Use mnemonics, acronyms, or visual cues to help remember key information more easily.

🎡 Rhymes Time

  • In SageMaker we create the first step, Load up the data and take a prep!

📖 Fascinating Stories

  • Imagine you’re a data scientist on a mission. You dive into AWS SageMaker like it’s a deep sea expedition, loading data like fishing in the sea, training models, as easy as can be!

🧠 Other Memory Gems

  • Remember K-F-M for SageMaker: Keys, Find the model, Make predictions.

🎯 Super Acronyms

P-W-C for deploying in Azure

  • **P**ublish
  • **W**eb service
  • **C**onfigure.

Flash Cards

Review key concepts with flashcards.

Glossary of Terms

Review the Definitions for terms.

  • Term: Jupyter Notebook

    Definition:

    An open-source web application that allows you to create and share documents containing live code, equations, visualizations, and narrative text.

  • Term: AWS SageMaker

    Definition:

    A fully managed service that provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly.

  • Term: BigQuery ML

    Definition:

    A feature of Google BigQuery that enables users to run machine learning models using SQL syntax.

  • Term: REST API

    Definition:

    Representational State Transfer Application Programming Interface, a method of allowing different applications to communicate over the internet.

  • Term: GCP Dataflow

    Definition:

A fully managed Google Cloud service for processing data in real time (streaming) or in batches.