Lab: EMR Serverless

Creating and submitting Word count Spark Job in EMR Serverless

Objective

Data Transformation with PySpark using Amazon EMR Serverless Application

Introduction

By the end of the lab, you will be able to:

Create Amazon EMR Serverless Application for Spark
Submit Spark jobs to EMR Serverless Application
Create Amazon EMR Serverless Application for Hive
Submit Hive jobs to EMR Serverless Application
Use Spark UI for monitoring and debugging

Files

├── [ 503]  README.md
├── [7.1K]  assets
│   └── [7.0K]  mwaa_plugin.zip
├── [ 22K]  cfn
│   ├── [4.7K]  emr-serverless-cfn-v2.json
│   └── [ 17K]  mwaa_emr.yml
├── [ 26K]  main.ipynb
├── [ 52K]  orchestration
│   ├── [ 49K]  airflow.cfg
│   ├── [1.1K]  dags
│   │   └── [1004]  example_emr_serverless.py
│   ├── [ 589]  init.sh
│   ├── [ 120]  requirments.txt
│   └── [ 754]  scripts
│       └── [ 658]  pi.py
└── [ 20K]  src
    ├── [ 105]  count.sql
    ├── [ 778]  create_taxi_trip.sql
    ├── [2.3K]  example_emr_serverless.py
    ├── [3.7K]  hudi-cow.py
    ├── [3.8K]  hudi-mor.py
    ├── [3.7K]  hudi-upsert-cow.py
    ├── [3.8K]  hudi-upsert-mor.py
    └── [1.1K]  wordcount.py

 128K used in 6 directories, 18 files

Lab: EMR Serverless

Objective​

Introduction​

Files​

Notebook​

Objective

Introduction

Files

Notebook