Hands-On Guide: Build Your First AWS Data Lake with Glue & S3
If you liked this video and my teaching style and would like to join my live crash-course training, fill in the form below:
https://forms.gle/aAFLAg2u4TokxsB19

I keep batches small and group students with similar aspirations and current knowledge levels. If you have more than 5 years of data engineering experience and want to learn AWS, this crash course is for you. Fill in the form above and I will set up an introduction call soon.

AWS Data Engineering crash course covering Amazon Redshift, AWS Glue, Amazon EMR & Airflow to run an end-to-end pipeline:
• End-to-End ETL Pipeline in AWS: Redshift, ...

In this video I explain how you can use AWS Glue with the Apache Hudi format to build a data lake. This is a beginner video, and I have intentionally kept it simple. It gives an overview of AWS Glue and Apache Hudi, followed by a demo that builds an SCD-1 (Slowly Changing Dimension Type 1) dimension table and shows how to run updates and inserts on a Hudi table.

Video Timeline:
00:00 Introduction
00:35 What is AWS Glue
00:50 Glue Data Catalog
01:25 Glue Crawlers
02:25 Glue Studio visual builder
02:45 Apache Hudi
04:12 My intention in making this video
05:05 AWS Glue - serverless service
05:51 AWS Glue - Spark ETL
05:57 AWS Glue - Python jobs
06:10 AWS Glue vs AWS Lambda for Python jobs
06:46 AWS Glue Data Catalog
07:01 What is metadata
07:20 Metadata - business, technical, operational
08:10 Glue Data Catalog - why is it so powerful
08:40 AWS Glue crawlers for automated data discovery
09:50 Do I use Glue crawlers a lot?
10:20 Glue Studio visual builder
10:42 Do I use the visual builder a lot?
12:14 Apache Hudi - open source data lake format
12:58 Data lake vs data warehouse
14:22 Hudi ACID
14:44 Hudi versioning
14:52 Hudi integration with Glue, Athena, Redshift
15:37 Revisit the concepts before the demo
15:50 Demo (2 input files)
16:25 Demo - create Glue job
20:04 Demo - save Glue job and run it
20:18 Demo - Glue job input arguments
21:02 Demo - Glue job script walkthrough
22:42 Job complete, check table data
23:41 Demo - run second file for UPSERT (SCD-1)
25:38 Demo - Glue continuous driver logs
26:24 Demo - Hive-style partitioning in Hudi
27:01 Demo - 2nd run complete, check data
27:18 Demo - end of demo

Would you be interested in an AWS data engineering session with me?

If you wish to download the presentation slides, sample data files, and source code for the AWS Glue job, Amazon EMR PySpark application, Amazon Redshift SQL script, and Managed Airflow DAG code used in the crash course video, check the link below:
https://mailchi.mp/45b9673b727b/aws-d...
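
The upsert step from the demo can be sketched roughly as below. This is not the job script shown in the video, only a minimal illustration under stated assumptions: the table name `dim_customer` and the columns `id`, `ts`, and `country` are hypothetical, and `scd1_upsert` is a simplified pure-Python model of SCD-1 behavior (overwrite on matching key, insert on new key), not Hudi's own merge logic, which depends on the Hudi version and payload settings. The option keys in `hudi_upsert_options` are standard Hudi write configuration names.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def hudi_upsert_options(table_name: str, record_key: str,
                        precombine_field: str, partition_field: str) -> Dict[str, str]:
    """Build Hudi write options for an upsert into a Hive-style partitioned table."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.hive_style_partitioning": "true",
    }


def scd1_upsert(current: List[Row], incoming: List[Row],
                key: str, precombine: str) -> List[Row]:
    """Simplified model of SCD-1: no history is kept, the newest row per key wins."""
    # Deduplicate the incoming batch, keeping the row with the largest precombine value.
    batch: Dict[Any, Row] = {}
    for row in incoming:
        seen = batch.get(row[key])
        if seen is None or row[precombine] >= seen[precombine]:
            batch[row[key]] = row
    # Matching keys are overwritten (update); new keys are added (insert).
    merged = {row[key]: row for row in current}
    merged.update(batch)
    return sorted(merged.values(), key=lambda r: r[key])


if __name__ == "__main__":
    # Hypothetical data standing in for the two input files of the demo.
    first_file = [{"id": 1, "name": "Ann", "country": "US", "ts": 1},
                  {"id": 2, "name": "Bob", "country": "UK", "ts": 1}]
    second_file = [{"id": 2, "name": "Robert", "country": "UK", "ts": 2},
                   {"id": 3, "name": "Cho", "country": "KR", "ts": 2}]
    for row in scd1_upsert(first_file, second_file, key="id", precombine="ts"):
        print(row)
    # In a Glue Spark job, the options would typically be passed to the writer:
    #   df.write.format("hudi").options(**opts).mode("append").save(target_path)
    # where `df` and `target_path` (an S3 location) are assumed to exist in the job.
    print(hudi_upsert_options("dim_customer", "id", "ts", "country"))
```

Running the second file through the same write path is what turns the load into an SCD-1 upsert: row `id=2` is overwritten in place and row `id=3` is inserted, while `id=1` is left untouched.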