From Data to Business Insights: PySpark on Databricks for Amazon Prime Dataset Analysis 📊🚀
Welcome to DataToCrunch! 🚀 In this tutorial, we dive deep into Databricks and PySpark to analyze the Amazon Prime Movies and TV Shows dataset. 📊 Whether you're a beginner or an experienced data enthusiast, this step-by-step guide will walk you through:

1. Setting up Databricks and ingesting data. ⚙️
2. Cleaning and transforming the data for actionable insights. 🔄
3. Conducting exploratory data analysis (EDA) to uncover trends and patterns. 🔍
4. Deriving key business insights, including content type distribution, genre popularity, top-rated content, and global availability. 🌍
5. Leveraging advanced analytics such as window functions to rank content by rating. 📈
6. Exporting processed data to various formats for sharing and further use. 💾

By the end of this video, you'll understand how to use PySpark in Databricks for real-world business analysis and decision-making. 💡

Timestamps:
00:00:00 - Intro
00:00:32 - Agenda
00:02:12 - Dataset Understanding
00:04:12 - A] Create Compute
00:05:04 - B] Data Ingestion
00:05:39 - C] Create Folder & Notebook
00:06:51 - D] Practical Implementation Begins - Part 1: Dataset Overview
00:07:07 - 1) List the Contents of DBFS
00:09:06 - 2) Read the CSV File
00:12:38 - 3) Read the Schema
00:13:25 - Part 2: Data Cleaning
00:14:26 - 4) Replace Null Values with Defaults
00:17:19 - 5) Remove Rows with Any Null Values
00:18:44 - 6) Remove Duplicate Rows
00:20:09 - 7) Rename a Column
00:22:04 - Part 3: Data Transformation
00:22:46 - 8) Create a New Column & Transform Existing Data
00:27:43 - 9) Split a Column & Count the Occurrences
00:35:05 - Part 4: Exploratory Data Analysis (EDA)
00:35:30 - 10) Content Type Distribution (GroupBy, OrderBy)
00:37:39 - 11) Content Production by Year (GroupBy, OrderBy, Count)
00:40:12 - Part 5: Business Insights
00:40:48 - 12) Top-Rated Content (Filter, OrderBy)
00:42:25 - 13) Content Availability by Country (GroupBy, OrderBy, Count)
00:44:05 - Part 6: Advanced Analytics
00:44:29 - 14) Rank the Content by Rating for Each Year (Window)
00:50:30 - Part 7: Data Export
00:50:59 - 15) Data Export to Delta Format (Format, Mode)
00:53:00 - 16) Data Export to CSV Format (Format, Mode, Concat_ws)
00:58:11 - E] Conclusion for Business Analysis
00:59:43 - F] Strategic Recommendations for Business
01:00:30 - G] Conclusion

🛠 Resources:
1. Dataset link: https://www.kaggle.com/datasets/shiva...
2. Introduction to Apache Spark | Databricks (Theory) - Part 1: • Introduction to Apache Spark | Databricks ...
3. Spark & Databricks - Spark Architecture | Memory Management | Application Workflow (Theory) - Part 2: • Spark & Databricks - Spark Architecture |M...
4. Spark & Databricks: RDDs | DataFrames | Datasets | Spark Ecosystem | RDD Operations (Theory) - Part 3: • Spark & Databricks: RDDs| DataFrames| Data...
5. Apache Spark & Databricks: Lazy Evaluation | Fault Tolerance | DAG | Catalyst Optimizer (Theory) - Part 4: • Apache Spark & Databricks: Lazy Evaluation...
6. Databricks Journey Begins: Compute, Catalog, Workflows, Data Management, and More!: • Databricks Journey Basic Concepts : Comput...

#Databricks #PySpark #DataAnalysis #AmazonPrime #EDA #BigData #DataScience #DataEngineering #BusinessInsights #Spark #DataTransformation #AmazonPrimeAnalysis #DataVisualization #TechTutorials #MachineLearning #ArtificialIntelligence #DataCrunching #KaggleDataset #Analytics #DataDriven #StreamingData #apachespark #databricksforbeginners #databrickstutorial #databricksai

📊 Don't forget to like 👍, share 📤, and subscribe 🔔 to DataToCrunch for more business-focused data tutorials. Let’s crunch some data together! 🤖