SparkOscope: Enabling Apache Spark Optimization through Cross Stack Monitoring - Yiannis Gkoufas
"During the last year, the team at IBM Research in Ireland has been using Apache Spark to perform analytics on large volumes of sensor data. These applications need to run daily, so it was essential for the team to understand Spark resource utilization. They found it cumbersome to manually consume and efficiently inspect the CSV files for the metrics generated at the Spark worker nodes. Although an external monitoring system like Ganglia would automate this process, they were still unable to derive temporal associations between system-level metrics (e.g. CPU utilization) and job-level metrics (e.g. job or stage ID) as reported by Spark. For instance, they could not trace a peak in HDFS reads or CPU usage back to the code in their Spark application causing the bottleneck. To overcome these limitations, they developed SparkOscope. To take advantage of the job-level information already available and to minimize source-code pollution, they use the existing Spark Web UI to monitor and visualize job-level metrics of a Spark application (e.g. completion time). More importantly, they extend the Web UI with a palette of system-level metrics of the server/VM/container that each of the Spark job's executors ran on. Using SparkOscope, you can navigate to any completed application and identify application-logic bottlenecks by inspecting the various plots providing in-depth time series for all relevant system-level metrics related to the Spark executors, while also easily associating them with the stages, jobs and even source code lines incurring the bottleneck. They have made SparkOscope available as a standalone module, and also extended the available Sinks (MongoDB, MySQL). Session hashtag: #SFdev16"
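The temporal association described in the abstract, tracing a peak in a system-level metric back to the stages running at that moment, can be illustrated with a minimal sketch. This is not SparkOscope's implementation: the `StageInterval` and `MetricSample` types, the helper functions, and all the data values are hypothetical and exist only to show the idea of joining metric timestamps against stage start and end times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageInterval:
    """Hypothetical record of when a Spark stage ran (seconds)."""
    stage_id: int
    start: float
    end: float


@dataclass(frozen=True)
class MetricSample:
    """Hypothetical system-level sample, e.g. CPU utilization in percent."""
    timestamp: float
    value: float


def stages_active_at(stages, timestamp):
    """Return the IDs of all stages whose interval contains the timestamp."""
    return [s.stage_id for s in stages if s.start <= timestamp <= s.end]


def attribute_peak(stages, samples):
    """Find the highest sample and the stages running when it was taken."""
    peak = max(samples, key=lambda m: m.value)
    return peak, stages_active_at(stages, peak.timestamp)


# Made-up data: three stages (two of them overlapping) and four CPU samples.
stages = [
    StageInterval(stage_id=0, start=0.0, end=10.0),
    StageInterval(stage_id=1, start=10.5, end=30.0),
    StageInterval(stage_id=2, start=25.0, end=40.0),
]
samples = [
    MetricSample(timestamp=5.0, value=35.0),
    MetricSample(timestamp=15.0, value=60.0),
    MetricSample(timestamp=27.0, value=97.0),
    MetricSample(timestamp=35.0, value=50.0),
]

peak, active = attribute_peak(stages, samples)
print(f"Peak of {peak.value}% at t={peak.timestamp}s during stages {active}")
```

With this made-up data the peak falls where stages 1 and 2 overlap, so both are reported as candidates. Narrowing the result further, for example to a source code line, would need additional information that this sketch does not model.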