Using Apache Spark for Processing Trillions of Records Each Day | Datadog
Download the slides for this talk: https://goo.gl/QaXR7k

Massively scaling Apache Spark can be challenging, but it's not impossible. In this session we'll share Datadog's path to successfully scaling Spark and the pitfalls we encountered along the way. We'll discuss some low-level features of Spark, Scala, and the JVM, and the optimizations we had to make in order to scale our pipeline to handle trillions of records every day. We'll also talk about some of the unexpected behaviors of Spark regarding fault tolerance and recovery—including the ExternalShuffleService, recomputing partitions, and shuffle fetch failures—which can complicate your scaling efforts.

ABOUT DATA COUNCIL:
Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.

FOLLOW DATA COUNCIL:
Twitter: / datacouncilai
LinkedIn: / datacouncil-ai
Facebook: / datacouncilai
Eventbrite: https://www.eventbrite.com/o/data-cou...
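The fault-tolerance behaviors mentioned above (the ExternalShuffleService and shuffle fetch failures) are usually addressed through Spark configuration. As a minimal illustrative sketch—the specific values below are assumptions for demonstration, not Datadog's actual settings from the talk—a `spark-defaults.conf` enabling the external shuffle service and hardening shuffle fetches might look like:

```properties
# Serve shuffle blocks from the external shuffle service so that shuffle
# output survives executor loss (and works with dynamic allocation).
spark.shuffle.service.enabled      true
spark.dynamicAllocation.enabled    true

# Make shuffle fetches more tolerant of transient network failures
# before Spark marks the fetch as failed and recomputes the upstream
# partitions (values here are illustrative, not recommendations).
spark.shuffle.io.maxRetries        10
spark.shuffle.io.retryWait         15s
```

With the external shuffle service disabled, losing an executor also loses the shuffle files it wrote, forcing Spark to recompute those partitions—one of the recovery pitfalls the session covers.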