У нас вы можете посмотреть бесплатно MLeap and Combust ML (Hollin Wilkins and Mikhail Semeniuk) или скачать в максимальном доступном качестве, видео которое было загружено на ютуб. Для загрузки выберите вариант из формы ниже:
Если кнопки скачивания не
загрузились
НАЖМИТЕ ЗДЕСЬ или обновите страницу
Если возникают проблемы со скачиванием видео, пожалуйста напишите в поддержку по адресу внизу
страницы.
Спасибо за использование сервиса ClipSaver.ru
Data Scientists use myriad tools, analyze datasets, clean them and build offline models and validate their performance ad hoc. The resulting scripts are thrown across the wall to Data Engineers and Architects whose job it is to productionize these workflows. The Engineers are left with the unenviable job of not only reproducing the Data Scientists’ conclusions, but to scale the resulting pipeline both of which require a deep understanding of Data Science itself. As a result, most if not all Data Science deployments in the wild end up either too simplistic or take too long to productionize. There are a variety of challenges in productionizing data science workflows, some of which are solved by Spark itself. However, there is still a large gap that needs to be plugged: How do you take workflows that have been trained offline and produce models that have to be scored online? MLeap is an open source Spark package designed to serialize your Spark-trained pipelines and transformers, deploy them to a JVM-based API server and execute real-time, one-off requests. In this talk we motivate the need for such a library, outline programming time saved by using MLeap, show benchmarks of several online models and provide a demo as well as examples of how to use MLeap in practice. In addition, we present a platform called Combust.ML that can be used to deploy Spark trained algorithms to highly scalable, scala-backed API servers.