Dask Bag in 8 Minutes: An Introduction
In this video, Matt Rocklin gives a brief introduction to Dask Bags. Dask is a free and open-source library for parallel computing in Python. Dask is a community project maintained by developers and organizations.

Dask Bag implements operations like map, filter, fold, and groupby on collections of generic Python objects. It does this in parallel with a small memory footprint using Python iterators. It is similar to a parallel version of PyToolz or a Pythonic version of the PySpark RDD.

Dask Bags coordinate many Python lists or iterators, each of which forms a partition of a larger collection. Dask Bags are often used to parallelize simple computations on unstructured or semi-structured data like text data, log files, JSON records, or user-defined Python objects.

Execution on bags provides two benefits:

Parallel: data is split up, allowing multiple cores or machines to execute in parallel.
Iterating: data is processed lazily, allowing smooth execution of larger-than-memory data, even on a single machine within a single partition.

Share your feedback with us in the comments and let us know: Did you find the video helpful? What do you think of Dask Bags?

Learn more at https://docs.dask.org/en/latest/bag.html

KEY MOMENTS
00:00:00 Intro
00:00:24 Dask Bag Example
00:02:05 Example Using JSON
00:03:30 Analyze Data with map, filter, and reductions
00:06:49 Convert to Dask Dataframe
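
The partition-and-iterator model described above can be illustrated with a small toy sketch. It uses only the Python standard library and is not how Dask itself is implemented, nor code from the video. The sample records, the `lazy_pipeline` helper, and the `fold` helper are hypothetical names chosen for illustration. Each partition is a plain list, filtering and mapping happen lazily through iterators, and a fold reduces each partition separately before combining the partial results.

```python
from functools import reduce

# Hypothetical sample data: JSON-like records split into two partitions.
partitions = [
    [{"name": "Alice", "age": 34}, {"name": "Bob", "age": 17}],
    [{"name": "Carol", "age": 52}, {"name": "Dan", "age": 29}],
]

def lazy_pipeline(partition):
    """Lazily filter and map one partition; nothing runs until it is iterated."""
    adults = filter(lambda rec: rec["age"] >= 18, partition)
    return map(lambda rec: rec["age"], adults)

def fold(parts, binop, combine, initial):
    """Reduce each partition on its own, then combine the partial results.

    In this toy version the partitions are processed one after another;
    a parallel system could hand each partition to a different worker.
    """
    partials = [reduce(binop, lazy_pipeline(p), initial) for p in parts]
    return reduce(combine, partials, initial)

# Sum of ages and number of records among adults, computed per partition.
total_adult_age = fold(partitions, lambda acc, x: acc + x, lambda a, b: a + b, 0)
adult_count = fold(partitions, lambda acc, _: acc + 1, lambda a, b: a + b, 0)
print(total_adult_age, adult_count)
```

Because only one record per partition is held in flight at a time, the same pattern would work when a partition is a file iterator rather than an in-memory list, which is the idea behind the "Iterating" benefit listed above.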