Accelerating Apache Parquet with metadata stores and specialized indexes using Apache DataFusion
It is a common misconception that querying Apache Parquet data is constrained to the basic metadata built into the format itself and is therefore slower than querying proprietary formats. Parquet does contain standard Min/Max metadata, a "Page Index", and Bloom filters, and with open source composable systems such as Apache DataFusion it is possible to build sophisticated caches and specialized, system-specific indexes while retaining broad ecosystem compatibility. In this talk I review the structures built into Parquet for query acceleration, and demonstrate how to use a cache for parsed metadata, push row group and page pruning into a metadata store, and build a specialized index for multi-column primary keys.

Speaker Bio: Andrew Lamb is a Staff Engineer at InfluxData, working in Rust on InfluxDB 3.0, focused on query processing, the Apache DataFusion query engine, and the Apache Arrow ecosystem. He serves on the Apache DataFusion PMC (current Chair) and on the Apache Arrow PMC, and actively contributes to DataFusion and the Arrow Rust implementations. He earned a BS and MEng in Course VI from MIT. More details are available at http://andrew.nerdnetworks.org/

Presentation Slides: https://docs.google.com/presentation/...

Links to examples I refer to in the video:
https://github.com/apache/datafusion/...
https://github.com/apache/datafusion/...
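
To make the idea of pushing row group pruning into a metadata store more concrete, here is a minimal sketch in plain Python. It is not the DataFusion API and not the code from the talk: `RowGroupStats`, `MetadataStore`, the file names, and the `ts` column are all hypothetical, and the store is just an in-memory dictionary. It only illustrates the general technique of keeping per-row-group Min/Max values outside the Parquet files and using them to decide which row groups a range predicate could possibly match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowGroupStats:
    """Min/max values for one column in one row group (hypothetical layout)."""
    file: str
    row_group: int
    min_value: int
    max_value: int


class MetadataStore:
    """Toy in-memory stand-in for an external metadata store (an assumption)."""

    def __init__(self):
        # column name -> list of RowGroupStats, in insertion order
        self._stats = {}

    def add(self, column, stats):
        self._stats.setdefault(column, []).append(stats)

    def prune(self, column, lo, hi):
        """Return (file, row_group) pairs whose [min, max] overlaps [lo, hi].

        Row groups whose range cannot overlap the predicate are skipped
        without opening the Parquet file at all.
        """
        return [
            (s.file, s.row_group)
            for s in self._stats.get(column, [])
            if s.max_value >= lo and s.min_value <= hi
        ]


# Hypothetical statistics for a "ts" column spread over two files.
store = MetadataStore()
store.add("ts", RowGroupStats("a.parquet", 0, 0, 99))
store.add("ts", RowGroupStats("a.parquet", 1, 100, 199))
store.add("ts", RowGroupStats("b.parquet", 0, 200, 299))

# Predicate: ts BETWEEN 150 AND 220
candidates = store.prune("ts", 150, 220)
print(candidates)
```

Only the surviving row groups would then need to be read and decoded. In a real system the statistics would come from the Parquet footers and be kept in a persistent store, which this sketch does not attempt to model.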