AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms | Nicusan | JuliaCon Global 2025

AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms by Andrei-Leonard Nicusan

PreTalx: https://pretalx.com/juliacon-2025/tal...

In this talk I present AcceleratedKernels.jl, a library that provides a unified interface for writing parallel algorithms in Julia. The library is built on KernelAbstractions.jl, which allows high-level Julia code to be compiled into efficient kernels for a range of hardware. AcceleratedKernels.jl supports both multithreaded CPUs and GPUs from several vendors (CUDA, ROCm, oneAPI, Metal) using a single codebase. This design removes the need to write separate code for each target, making it easier for developers to write and maintain high-performance applications.

Key points in the talk include:

Unified Codebase: I will describe how the same Julia user code can be used to produce high-performance kernels for different hardware.

Performance Benchmarks: I will present benchmark results that compare AcceleratedKernels.jl with traditional implementations. Benchmarks for operations like sorting, mapreduce, and arithmetic computations show that the performance of kernels generated by AcceleratedKernels.jl is comparable to that of code written in C with OpenMP (on CPUs) and vendor libraries like Nvidia Thrust (on GPUs). These tests have been run on different architectures, from desktop CPUs to data-center GPUs, and the results demonstrate competitive speed and scalability.

Developer Experience: I will show how to write custom kernels in Julia with minimal changes to existing code, with the aim of writing a user application or library that works transparently across architectures, without special-cased kernels for GPUs or explicit multithreading. This also allows composable CPU-GPU co-processing across Julia libraries.

Real-World Applications: I will discuss several use cases from scientific computing and industry where the ability to run the same code on different hardware is valuable. Examples include multi-node data sorting and numerical simulations where parallel execution is critical, in particular Lagrangian simulations such as the Discrete Element Method, Molecular Dynamics, or N-Body Simulations.

Future Work: I will outline planned improvements for AcceleratedKernels.jl, such as adding automated tuning for algorithm parameters, extending the range of available algorithms, and supporting emerging hardware platforms. I will also discuss how contributions from the community can help shape the future of the library.

AcceleratedKernels.jl was created to simplify parallel programming by reducing the need for hardware-specific code. Instead of writing separate kernels for each target, developers write a single function that runs across all supported devices.

The talk will also include a live demonstration. I will write a simple kernel in Julia and show how it runs on both a CPU and a GPU without any modifications. I will discuss some challenges encountered during development, such as algorithm and interface design choices. Finally, I will place AcceleratedKernels.jl within the broader Julia ecosystem and show its composability across separate libraries.

In summary, this session provides a detailed overview of AcceleratedKernels.jl, covering its design, performance, and practical applications. Attendees will learn how to write portable parallel code in Julia using a single, unified API and understand the trade-offs involved in cross-architecture programming. This talk is aimed at developers, researchers, and anyone interested in high-performance computing with Julia, and it offers practical insights into writing code that runs efficiently on modern hardware.
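
As a rough illustration of the "single function across devices" idea, here is a minimal sketch that is not taken from the talk. It assumes the package is installed and uses its `AK.foreachindex` and `AK.reduce` entry points. The helper name `axpy!` and the array sizes are hypothetical, chosen only for this example. It runs on ordinary CPU arrays as written.

```julia
import AcceleratedKernels as AK

# Hypothetical helper (not from the talk): y[i] = a * x[i] + y[i].
# The loop body is written once; the backend is picked from the array type.
function axpy!(y, a, x)
    AK.foreachindex(y) do i
        @inbounds y[i] = a * x[i] + y[i]
    end
    return y
end

# Plain CPU arrays: the loop is split across Julia threads.
x = ones(Float32, 1_000)
y = fill(2f0, 1_000)
axpy!(y, 3f0, x)

# Parallel reduction over the result; `init` is the starting value.
total = AK.reduce(+, y; init=0f0)

# Assumption: with a GPU package loaded (for example CUDA.jl), passing GPU
# arrays such as CuArray(x) and CuArray(y) to the same axpy! should run it
# as a GPU kernel, with no change to the function body.
```

The point of the sketch is that `axpy!` contains no device-specific code: which backend executes the loop depends only on the type of the arrays passed in.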