Welcome to the third lecture for Act II of "Hi, JAX!", an introduction to vanilla JAX for deep learning research! Today we're going to accelerate and train a 2.5 million parameter generative transformer on the 3.9 million bytes comprising the entire Sherlock Holmes canon. If you ignore tokenisation and a few orders of magnitude in data, parameters, and training steps, that's basically like saying we're compiling GPT! To do so, we're going to have to learn all about static parameters in JAX (a couple of illustrative sketches follow the chapter list below).

Note: This is the longest Hi, JAX! tutorial yet, and I already cut out 2hrs of footage implementing the transformer architecture! Feel free to tap out after about 1hr43mins, when we train the transformer successfully for the first time; or, if you are ambitious, stick around for the last quarter, where I improvise accelerating the autoregressive completion loop itself using jax.lax.dynamic_slice.

Links:
Course webpage: https://github.com/matomatical/hijax
Course playlist: • Hi, JAX! Introduction to vanilla JAX for d...

Chapters:
0:00:00 Introduction
0:01:37 We're on a TPU today
0:03:06 Static parameters and recompilation
0:10:44 Only some types can be static/dynamic
0:22:06 Starter code
0:28:26 Loading the data
0:34:53 Accelerating model initialisation
0:46:07 Accelerating the forward pass
0:47:18 Testing the model
0:56:52 Implementing token generation
1:26:02 Cumulative generation during training
1:43:55 Accelerating the completion loop
1:48:31 Accelerating the completion loop properly
2:05:00 Debugging
2:24:39 Challenge
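
For a taste of what "static parameters" means here, a minimal sketch (my own example, not code from the lecture): marking an argument as static with `static_argnames` lets it determine array shapes inside a jitted function, at the cost of a fresh compilation for every new value, which is the recompilation behaviour the lecture digs into.

```python
import functools
import jax
import jax.numpy as jnp

# Minimal sketch (my own example, not the lecture's code). Marking `seq_len`
# as static means jax.jit treats it as a compile-time constant: it can define
# an array shape, but every new value triggers a recompilation.
@functools.partial(jax.jit, static_argnames="seq_len")
def causal_mask(seq_len):
    # seq_len is an ordinary Python int here, so it can be used as a shape
    return jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))

mask_small = causal_mask(seq_len=8)    # compiles for seq_len=8
mask_large = causal_mask(seq_len=16)   # recompiles for seq_len=16
```

And a rough idea of how jax.lax.dynamic_slice can help inside a compiled generation loop (the shapes and the helper `last_k_tokens` are hypothetical, not the lecture's implementation): the start index may be a traced value as long as the slice length stays static.

```python
import functools
import jax
import jax.numpy as jnp

# Minimal sketch (assumed shapes and a hypothetical helper, not the lecture's
# implementation). Inside a jit-compiled generation loop the current position
# is a traced integer, so ordinary Python slicing with it fails; dynamic_slice
# accepts traced start indices as long as the slice *length* is static.
@functools.partial(jax.jit, static_argnames="k")
def last_k_tokens(tokens, pos, k):
    # start index (pos - k) may be traced; slice size (k,) must be static
    return jax.lax.dynamic_slice(tokens, (pos - k,), (k,))

tokens = jnp.arange(128)
print(last_k_tokens(tokens, 64, k=16))  # -> tokens 48..63
```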