NVIDIA DGX Spark and DGX Station were announced by NVIDIA at GTC 2025. In this video I predict the speed, in token/s, you can expect these NVIDIA machines to deliver. I also explain the calculation method I use, so you can apply it to other models of your choice (memory bandwidth / model size = theoretical limit). I make assumptions about real-world factors and real-world performance. For the DGX Station, I discuss the GPU memory and the CPU memory, and why they need to be treated differently in this calculation. As always, I'm curious about your thoughts on this.

Highlights DGX Spark:
- DGX Spark will run DeepSeek R1 Distill Qwen 32B Q8 at 3.5 - 6.7 token/s.
- DGX Spark will run the 70B Llama (Q4) at about 1.3 - 3.6 token/s. Probably a little slower than spoken text.
- DGX Spark will run the 70B Llama (Q8) at about 1 - 3 token/s. (DeepSeek R1 Distill Llama 70B Q8 will achieve the same value.)

Highlights DGX Station:
- DGX Station will run DeepSeek R1 Distill Llama 70B Q8 at 32 - 85 token/s.
- DGX Station will run the 70B Llama FP16 at about 17 - 45 token/s.
- DGX Station will run DeepSeek R1 Q2_K_XS at about 11 - 29 token/s. This is the largest DeepSeek R1 quant I predict will fit comfortably into the GPU RAM.
- For bigger quants: DeepSeek R1 Q4_K_M - I predict a performance on DGX Station of 1 - 2.7 token/s, so quite slow. If you want better performance, or want to run a less quantized version of DeepSeek R1 on DGX Station, you will need several of them (maybe 3-4).

I predict that DGX Station will cost between $150k and $250k - stay tuned and subscribe to find out more in an upcoming video.

The data is based on this fantastic article: https://www.hardware-corner.net/guide... Thank you to Allan Witt of HardwareCorner.

Subscribe to my channel for more tips on AI for managers, entrepreneurs, and business people, and on upcoming AI tools that will save you time and make you money.
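The calculation method from the video (memory bandwidth / model size = theoretical limit, scaled by real-world factors) can be sketched as a small script. The bandwidth figure, model size, and efficiency range below are illustrative assumptions, not official specs or the video's exact inputs:

```python
# Sketch of the token/s prediction method described above.
# Theoretical ceiling: generating one token reads every model weight once,
# so token/s <= memory bandwidth / model size. Real-world factors
# (kernel overhead, KV-cache traffic, scheduling) reduce this, which we
# model here with an assumed efficiency range.

def predicted_tokens_per_second(bandwidth_gb_s: float,
                                model_size_gb: float,
                                efficiency_low: float = 0.4,
                                efficiency_high: float = 0.8) -> tuple[float, float]:
    """Return a (low, high) token/s estimate for memory-bound inference."""
    ceiling = bandwidth_gb_s / model_size_gb  # theoretical limit
    return ceiling * efficiency_low, ceiling * efficiency_high

# Example with assumed numbers: ~273 GB/s of unified memory bandwidth
# and a 70B model at Q4 weighing roughly 40 GB on disk.
low, high = predicted_tokens_per_second(bandwidth_gb_s=273, model_size_gb=40)
print(f"Predicted: {low:.1f} - {high:.1f} token/s")
```

You can swap in any model size and bandwidth of your choice; the same two-line formula is all the method needs.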