GGUF quantization of LLMs with llama cpp
Would you like to run LLMs on your laptop or on small devices such as mobile phones and watches? If so, you will need to quantize them. llama.cpp is an open-source library written in C and C++. It lets us quantize a given model and run LLMs without GPUs. In this video, I demonstrate how to quantize a fine-tuned LLM on a MacBook and run inference with it on the same MacBook. I quantize the 2-billion-parameter Gemma model that we fine-tuned in my previous tutorial, but you can use the same steps to quantize any other fine-tuned LLM of your choice.

MY KEY LINKS
YouTube: / @aibites
Twitter: / ai_bites
Patreon: / ai_bites
GitHub: https://github.com/ai-bites

WHO AM I?
I am a Machine Learning researcher/practitioner who has seen the grind of academia and start-ups. I started my career as a software engineer 15 years ago. Because of my love for Mathematics (coupled with a glimmer of luck), I graduated with a Master's in Computer Vision and Robotics in 2016, just as the now-happening AI revolution started. Life has changed for the better ever since.

#machinelearning #deeplearning #aibites
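
For readers wondering what "quantizing" a model means in practice, here is a minimal Python sketch of the general idea: weights are split into small blocks, and each block is stored as 8-bit integer codes plus a single floating-point scale. This is an illustration only, not llama.cpp's implementation and not taken from the video. The block size of 32 and the function names (`quantize_block`, `dequantize_block`, `quantize`, `dequantize`) are assumptions made for the example, and the real GGUF quantization types differ in their details.

```python
import random

BLOCK_SIZE = 32  # assumption for this sketch: weights are grouped in blocks of 32


def quantize_block(values):
    """Turn a block of floats into (scale, int8-range codes)."""
    max_abs = max(abs(v) for v in values)
    # One scale per block maps the largest magnitude onto 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from a quantized block."""
    return [scale * c for c in codes]


def quantize(weights):
    """Quantize a flat list of weights block by block."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    restored = []
    for scale, codes in blocks:
        restored.extend(dequantize_block(scale, codes))
    return restored


if __name__ == "__main__":
    random.seed(0)
    weights = [random.uniform(-1.0, 1.0) for _ in range(64)]  # made-up weights
    blocks = quantize(weights)
    restored = dequantize(blocks)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"blocks: {len(blocks)}, max round-trip error: {max_err:.5f}")
```

Each weight now needs one byte plus a small per-block share of the scale, instead of two or four bytes, at the cost of a rounding error of at most half a scale step per weight. That trade of precision for size is what makes it practical to run a model on a laptop without a GPU.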