Build an ElevenLabs Clone: PyTorch, Next.js 15, AWS, Inngest, FastAPI, React, Tailwind (2025)
Source code: https://github.com/Andreaswt/elevenla...
Part 2: • (2/2) Build an ElevenLabs Clone: PyTo...
Discord & More: https://andreastrolle.com
Inngest: https://innge.st/yt-andreas-1

Hi 🤙 In this video, you'll build a full-stack ElevenLabs clone with text-to-speech, voice conversion, and audio generation. Some tutorials would just call an API like ElevenLabs', but not us! Instead of external API services, you'll self-host three AI models (StyleTTS2, Seed-VC, and Make-An-Audio) from GitHub, fine-tune them to specific voices, then containerize them with Docker and expose inference endpoints via FastAPI. The AI backend will be built using Python and PyTorch.

You'll create a Next.js application where users can use the AI models to generate audio, and also switch between voices and view previously generated audio files, stored in an S3 bucket. The project includes user authentication, a credit system, and an Inngest queue to prevent overloading of the server hosting the AI models. The web application is built on the T3 Stack with Next.js, React, Tailwind, and Auth.js. Follow along for the entire process from development to deployment.

Features
🔊 Text-to-speech synthesis with StyleTTS2
🎭 Voice conversion with Seed-VC
🎵 Audio generation from text with Make-An-Audio
🤖 Custom voice fine-tuning capabilities
🐳 Docker containerization of AI models
🚀 FastAPI backend endpoints
🔄 Inngest queue to prevent server overload
📊 User credit management system
💾 AWS S3 for audio file storage
👥 Multiple pre-trained voice models
📱 Responsive Next.js web interface
🔐 User authentication with Auth.js
🎛️ Voice picker
📝 Generated audio history
🎨 Modern UI with Tailwind CSS

💲 Costs + How to follow along for free
The total fine-tuning cost for both models is ~5-10 USD. When deploying the endpoint it’s ~1 USD per hour of uptime. S3 is really cheap. IAM roles, users etc are free.
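
To get a feel for these numbers, here is a minimal back-of-the-envelope sketch. It uses only the two figures quoted above (~5-10 USD one-time fine-tuning, ~1 USD per hour of endpoint uptime) and treats S3 and IAM as negligible. The function name `estimate_cost` and its parameters are made up for illustration; actual AWS pricing depends on instance type and region.

```python
# Figures quoted in the description; everything else here is illustrative.
FINE_TUNE_COST_USD = (5.0, 10.0)  # one-time total for fine-tuning (low, high)
ENDPOINT_USD_PER_HOUR = 1.0       # approximate cost while the endpoint is up


def estimate_cost(uptime_hours: float, fine_tune: bool = True) -> tuple[float, float]:
    """Return a rough (low, high) USD estimate; S3 and IAM are ignored."""
    if uptime_hours < 0:
        raise ValueError("uptime_hours must be non-negative")
    hosting = uptime_hours * ENDPOINT_USD_PER_HOUR
    low, high = FINE_TUNE_COST_USD if fine_tune else (0.0, 0.0)
    return (low + hosting, high + hosting)


if __name__ == "__main__":
    # Fine-tune once, then keep the endpoint up for 8 hours.
    print(estimate_cost(8))
    # Skip fine-tuning and only run the endpoint for 3 hours.
    print(estimate_cost(3, fine_tune=False))
```

The takeaway is that uptime dominates quickly: a full day with the endpoint running costs more than the fine-tuning itself, so shutting the instance down when you are not using it matters most.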
Following along for free:
- When building the Next.js application in part 2 of the video, I create a mock endpoint, which means you don't have to host the AI models with EC2 unless you want to learn it. You can just use that mock endpoint throughout the video.
- Don't fine-tune the models, but just use the model files (.pth) made by the researchers, as I also do before fine-tuning.
- Don't create EC2 instances. They are the main cost driver.
- S3 buckets are required for the voice-to-voice feature. If you want this feature, stay within the 5GB free tier. See more under storage and S3 here: https://aws.amazon.com/free
- You can of course still follow the video, learn the concepts, and code along. You can also test the Docker containers for training locally, without training the model, to learn as much as possible without the actual fine-tuning.

📖 Chapters
00:00:00 Demo
00:03:45 Theory and Plan
00:46:34 Python installation
00:47:50 TTS w. StyleTTS2
01:28:40 Preparing dataset
01:45:03 Fine-tune preparation
02:22:28 AWS setup
02:32:25 EC2 fine-tuning
02:49:21 API for TTS
03:49:29 Voice-changer w. seed-vc
04:02:51 Seed-vc fine-tuning
04:12:14 Seed-vc API
04:52:42 Text-to-SFX w. make-an-audio
05:00:21 Text-to-SFX API
05:18:19 Docker-compose
05:21:05 Deploying AIs to EC2
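
If you want to try the "mock endpoint" idea from the free route before watching part 2, the sketch below shows one way such a stand-in could work: it skips the model entirely and returns a short silent WAV clip, so the rest of an app has something to play and store. This is not the code from the video (the video builds its mock inside the Next.js app). It uses only the Python standard library, and the names `make_silent_wav` and `mock_tts`, the response fields, and the 50 ms-per-character rule are all assumptions for illustration.

```python
import io
import wave


def make_silent_wav(duration_ms: int, sample_rate: int = 16000) -> bytes:
    """Build a mono 16-bit PCM WAV of silence, entirely in memory."""
    n_frames = sample_rate * duration_ms // 1000
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def mock_tts(text: str, voice: str = "placeholder-voice") -> dict:
    """Hypothetical stand-in for a TTS call; no AI model is involved."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # Assumed rule: 50 ms of silence per character, capped at 10 seconds,
    # so the clip length at least varies with the input.
    duration_ms = min(50 * len(text), 10_000)
    return {
        "voice": voice,
        "content_type": "audio/wav",
        "audio": make_silent_wav(duration_ms),
    }


if __name__ == "__main__":
    result = mock_tts("Hello from the mock endpoint")
    print(result["content_type"], len(result["audio"]), "bytes")
```

A stand-in like this lets you build and test the UI, the credit system, and the audio history without paying for an EC2 instance. You can swap in the real inference endpoint later.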