Video #47: RAG Project Part 2: Chunking, Vector Stores & Final Implementation
🚀 Follow the Full Course Playlist here: • LangChain Full Course: Step-by-Step Tutori...
We are completing the puzzle! 🧩 In Video #47 of our LangChain Full Course, we take the YouTube transcript from the previous video and turn it into a fully functional RAG-based AI Assistant.
In this hands-on video, we cover the critical "middle steps" of AI engineering: splitting long text into manageable chunks, converting those chunks into high-dimensional vectors with OpenAI, and storing them in a high-performance FAISS vector store. Finally, we build a custom prompt that instructs the LLM to answer only from the provided context, which greatly reduces hallucinations.
✅ In this video, we cover:
Recursive Document Chunking: Why chunk_size=500 with a 50-character overlap is the "sweet spot" for transcripts (see the sketch after this list).
FAISS Vector Store: Implementing a lightning-fast local vector database.
OpenAI Embeddings: Using the text-embedding-3-small model for cost-effective semantic search.
Context Augmentation: How to join retrieved documents into a single block of knowledge.
Prompt Engineering: Writing a strict system prompt to ensure the AI stays "grounded" in the transcript.
The Final Demo: Testing the system with a specific question about the video content.
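Before the full code, here is a quick standalone illustration of what those chunking parameters actually do. This is a minimal sketch on dummy text (not part of the project code), and the chunk boundaries are approximate because the splitter prefers to break on separators like spaces and newlines:

from langchain.text_splitter import RecursiveCharacterTextSplitter

demo_text = "word " * 300  # ~1,500 characters of dummy text
demo_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
demo_chunks = demo_splitter.split_text(demo_text)
print(len(demo_chunks))        # a handful of chunks, each at most ~500 characters
print(demo_chunks[0][-50:])    # roughly the last 50 characters of one chunk...
print(demo_chunks[1][:50])     # ...reappear at the start of the next (the overlap)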
Full Project Code
from youtube_transcript_api import YouTubeTranscriptApi
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import PromptTemplate
from dotenv import load_dotenv
load_dotenv()
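# Loads variables from a local .env file; OpenAIEmbeddings and ChatOpenAI
# below expect OPENAI_API_KEY to be set there.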
# 1. Extraction (Recap)
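# Fetch the English subtitle snippets for the video and stitch them into one long string.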
ytt = YouTubeTranscriptApi()
transcript_obj = ytt.fetch(video_id="qEfPBt9dU60", languages=['en'])
subtitles = [snippet.text for snippet in transcript_obj]
finaltranscript = " ".join(subtitles)
# 2. Document Chunking
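# Break the transcript into ~500-character Document chunks; the 50-character
# overlap preserves context across chunk boundaries.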
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
listofDocs = splitter.create_documents(texts=[finaltranscript])
# 3. Vector Store & Embeddings
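# Embed every chunk with text-embedding-3-small and index the vectors
# in an in-memory FAISS store.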
embeddings = OpenAIEmbeddings(model='text-embedding-3-small')
vector_store = FAISS.from_documents(documents=listofDocs, embedding=embeddings)
retriever = vector_store.as_retriever()
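# as_retriever() defaults to top-4 similarity search; pass search_kwargs={'k': 2}
# to change how many chunks are retrieved.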
# 4. Retrieval
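# Embed the question and pull back the most similar transcript chunks.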
question = 'What will happen to the astronauts who are watching the explosion?'
results = retriever.invoke(question)
doclist = [doc.page_content for doc in results]
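# Join the retrieved chunks into a single context block for the prompt.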
AugmentedText = "
".join(doclist)
# 5. LLM Generation with Grounded Prompt
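# A strict prompt keeps the model grounded: it must answer from {context}
# or admit the answer is missing.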
llm = ChatOpenAI(model='gpt-3.5-turbo')
template = PromptTemplate(template="""
You are a helpful Assistant.
Answer from the provided transcript context only.
Very Important: If context is insufficient, just say it is not mentioned in the provided context.
Following is my context:
{context}
Question: {question}
""", input_variables=['context', 'question'])
prompt = template.invoke({'context': AugmentedText, 'question': question})
result = llm.invoke(prompt)
print("--- AI RESPONSE ---")
print(result.content)
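Bonus: if you want to ask several questions without rebuilding the prompt by hand each time, the same pieces can be wired into a small LCEL chain. This is a minimal sketch reusing the retriever, template, and llm objects defined above; the chain itself is not part of the video's code:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    # Same augmentation step as above: join retrieved chunk texts with newlines.
    return "\n".join(doc.page_content for doc in docs)

rag_chain = (
    {'context': retriever | format_docs, 'question': RunnablePassthrough()}
    | template
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke('What will happen to the astronauts who are watching the explosion?'))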
#LangChain #RAG #FAISS #OpenAI #AIProject #VectorStore #PromptEngineering #SemanticSearch #LLM #PythonAI #GenerativeAI #CodingTutorial #AIEngineering