Tubechat — Chat with Any YouTube Video

A multimodal AI web application that lets you have a conversation with any YouTube video. Paste a link, and get answers grounded in the transcript — with source excerpts and a confidence score.

Built as part of the Building Multimodal AI Applications with LangChain & the OpenAI API project (July 2025).

What It Does

Paste any YouTube URL and click Analyze
The app downloads the audio, transcribes it using Azure Whisper
The transcript is chunked, embedded with Azure text-embedding-3-small, and stored in a FAISS vector store
Ask questions in natural language — Azure GPT-4o retrieves the most relevant transcript chunks and returns a structured answer with confidence level and source excerpts
Session history is shown in the collapsible sidebar

Tech Stack

Layer	Technology
Frontend	React 18, Vite
Backend	FastAPI, Python
Transcription	Azure OpenAI Whisper
Embeddings	Azure OpenAI text-embedding-3-small
Vector Store	FAISS (local)
LLM	Azure OpenAI GPT-4o
Orchestration	LangChain
Audio Download	yt-dlp

Project Structure

tubechat/
├── backend/
│   ├── main.py           # FastAPI app — API routes
│   ├── models.py         # Pydantic request/response schemas
│   ├── downloader.py     # yt-dlp YouTube audio download
│   ├── transcriber.py    # Azure Whisper transcription
│   ├── qa_chain.py       # LangChain FAISS + GPT-4o Q&A chain
│   ├── requirements.txt
│   ├── .env.example      # Environment variable template
│   └── Procfile          # Render deployment
└── frontend/
    ├── src/
    │   ├── App.jsx
    │   ├── config.js
    │   └── components/
    │       ├── Sidebar.jsx
    │       ├── LandingPage.jsx
    │       ├── ChatPage.jsx
    │       ├── UrlInput.jsx
    │       ├── MessageBubble.jsx
    │       └── AnswerCard.jsx
    ├── .env.example       # Environment variable template
    └── vercel.json        # Vercel deployment config

Prerequisites

Python 3.10+
Node.js 18+
FFmpeg on PATH — needed by yt-dlp for audio conversion
```
# Windows
winget install Gyan.FFmpeg
```
An Azure AI Foundry project with three deployed models:
- gpt-4o
- text-embedding-3-small
- whisper

Local Setup

1. Clone the repository

git clone https://github.com/khavya-798/tubechat.git
cd tubechat

2. Backend setup

cd backend

# Create and activate virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Mac/Linux

# Install dependencies
pip install -r requirements.txt

# Create your environment file
copy .env.example .env       # Windows
# cp .env.example .env       # Mac/Linux

Edit backend/.env and fill in your Azure credentials:

AZURE_OPENAI_KEY=your-key-here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-4o
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=text-embedding-3-small
AZURE_OPENAI_WHISPER_DEPLOYMENT=whisper
AZURE_OPENAI_API_VERSION=2024-12-01-preview
ALLOWED_ORIGINS=http://localhost:5173,http://localhost:3000

Start the backend:

uvicorn main:app --reload --port 8000

Backend runs at http://localhost:8000

3. Frontend setup

Open a new terminal:

cd frontend

# Install dependencies
npm install

# Create your environment file
copy .env.example .env       # Windows
# cp .env.example .env       # Mac/Linux

The default frontend/.env works out of the box for local dev (Vite proxies /api/* to the backend automatically).

Start the frontend:

npm run dev

Frontend runs at http://localhost:5173

How to Use

Open http://localhost:5173 in your browser
Paste any YouTube URL into the input bar
Click Analyze — the app downloads, transcribes, and indexes the video (takes 15–60 seconds depending on video length)
Once ready, the chat interface opens — ask anything about the video
Each answer shows a confidence level (high / medium / low) and source excerpts from the transcript

Limitations

Maximum video length: ~50 minutes (Azure Whisper 25 MB file limit)
Sessions are in-memory — restart clears all sessions
Transcript cache and FAISS index are stored locally and not persisted across server restarts on cloud free tiers

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.github/workflows		.github/workflows
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Tubechat — Chat with Any YouTube Video

What It Does

Tech Stack

Project Structure

Prerequisites

Local Setup

1. Clone the repository

2. Backend setup

3. Frontend setup

How to Use

Limitations

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Tubechat — Chat with Any YouTube Video

What It Does

Tech Stack

Project Structure

Prerequisites

Local Setup

1. Clone the repository

2. Backend setup

3. Frontend setup

How to Use

Limitations

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages