Talk to your videos.
Your AI agents don't mind the length.
Open-source AI for long-form video Q&A. Self-host on your GPU, or plug into any provider.
MIT licensed · Self-hostable · Bring your own models
demo.mp4 · coming soon
agentic
Watch the agent think
A tool-using agent searches, identifies speakers, and pinpoints the exact moment. Every step streams live, so you see how the answer was built.
indexed
Understands before you ask
Each video gets a global summary, hierarchical time-window indexes, and a speaker inventory at ingestion. Scales to multi-hour recordings.
flexible
Bring your own providers
OpenAI-compatible VLM, OpenAI-compatible ASR with diarization, OpenAI- or Cohere-compatible embedder, plus an optional reranker. Or run everything fully local on your own GPU.
Self-host in three commands
Bring your own provider, or run everything fully local on your own GPU.
# clone, configure, run
git clone https://github.com/adnane-errazine/OpenVideoSearch.git
cp .env.example .env
docker compose up
git clone https://github.com/adnane-errazine/OpenVideoSearch.git
cp .env.example .env
docker compose up