RAG Knowledge Base: Firecrawl → ChromaDB → Claude
Build a production-grade RAG system that scrapes any documentation site, embeds it into ChromaDB, and answers questions with Claude using retrieved context.
Tools Used
Purpose
Why this workflow exists
Workflow Steps
Use Firecrawl's API to crawl documentation sites and convert pages to clean markdown. Configure URL filters and depth to target specific sections.
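The URL filter and depth settings mentioned in this step come down to a scope check on each discovered link. The sketch below illustrates that check in plain Python and does not call Firecrawl. The function name in_scope, its parameters, and the choice to measure depth in path segments below the root URL are all assumptions made for illustration; consult Firecrawl's API reference for its actual option names and depth semantics.

```python
from urllib.parse import urlparse


def in_scope(url, root, include_prefixes, max_depth):
    """Return True if url is on the root's host, under an allowed
    path prefix, and at most max_depth path segments below root.

    Hypothetical helper: names and depth definition are assumptions.
    """
    target, base = urlparse(url), urlparse(root)

    # Stay on the same host as the root URL.
    if target.netloc != base.netloc:
        return False

    # Require the path to equal an allowed prefix or sit beneath it,
    # so "/docs/api" does not accidentally match "/docs/apiary".
    path = target.path.rstrip("/") or "/"
    allowed = any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for prefix in include_prefixes
    )
    if not allowed:
        return False

    # Depth = number of path segments below the root path.
    base_segments = [s for s in base.path.split("/") if s]
    segments = [s for s in path.split("/") if s]
    depth = len(segments) - len(base_segments)
    return 0 <= depth <= max_depth
```

With a root of https://docs.example.com/docs and a prefix of /docs/api, this check would keep the API reference pages and skip the blog, other hosts, and deeply nested pages.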
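Split the markdown into semantic chunks using LangChain's RecursiveCharacterTextSplitter (512 tokens per chunk, 50-token overlap). The splitter measures length in characters by default, so in the Python package use a token-based length function (for example via from_tiktoken_encoder) if the sizes are meant as tokens. Generate embeddings with OpenAI's text-embedding-3-small.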
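To make the chunk size and overlap concrete, here is a minimal sliding-window sketch. It is not LangChain's recursive splitting algorithm, which also tries to break on separators such as paragraphs and sentences; it only shows how a 512-unit window with a 50-unit overlap advances through a sequence. The function name chunk_tokens is hypothetical, and the input is assumed to be an already tokenized list.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into windows of chunk_size that share
    `overlap` tokens with the previous window.

    Simplified stand-in for a real splitter (assumption, not LangChain).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With the default settings each window advances by 462 tokens, so a 1,000-token page yields three chunks and consecutive chunks share their boundary 50 tokens.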
Create a ChromaDB collection and upsert all vectors with metadata: source URL, page title, chunk position, last crawled date.
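One way to prepare the upsert is to build parallel lists of IDs, documents, and metadata, with IDs derived deterministically from the source URL and chunk position so that a re-crawl overwrites existing entries instead of duplicating them. The sketch below is stdlib-only and does not call ChromaDB. The function name build_records and the metadata key names are assumptions; the date is stored as an ISO string because ChromaDB metadata values are expected to be simple scalars.

```python
import hashlib
from datetime import date


def build_records(source_url, page_title, chunks, crawled_on):
    """Return (ids, documents, metadatas) for one crawled page.

    Hypothetical helper: key names are illustrative, not prescribed.
    """
    ids, documents, metadatas = [], [], []
    for position, text in enumerate(chunks):
        # Same URL + position always yields the same ID, so an upsert
        # replaces the earlier version of this chunk.
        key = f"{source_url}#{position}"
        ids.append(hashlib.sha256(key.encode("utf-8")).hexdigest()[:16])
        documents.append(text)
        metadatas.append({
            "source_url": source_url,
            "page_title": page_title,
            "chunk_position": position,
            "last_crawled": crawled_on.isoformat(),
        })
    return ids, documents, metadatas
```

The three lists line up with the ids, documents, and metadatas arguments of a collection upsert, alongside the embeddings produced in the previous step.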
Create a LangChain retriever that searches ChromaDB for relevant chunks (top-k=5), re-ranks by relevance, and passes them as context to Claude.
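The retrieval step can be pictured as scoring every stored chunk against the query vector, keeping the best five, and numbering them inside the prompt. The sketch below does this with plain cosine similarity in memory; it stands in for the vector search that ChromaDB performs and leaves out the separate re-ranking pass. The function names, the (text, vector) input shape, and the prompt wording are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, items, k=5):
    """items is a list of (text, vector); return the k best (score, text) pairs."""
    scored = [(cosine(query_vec, vec), text) for text, vec in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, ranked):
    """Number the retrieved chunks so the answer can cite them as [1], [2], ..."""
    context = "\n\n".join(
        f"[{i}] {text}" for i, (_, text) in enumerate(ranked, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks in the prompt is a design choice that makes it easier to map the model's citations back to source URLs later.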
Build a Next.js chat UI using the Vercel AI SDK to stream Claude's responses. Add source citations, conversation history, and feedback buttons.
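The source citations shown in the chat UI can be derived from the metadata of the retrieved chunks. The sketch below shows the server-side part of that idea in Python: several chunks from the same page collapse into one numbered citation. It is independent of Next.js and the Vercel AI SDK, and it assumes hypothetical metadata keys source_url and page_title.

```python
def format_citations(metadatas):
    """Collapse chunk metadata into one numbered citation per source URL,
    in the order the sources first appear.

    Hypothetical helper: key names are assumptions.
    """
    seen = {}  # url -> display title, insertion-ordered
    for meta in metadatas:
        url = meta["source_url"]
        if url not in seen:
            # Fall back to the URL when the page has no title.
            seen[url] = meta.get("page_title") or url
    return [
        f"[{n}] {title} ({url})"
        for n, (url, title) in enumerate(seen.items(), start=1)
    ]
```

The resulting strings can be sent to the front end alongside the streamed answer and rendered as a source list under each response.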
Expected Results
What this workflow should unlock
What you get at the end
A working RAG system that scrapes a documentation site, embeds it into ChromaDB, and answers questions with Claude using retrieved context.
data pipeline
Operational upside
Instead of rethinking the process each time, you reuse the same sequence across planning, execution, and refinement with Firecrawl, ChromaDB, and the Anthropic Claude API.
repeatable execution
Team-facing outcome
Documentation sites are crawled through Firecrawl's API and converted to clean markdown, with URL filters and depth settings limiting the crawl to the sections the team needs.
less manual coordination
Next-level refinement
A Next.js chat UI built on the Vercel AI SDK streams Claude's responses and can be extended with source citations, conversation history, and feedback buttons.
easy to iterate
Common Questions
Quick answers before you start
What is the main purpose of RAG Knowledge Base: Firecrawl → ChromaDB → Claude?
To build a RAG system that scrapes a documentation site, embeds it into ChromaDB, and answers questions with Claude using retrieved context.
How many tools do I actually need to start?
You can usually start with the core set listed here. This workflow currently references 4 tools, but you do not need to adopt every tool on day one.
Is this workflow suitable for my experience level?
Yes, but treat the current setup as advanced. The workflow structure stays the same at every level; what changes is how much customization and orchestration you add.
How long does it take to put this into practice?
Most teams can stand up an initial version quickly because the workflow already breaks into 5 concrete steps. The refinement phase usually takes longer than the first draft.
