Building a RAG Chatbot with LangChain, ChromaDB, Streamlit & OpenAI

MLOps Project Series - Part 2

Powered by OpenAI Embeddings + GPT, using my rag-chatbot-langchain repo

Large Language Models are powerful, but out of the box they don’t know anything about your PDFs, playbooks, or eBooks. That’s where Retrieval-Augmented Generation (RAG) comes in.

In this project, I built a document-aware chatbot that can answer questions based on any PDFs you drop into a folder — using:

  • LangChain for orchestration
  • ChromaDB as a vector store
  • OpenAI Embeddings + Chat Models for intelligence
  • Streamlit for the UI
  • Docker for easy deployment

The complete implementation lives here:

👉 https://github.com/rjshk013/mlops-project/tree/master/rag-chatbot-langchain

This article explains how it works and gives you a step-by-step guide to run it from my repo.

⚠️ Important: This Project Uses OpenAI Only (Not HuggingFace)

The current codebase is built specifically for OpenAI, not HuggingFace.

Concretely:

  • Embeddings are generated via:

    from langchain_openai import OpenAIEmbeddings
    embeddings = OpenAIEmbeddings()

  • The LLM answering your questions uses:

    from langchain_openai import ChatOpenAI
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)

👉 This means:

  • You must have a valid OpenAI account
  • You must generate an OpenAI API key
  • You must set OPENAI_API_KEY in a .env file (or environment variable)
  • OpenAI will charge you per API usage (embeddings + chat)

There is no HuggingFace / local LLM integration in this version. If you remove the API key, the app will fail with authentication or quota errors.

🏗️ High-Level Architecture

Here’s what happens under the hood:

  1. You place PDFs into the data/ folder
  2. The app:
  • Loads the PDFs
  • Splits them into overlapping text chunks
  • Calls the OpenAI Embeddings API to embed each chunk
  • Stores embeddings + metadata in ChromaDB (local folder chroma_db/)
  3. When you ask a question:
  • ChromaDB retrieves the most relevant chunks
  • LangChain builds a RAG prompt using those chunks as context
  • ChatOpenAI (a GPT model) generates the final answer
  • Streamlit displays the response in a chat UI

It’s a classic RAG loop: Retrieve → Augment → Generate.
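
To make the loop concrete, here is a minimal sketch of the retrieve → augment → generate step using the same stack. The variable names, prompt wording, and k=4 are illustrative choices, not copied from chatbot.py:

# A minimal sketch of the retrieve → augment → generate step with this stack.
# Names, prompt wording, and k=4 are illustrative, not copied from chatbot.py.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate

embeddings = OpenAIEmbeddings()
vectorstore = Chroma(persist_directory="chroma_db", embedding_function=embeddings)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n\n{context}\n\nQuestion: {question}"
)

def answer(question: str) -> str:
    docs = vectorstore.similarity_search(question, k=4)   # Retrieve
    context = "\n\n".join(d.page_content for d in docs)   # Augment
    messages = prompt.format_messages(context=context, question=question)
    return llm.invoke(messages).content                   # Generate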

🧱 Project Structure (From the Repo)

Inside mlops-project/rag-chatbot-langchain:

rag-chatbot-langchain/
├── chatbot.py           # Streamlit app (RAG chatbot)
├── ingest_database.py   # Optional batch ingestion script
├── requirements.txt     # Python dependencies
├── Dockerfile           # Image definition
├── docker-compose.yml   # Orchestration
├── data/                # Put your PDFs here
├── chroma_db/           # Chroma vector DB (auto-created)
├── .env                 # OpenAI API key (you create this)
└── README.md

🔑 OpenAI Requirement: Why We Need the API Key

Two core parts of the pipeline depend on OpenAI:

  1. Embeddings during ingestion
  • OpenAIEmbeddings() calls the embeddings API for each chunk
  2. LLM during question answering
  • ChatOpenAI() calls the chat/completions API to generate the response

That’s why the .env must include:

OPENAI_API_KEY=sk-...

And it’s why OpenAI charges for usage:
Every embedding and chat request consumes compute on their infrastructure, so you’re billed per token.
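
For context, here is a sketch of how the key usually reaches the OpenAI clients, assuming the common python-dotenv pattern (the repo may load it slightly differently):

# A sketch of loading the key, assuming the common python-dotenv pattern;
# the repo may do this slightly differently.
import os
from dotenv import load_dotenv

load_dotenv()  # reads OPENAI_API_KEY from .env into the process environment
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("Set OPENAI_API_KEY in .env or as an environment variable")
# OpenAIEmbeddings() and ChatOpenAI() pick the key up from the environment automatically.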

🚀 Step-by-Step: How to Run This Project from My Repo

Here’s a clean, ordered guide to run this from scratch.

✅ Prerequisites

  1. OpenAI Account with billing enabled
  2. OpenAI API Key from:
    https://platform.openai.com/api-keys
  3. Git installed
  4. EITHER:
  • Docker + Docker Compose (recommended), or
  • Python 3.10+ with pip

1️⃣ Clone the Repository

git clone https://github.com/rjshk013/mlops-project.git
cd mlops-project/rag-chatbot-langchain

2️⃣ Create .env with Your OpenAI API Key

In the rag-chatbot-langchain directory, create a file named .env:

nano .env

Add:

OPENAI_API_KEY=sk-your-key-here

🔐 Keep this file private. Don’t commit it to GitHub.

3️⃣ Add Your PDF Documents

Put one or more PDFs into the data/ folder:

mkdir -p data
cp /path/to/your-documents/*.pdf data/

Examples:

  • Kubernetes eBooks
  • Internal design docs
  • Policy PDFs
  • Training material

These are the knowledge base for your chatbot.

4️⃣ Run the App (Option A: Docker Compose) ✅ Recommended

Make sure Docker and Docker Compose are installed, then run:

docker-compose up --build

What this does:

  • Builds the container image using Dockerfile
  • Installs dependencies from requirements.txt
  • Mounts ./data and ./chroma_db into /app inside the container
  • Exposes port 8501 for the Streamlit app
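
For reference, an illustrative docker-compose.yml matching the behaviour described above; the repo’s actual file may differ in service name and details:

# Illustrative compose file; not copied verbatim from the repo.
services:
  rag-chatbot:
    build: .
    env_file:
      - .env
    ports:
      - "8501:8501"
    volumes:
      - ./data:/app/data
      - ./chroma_db:/app/chroma_db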

Once it’s up, open:

👉 http://localhost:8501

5️⃣ Ingest Documents from the UI

Inside the Streamlit page:

  1. Go to the left sidebar
  2. Click 📥 Ingest Documents
  3. The app will:
  • Load all PDFs from data/
  • Split them into text chunks
  • Call OpenAIEmbeddings to generate embeddings
  • Persist the resulting vector store in chroma_db/

If successful, you’ll see:

✅ Ingested X document chunks!

6️⃣ Ask Questions in the Chat

Now at the bottom of the page:

  • Type a question like:

“What is Kubernetes and why is it used?”
“According to the eBook, what are the main components of the Kubernetes control plane?”

The chat flow:

  • Uses st.chat_input to capture your prompt
  • Retrieves relevant chunks from Chroma
  • Builds a prompt via ChatPromptTemplate
  • Sends it to ChatOpenAI with the context
  • Streams the answer back into the UI

You should see document-grounded answers, not generic responses.
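
Pieced together, that flow looks roughly like the sketch below. Here answer() is a hypothetical helper standing in for the retrieval-and-generation logic sketched earlier; the exact widgets and streaming details in chatbot.py may differ:

# Rough shape of a Streamlit chat turn; answer() is the hypothetical helper
# from the earlier sketch, and chatbot.py's exact streaming details may differ.
import streamlit as st

if question := st.chat_input("Ask a question about your PDFs"):
    st.chat_message("user").write(question)
    with st.chat_message("assistant"):
        st.write(answer(question))  # retrieve → augment → generate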

7️⃣ (Optional) Run Ingestion via Script Instead of UI

You also have ingest_database.py, which can perform ingestion from the CLI.

It uses:

from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFDirectoryLoader

Run it as:

python ingest_database.py

This will:

  • Load PDFs from data/
  • Generate embeddings (via OpenAI)
  • Populate the same chroma_db/ directory

After that, when you launch chatbot.py, it will attach to the existing vector store.
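
Below is a minimal sketch of what such an ingestion script can look like with those imports. The chunk sizes and the choice of RecursiveCharacterTextSplitter are illustrative, not copied from the repo:

# Minimal ingestion sketch; chunk sizes and splitter choice are illustrative.
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFDirectoryLoader("data/").load()                  # load every PDF in data/
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)                      # overlapping text chunks

Chroma.from_documents(                                       # embed via OpenAI and persist
    chunks,
    embedding=OpenAIEmbeddings(),
    persist_directory="chroma_db",
)
print(f"Ingested {len(chunks)} document chunks")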

8️⃣ Run Locally Without Docker (Option B)

If you prefer to run it directly on your machine:

1. Create & activate virtualenv (optional but recommended)

python -m venv venv
source venv/bin/activate # on Windows: venv\Scripts\activate

2. Install dependencies

pip install --upgrade pip
pip install -r requirements.txt

3. Ensure .env has your OPENAI_API_KEY

(As shown earlier)

4. Run the app

streamlit run chatbot.py

Then open:

👉 http://localhost:8501

🧪 How to Confirm It’s Really Using Your PDFs

A few good tests:

  1. Ask about a very specific phrase from your PDF
  • “What does the document say about ‘kubelet and node agent’?”
  2. Ask something not present in the document
  • “Explain Istio multi-mesh, as per this document.”
    The bot should say it’s not mentioned.
  3. Ask for a summary of a specific chapter or section
  • “Summarize the chapter that explains Kubernetes cluster components.”

If the answers match your docs and don’t hallucinate unrelated content, your RAG setup is working well.

🔮 Possible Next Steps (Future Work)

Right now, this project is:

  • ✅ OpenAI only
  • ✅ RAG-based
  • ✅ Streamlit UI + Dockerized

You could extend it by:

  • Adding source citations (showing which PDF pages were used)
  • Adding upload-from-UI instead of manual data/ copy
  • Swapping OpenAI models (e.g., GPT-4o)
  • Adding rate limiting & logging
  • Creating a FastAPI backend and separating the UI

🙌 Conclusion

You’ve built a full Retrieval-Augmented Generation chatbot from scratch — including ingestion, embeddings, search, and a fully interactive UI.

This project is perfect for:

  • ML Engineers
  • MLOps portfolios
  • DevOps engineers learning GenAI
  • Interview preparation
  • Production RAG deployments
