Deployed on the Render free plan. If the service shows Inactive, it may take about 50 seconds to wake. Because of monthly quota limits, the service may be unavailable at times. If that happens, clone the repository, set your own Google Gemini API key in a .env file, and run locally. A short guide is included below.
PDF‑GPT is a Streamlit application that lets you ask questions about your PDF files and generate summaries, practice questions, MCQs, and study notes. It uses Google Gemini for language reasoning, LangChain for text processing, and FAISS for vector search.
- Multiple PDF upload and processing
- Chat over documents with Gemini
- Vector search using FAISS for relevant answers
- Document summarisation
- Question generation with answers
- MCQ generation
- Structured study notes
- Instant answers for quick lookups
- UI: Streamlit
- PDF parsing: PyPDF2 (text extraction)
- Indexing: LangChain text splitters + FAISS vector store
- LLM: Google Gemini (via google-generativeai and langchain-google-genai)
- Config: .env for secrets and environment settings
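The stack above implies a retrieve-then-answer flow: extracted text is split into overlapping chunks, the chunks are indexed, and the most relevant ones are passed to the model with the question. The sketch below illustrates only that flow in plain Python. It is not the project's code: `split_text`, `bag_of_words`, `cosine`, and `top_chunks` are hypothetical helpers, and the word-count cosine score is a simple stand-in for the embedding similarity search that FAISS performs.

```python
import re
from collections import Counter
from math import sqrt


def split_text(text, chunk_size=200, overlap=50):
    """Split text into character chunks; neighbours share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def bag_of_words(text):
    """Lower-cased word counts; a toy replacement for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the chunks returned by `top_chunks` would be placed in the prompt sent to Gemini; a vector store replaces the word-count scoring with embedding vectors so that paraphrased questions still match.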
git clone https://github.com/cu-sanjay/PDF-GPT.git
cd PDF-GPT
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the project root:
# Required
GOOGLE_API_KEY=your_google_ai_api_key
# Optional
GEMINI_MODEL=gemini-2.0-flash
EMBEDDINGS_MODEL=text-embedding-004
Get a free Google AI API key from Google AI Studio.
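As a rough illustration of how these variables could be read at startup, the sketch below takes the required key and falls back to the documented defaults for the two optional settings. `load_settings` and `DEFAULTS` are hypothetical names, not taken from the project. The sketch reads from the process environment only; it assumes the .env file has already been loaded into the environment (for example with python-dotenv's `load_dotenv()`, which is an assumption about how the app does it).

```python
import os

# Defaults documented for the optional variables.
DEFAULTS = {
    "GEMINI_MODEL": "gemini-2.0-flash",
    "EMBEDDINGS_MODEL": "text-embedding-004",
}


def load_settings(env=None):
    """Return a settings dict, failing early if the API key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GOOGLE_API_KEY not found: create a .env file "
            "or set the variable on the host platform."
        )
    settings = {"GOOGLE_API_KEY": api_key}
    for name, default in DEFAULTS.items():
        # An unset or blank value falls back to the default.
        settings[name] = env.get(name, "").strip() or default
    return settings
```

Passing a plain dict instead of `os.environ` keeps the function easy to check without touching real credentials.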
# Option A
streamlit run app.py
# Option B (works even if streamlit is not on PATH)
python -m streamlit run app.py
Open http://localhost:8501 in your browser.
Use the button below to deploy your own copy on Render. Set the GOOGLE_API_KEY environment variable in the Render dashboard.
Start command (Render):
python -m streamlit run app.py --server.address=0.0.0.0 --server.port=$PORT
Environment variables (Render):
- GOOGLE_API_KEY (required)
- GEMINI_MODEL (optional, default gemini-2.0-flash)
- EMBEDDINGS_MODEL (optional, default text-embedding-004)
- Push the repository to GitHub.
- Create a new app on Streamlit Cloud and select this repo.
- Main file: app.py.
- Add secrets in Settings → Secrets:
GOOGLE_API_KEY = "your_google_ai_api_key"
GEMINI_MODEL = "gemini-2.0-flash"
EMBEDDINGS_MODEL = "text-embedding-004"
Do not commit the real .env. Commit .env.example only.
Local development:
python -m streamlit run app.py
Container or PaaS environments:
python -m streamlit run app.py --server.address=0.0.0.0 --server.port=${PORT:-8501}
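The `${PORT:-8501}` expansion in the container command means "use PORT if it is set and non-empty, otherwise 8501". For a launcher written in Python rather than shell, the same rule might look like the sketch below. `resolve_port` and `container_command` are hypothetical helpers introduced for illustration; only the Streamlit flags come from the command above.

```python
def resolve_port(env, default=8501):
    """Mirror the shell expansion ${PORT:-8501}: unset or empty falls back."""
    value = env.get("PORT", "")
    return int(value) if value else default


def container_command(env, app="app.py"):
    """Build the argument list for a container or PaaS start command."""
    return [
        "python", "-m", "streamlit", "run", app,
        "--server.address=0.0.0.0",  # listen on all interfaces inside the container
        f"--server.port={resolve_port(env)}",
    ]
```

The list can be handed to `subprocess.run(container_command(os.environ), check=True)` from a small wrapper script if a shell is not available on the host.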
- Python 3.8 or later
- A valid Google AI API key
- GOOGLE_API_KEY not found: create a .env file or set the variable on the host platform.
- Streamlit not found: run with python -m streamlit run app.py and ensure dependencies are installed.
- PDF cannot be read: the file may be image‑only or password protected.
- No text extracted: OCR is not included. Use text‑based PDFs or add OCR before upload.
- Large files: split large PDFs or process fewer files at a time.
- Cold start on Render: wait for the free instance to wake.
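Several of the PDF issues above can be checked before upload. The sketch below is one possible pre-flight check, assuming PyPDF2 2.x or later (where `PdfReader`, `is_encrypted`, `pages`, and `extract_text()` are available). `extract_page_texts` and `diagnose` are hypothetical helpers, and the 20-character threshold is an arbitrary assumption, not a value from the project.

```python
def extract_page_texts(path):
    """Return one string per page. Requires PyPDF2, imported lazily."""
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    if reader.is_encrypted:
        raise ValueError("PDF is password protected")
    # extract_text() can yield nothing for image-only pages.
    return [page.extract_text() or "" for page in reader.pages]


def diagnose(page_texts, min_chars=20):
    """Classify extraction results; min_chars is an illustrative threshold."""
    if not page_texts:
        return "empty: the PDF has no pages"
    total = sum(len(text.strip()) for text in page_texts)
    if total < min_chars:
        return "no text: likely image-only, OCR needed"
    return "ok"
```

Running `diagnose(extract_page_texts("notes.pdf"))` on a local file (the filename is a placeholder) gives a quick hint about whether the document needs OCR before it is uploaded.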
- PDFs are processed in memory during a session.
- Do not commit confidential files or keys.
- Review and follow your organisation policies when handling documents.
The project uses Google Gemini, LangChain, FAISS, and Streamlit.