An advanced AI-powered Streamlit application that transforms how you interact with PDF documents. Perfect for students, researchers, and professionals who need to extract insights, generate summaries and create study materials from their documents.

cu-sanjay/PDF-GPT

PDF‑GPT · Chat With Your PDFs Using AI

The live demo runs on Render's free plan. If the service shows as Inactive, it can take about 50 seconds to wake. Monthly quota limits may also make it unavailable at times; in that case, clone the repository, set your own Google Gemini API key in a .env file, and run the app locally. A short guide is included below.

Overview

PDF‑GPT is a Streamlit application that lets you ask questions about your PDF files and generate summaries, practice questions, MCQs, and study notes. It uses Google Gemini for language reasoning, LangChain for text processing, and FAISS for vector search.

Key features

Core

  • Multiple PDF upload and processing
  • Chat over documents with Gemini
  • Vector search using FAISS for relevant answers

Study tools

  • Document summarisation
  • Question generation with answers
  • MCQ generation
  • Structured study notes
  • Instant answers for quick lookups

Architecture (high level)

  • UI: Streamlit
  • PDF parsing: PyPDF2 (text extraction)
  • Indexing: LangChain text splitters + FAISS vector store
  • LLM: Google Gemini (via google-generativeai and langchain-google-genai)
  • Config: .env for secrets and environment settings
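To make the indexing and retrieval steps above concrete, here is a minimal standard-library sketch of the same flow: split text into overlapping chunks, turn each chunk into a vector, and return the chunks closest to the question. It is not the code in app.py. The bag-of-words `embed` is a toy stand-in for a Gemini embeddings model, the brute-force `top_chunks` stands in for a FAISS index, and every function name and default here is hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (stand-in for a LangChain splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy bag-of-words vector; assumption: the app would use a Gemini embeddings model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, k=2):
    """Brute-force nearest-neighbour search; FAISS serves this role in the real stack."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Page texts as they might come out of PDF text extraction.
pages = [
    "FAISS stores vectors for similarity search.",
    "Streamlit renders the user interface.",
]
chunks = [c for page in pages for c in split_text(page, chunk_size=80, overlap=20)]
context = top_chunks("Which part does similarity search?", chunks, k=1)
```

In a full pipeline, the retrieved `context` would be placed in the prompt sent to the LLM together with the user's question.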

Quick start

1) Clone and set up

git clone https://github.com/cu-sanjay/PDF-GPT.git
cd PDF-GPT

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

2) Configure environment

Create a .env file in the project root:

# Required
GOOGLE_API_KEY=your_google_ai_api_key

# Optional
GEMINI_MODEL=gemini-2.0-flash
EMBEDDINGS_MODEL=text-embedding-004

Get a free Google AI API key from Google AI Studio.
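For illustration, the three variables above could be read in Python roughly as follows. This is a sketch, not the repository's code: `load_settings` is a hypothetical helper, and it assumes the .env file has already been loaded into the process environment (for example by a dotenv loader). The fallback values are the defaults listed under Deployment.

```python
import os

# Defaults for the optional variables, as documented in this README.
DEFAULTS = {
    "GEMINI_MODEL": "gemini-2.0-flash",
    "EMBEDDINGS_MODEL": "text-embedding-004",
}


def load_settings(env=None):
    """Return the app settings from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        # Fail early with a message that points at the fix.
        raise RuntimeError(
            "GOOGLE_API_KEY not found: create a .env file or set it on the host"
        )
    settings = {"GOOGLE_API_KEY": api_key}
    for name, default in DEFAULTS.items():
        # An unset or blank optional variable falls back to its default.
        settings[name] = env.get(name, "").strip() or default
    return settings


settings = load_settings({"GOOGLE_API_KEY": "your_google_ai_api_key"})
```

Passing a plain dict, as in the last line, keeps the helper easy to test without touching the real environment.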

3) Run locally

# Option A
streamlit run app.py

# Option B (works even if streamlit is not on PATH)
python -m streamlit run app.py

Open http://localhost:8501 in your browser.

Deployment

One‑click on Render

Use the button below to deploy your own copy on Render. Set the GOOGLE_API_KEY environment variable in the Render dashboard.

Deploy to Render

Start command (Render):

python -m streamlit run app.py --server.address=0.0.0.0 --server.port=$PORT

Environment variables (Render):

  • GOOGLE_API_KEY (required)
  • GEMINI_MODEL (optional, default gemini-2.0-flash)
  • EMBEDDINGS_MODEL (optional, default text-embedding-004)

Streamlit Community Cloud

  1. Push the repository to GitHub.
  2. Create a new app on Streamlit Cloud and select this repo.
  3. Main file: app.py.
  4. Add secrets in Settings → Secrets:
GOOGLE_API_KEY = "your_google_ai_api_key"
GEMINI_MODEL = "gemini-2.0-flash"
EMBEDDINGS_MODEL = "text-embedding-004"

Do not commit the real .env. Commit .env.example only.

Commands

Local development:

python -m streamlit run app.py

Container or PaaS environments:

python -m streamlit run app.py --server.address=0.0.0.0 --server.port=${PORT:-8501}
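If you prefer a Python launcher over a shell command (for example on a host without a POSIX shell), the same command can be assembled like this. The helper name `streamlit_command` is hypothetical; the flags are the ones shown above, and the port falls back to 8501 when PORT is unset or empty, mirroring `${PORT:-8501}`.

```python
import sys


def streamlit_command(env, script="app.py"):
    """Build the argument list for launching the app on all interfaces."""
    port = env.get("PORT") or "8501"  # unset or empty PORT falls back to 8501
    return [
        sys.executable, "-m", "streamlit", "run", script,
        "--server.address=0.0.0.0",
        f"--server.port={port}",
    ]


# To actually start the server:
#   import os, subprocess
#   subprocess.run(streamlit_command(os.environ), check=True)
cmd = streamlit_command({"PORT": "10000"})
```

Using `sys.executable -m streamlit` runs Streamlit from the same interpreter and virtual environment as the launcher, which avoids PATH problems.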

Requirements

  • Python 3.8 or later
  • A valid Google AI API key

Troubleshooting

  • GOOGLE_API_KEY not found: create a .env file or set the variable on the host platform.
  • Streamlit not found: run with python -m streamlit run app.py and ensure dependencies are installed.
  • PDF cannot be read: the file may be image‑only or password protected.
  • No text extracted: OCR is not included. Use text‑based PDFs or add OCR before upload.
  • Large files: split large PDFs or process fewer files at a time.
  • Cold start on Render: wait for the free instance to wake.
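The "PDF cannot be read" and "No text extracted" cases can be told apart with a small check on the text extracted per page. The sketch below is an illustration under assumptions, not part of the app: `check_extracted_text` is a hypothetical helper, the input is a list of per-page strings (such as the results of PyPDF2's `page.extract_text()`), and the 20-character threshold is an arbitrary choice.

```python
def check_extracted_text(page_texts, min_chars=20):
    """Classify a document by how many pages yielded usable text.

    Returns (status, blank_pages), where blank_pages are 1-based page numbers.
    """
    blank = [
        number
        for number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]
    if not page_texts:
        return ("empty", [])      # no pages at all
    if len(blank) == len(page_texts):
        return ("no_text", blank)  # likely image-only: OCR needed before upload
    if blank:
        return ("partial", blank)  # some pages look scanned
    return ("ok", [])


status, blank_pages = check_extracted_text(
    ["Chapter 1. Introduction to vector search.", "", None]
)
```

A "no_text" result suggests a scanned PDF that needs OCR first, while "partial" identifies the specific pages whose content will be missing from answers.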

Security and privacy

  • PDFs are processed in memory during a session.
  • Do not commit confidential files or keys.
  • Review and follow your organisation's policies when handling documents.

Acknowledgements

The project uses Google Gemini, LangChain, FAISS, and Streamlit.
