GitHub - saral7293/COVID-19-CHATBOT-Retrieval-Augmented-Generation-: This repo is for a RAG based chatbot for COVID-19 built using OpenAI GPT 3.5 Turbo API, Langchain, chromadb vector database and flask.

About

This repository contains a chatbot application built with Streamlit, LangChain, and ChromaDB that answers questions about COVID-19 from the contents of PDF files. The chatbot uses a Retrieval-Augmented Generation (RAG) approach: relevant passages are retrieved from a vector database and passed to a language model, which generates the final answer.

Features

PDF Ingestion: The chatbot can ingest PDF files containing COVID-19 related information and create a vector database using ChromaDB.
Question Answering: Users can ask questions related to COVID-19, and the chatbot will retrieve relevant information from the vector database and generate a response using a language model.
Streamlit Interface: The chatbot has a user-friendly interface built with Streamlit, allowing users to interact with the application through a web-based interface.

Architecture

The chatbot follows a Retrieval-Augmented Generation (RAG) approach, which combines retrieval and generation so that answers are grounded in the ingested PDFs. The architecture consists of the following components:

Document Loader: Loads PDF files from the Books folder and splits them into smaller text chunks.
Vector Database: The text chunks are converted into vector embeddings using OpenAI's embeddings and stored in a ChromaDB vector database.
Similarity Search: When a user asks a question, relevant text chunks are retrieved from the vector database based on their similarity to the question.
Language Model: The retrieved text chunks are passed to a language model (GPT-3.5-turbo) along with the user's question. The model generates a final answer based on the provided context.
Streamlit Interface: The user interface is built using Streamlit, allowing users to interact with the chatbot through a web-based interface.
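
The retrieval steps above can be sketched in plain Python. This is a minimal illustration, not the code in this repository: the function names (split_into_chunks, embed, retrieve, build_prompt) are hypothetical, the chunk sizes are arbitrary, and a bag-of-words count vector stands in for OpenAI embeddings and ChromaDB so that the example runs without an API key. The final call to GPT-3.5-turbo is omitted; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (sizes are assumptions)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=3):
    """Return up to k chunks most similar to the question, best first."""
    query_vector = embed(question)
    scored = [(cosine_similarity(query_vector, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no terms with the question.
    return [chunk for score, chunk in scored[:k] if score > 0]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the language model."""
    context = "\n\n---\n\n".join(context_chunks)
    return (
        "Answer the question based only on the following context:\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    documents = [
        "Vaccines reduce the risk of severe COVID-19.",
        "Wash your hands with soap for 20 seconds.",
    ]
    question = "How long should I wash my hands?"
    print(build_prompt(question, retrieve(question, documents, k=1)))
```

In the real application the embedding and similarity search are delegated to OpenAI embeddings and ChromaDB, which compare dense vectors rather than word counts, so semantically related passages can match even when they share no exact words.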

Installation

Clone the repository:

git clone https://github.com/saral7293/COVID-19-CHATBOT-Retrieval-Augmented-Generation-

Navigate to the project directory:

cd COVID-19-CHATBOT-Retrieval-Augmented-Generation-

Install the required dependencies:

pip install -r requirements.txt

Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY=your_openai_api_key
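
A missing key usually only surfaces as an authentication error on the first request, so it can help to check for it at startup. The helper below is a hypothetical sketch and is not taken from app.py; the function name get_openai_api_key and the error message are assumptions. It relies only on the OPENAI_API_KEY variable named above.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or fail with a clear message.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before starting the app."
        )
    return key
```

Calling get_openai_api_key() once before building the LangChain components would make a configuration problem visible immediately rather than at the first user question.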

Usage

To run the chatbot locally using Streamlit, execute the following command:

streamlit run app.py

This will start the Streamlit application, and you can interact with the chatbot through the web interface.

Deployment

This chatbot has been deployed on an Amazon EC2 instance.

Contribution

Contributions are welcome! If you find any issues or want to add new features, please open an issue or submit a pull request.

Future Work

Fine-tune the chatbot and apply advanced RAG techniques.

Folders and files

.devcontainer
Books
chroma
README.md
__pycache__.zip
app.py
create_database.py
query_data.py
requirements.txt
