This project implements a search API using FastAPI and PostgreSQL. It allows users to search for documents based on a query and ranks the search results using the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm.
- Search for documents based on a query
- Rank search results using TF-IDF scoring
- Retrieve document titles, URLs, and scores in search results
- Backend implemented in Python using FastAPI and psycopg2
- Frontend implemented in React.js
- Database storage and retrieval using PostgreSQL
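To illustrate the TF-IDF ranking listed above, here is a minimal in-memory sketch. It is not the project's actual implementation: the function names `tokenize` and `rank` are hypothetical, the documents live in a plain dict rather than in PostgreSQL, and the weighting (term frequency as count divided by document length, inverse document frequency as `log(N / df)`) is one common variant assumed for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def rank(query, documents):
    """Return (doc_id, score) pairs sorted by descending TF-IDF score.

    documents: dict mapping a document id to its text (hypothetical shape).
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    scores = {}
    for doc_id, tokens in tokenized.items():
        if not tokens:
            continue
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)           # term frequency
            idf = math.log(n_docs / doc_freq[term])   # inverse document frequency
            score += tf * idf
        if score > 0:
            scores[doc_id] = score

    # Highest-scoring documents first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With this variant, a term that appears in every document gets an IDF of zero and so contributes nothing to the score, which is why rare query terms dominate the ranking.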
- Clone the repository:
    git clone https://github.com/pavanmanishd/search-engine.git
- Stop any locally running PostgreSQL service before starting the Docker container:
    systemctl stop postgresql
- Run the Docker container:
    cd server
    docker compose up
  To run the container in the background, add the -d flag:
    docker compose up -d
- Note: After starting the container, allow some time for the crawler to finish. Crawling is currently limited to react.dev. You can monitor its progress in the server logs:
    docker compose logs --follow
- Run the frontend. From the repository root:
    cd client
    npm install
    npm run dev
- Try the search engine by visiting http://localhost:5173.
- Send a POST request to /search with a JSON payload containing the query:
    curl -X 'POST' \
      'http://localhost:8000/search' \
      -H 'Content-Type: application/json' \
      -d '{ "query": "your_search_query" }'
- Example response:
    [
      {
        "url": "https://example.com/document1",
        "title": "Document 1 Title",
        "score": 0.784
      },
      {
        "url": "https://example.com/document2",
        "title": "Document 2 Title",
        "score": 0.657
      }
    ]
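
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server is reachable at http://localhost:8000, as in the curl example. The helper names `build_search_request`, `search`, and `format_results` are hypothetical and not part of the project.

```python
import json
import urllib.request


def build_search_request(query, base_url="http://localhost:8000"):
    """Build a POST request for the /search endpoint with a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, base_url="http://localhost:8000", timeout=10):
    """Send the query and return the decoded JSON list of results."""
    request = build_search_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def format_results(results):
    """Render results as 'score  title  url' lines, highest score first."""
    ordered = sorted(results, key=lambda r: r["score"], reverse=True)
    return [f'{r["score"]:.3f}  {r["title"]}  {r["url"]}' for r in ordered]
```

Once the container is running, `format_results(search("your_search_query"))` returns one line per result. Building the request separately from sending it lets you inspect the payload without a live server.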