Cache-Cool is a simple LLM (Large Language Model) caching proxy for saving your LLM calls. It acts as a caching layer for LLM API calls, such as OpenAI or Claude, to improve performance and reduce costs by avoiding redundant requests to the LLM providers. The caching is implemented using both MongoDB and JSON files.
- GitHub Repository: https://github.com/msnp1381/Cache-Cool
- Project Name: Cache-Cool
- Project Description: A simple LLM caching proxy for saving your LLM calls.
- πΎ Cache Responses: Caches responses from LLM API calls to reduce redundancy.
- βοΈ Dynamic Configuration: Allows dynamic configuration of LLM service and caching mechanisms via the
/configure
endpoint. - π Supports Multiple LLMs: Configurable to support different LLM services (e.g., OpenAI, Claude).
- π Uses MongoDB and JSON for Caching: Leverages both MongoDB and JSON files for caching API responses.
- POST /{schema_name}/chat/completions:
schema_name is defined in confing.yaml
Forwards chat completion requests to the configured LLM service or returns cached responses.
- GET /configure: Retrieves current configuration details.
- PUT /configure: Updates configuration settings dynamically.
Before you start, make sure you have:
- π³ Docker: Installed on your system. Download Docker here
- π MongoDB: A running MongoDB instance for caching (local or remote).
- π Redis: Optional A running Redis instance for caching.
-
Clone the repository:
First, download the project files:
git clone https://github.com/msnp1381/cache-cool.git cd cache-cool
-
Build the Docker Image:
Now, create a Docker image for the project:
docker build -t cache-cool .
-
Run the Docker Container:
Make sure MongoDB is running and accessible. Update the
config.yaml
with your MongoDB connection details, then run:docker run -p 8000:8000 --env-file .env cache-cool
Replace
.env
with your environment file containing necessary environment variables (like MongoDB URI). -
Access the Application:
Open your browser and go to http://localhost:8000 to start using Cache-Cool!
-
Clone the repository:
First, download the project files:
git clone https://github.com/msnp1381/cache-cool.git cd cache-cool
-
Install Python Dependencies:
If you prefer using
requirements.txt
, install the dependencies as follows:python3 -m venv venv source venv/bin/activate # On Windows use `venv\Scripts\activate` pip install -r requirements.txt
-
Run the Application with Uvicorn:
Start the FastAPI application using Uvicorn:
uvicorn app.main:app --reload
This will start the server at http://localhost:8000.
cache-cool uses a config.yaml
file for initial configuration. You can also update configurations dynamically using the /configure
endpoint.
llm_schemas:
openai:
endpoint: "https://api.openai.com/v1/chat/completions"
headers:
- "Content-Type: application/json"
- "Authorization: Bearer {api_key}"
temperature_threshold: 0.8
claude:
endpoint: "https://api.claude.ai/v1/chat/completions"
headers:
- "Content-Type: application/json"
- "Authorization: Bearer {api_key}"
temperature_threshold: 0.85
avalai:
endpoint: "https://api.avalapis.ir/v1/chat/completions"
headers:
- "Content-Type: application/json"
- "Authorization: {api_key}"
temperature_threshold: 0.85
mongodb:
uri: "mongodb://localhost:27017"
db_name: "llm_cache_db"
collection_name: "cache"
json_cache_file: "cache.json"
redis:
enabled: false
host: "localhost"
port: 6379
db: 0
current_llm_service: "openai"
use_json_cache: true
use_mongo_cache: true
Hereβs how to use the API once the service is running:
look at usage.ipynb
We welcome contributions! Hereβs how you can help:
- Fork the repository.
- Create a new branch (
git checkout -b feature-branch
). - Make your changes.
- Commit your changes (
git commit -am 'Add some feature'
). - Push to the branch (
git push origin feature-branch
). - Create a new Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you have any questions or issues, feel free to contact us at mohamadnematpoor@gmail.com.
Happy caching! π