This project is a RESTful wrapper around LLM functionality.
If you have an Nvidia GPU, you need Nvidia's container toolkit:
# on Arch
yay -S nvidia-container-toolkit
sudo systemctl restart docker
Go to the root of the project.
# set up the environment manually
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# to run tests
pytest
# for production
gunicorn --workers 1 --timeout 300 --bind 0.0.0.0:8000 main:app
# when everything is installed, you can use a script
# to start the server.
./run
If new libraries are added, run
pip freeze > requirements.txt
The embeddings endpoint takes a POST request with:
{
"texts": [
"first text",
"second text",
...
]
}
The response contains an array of 384-dimensional embedding vectors, one per input text.
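As a minimal sketch, the service can be called from Python with only the standard library. The `/embeddings` path and host below are assumptions for illustration; check the actual route in main.py.

```python
import json
import urllib.request

# Hypothetical endpoint URL -- adjust host, port, and path to your deployment.
URL = "http://localhost:8000/embeddings"

def build_payload(texts):
    """Encode the request body in the shape the service expects."""
    return json.dumps({"texts": texts}).encode("utf-8")

def embed(texts, url=URL):
    """POST a list of texts and return the parsed list of embedding vectors."""
    req = urllib.request.Request(
        url,
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (requires a running server):
#   vectors = embed(["first text", "second text"])
#   each element of vectors should be a 384-dimensional list of floats
```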
From the root of the project
docker build -t gnames/llmutil:latest .
Then run:
docker run -d --gpus all -p 8000:8000 gnames/llmutil:latest
Do not use the --gpus all option if you do not have a GPU.
Tests are located in the tests directory. Install pytest and run:
pytest