LocalLab API Documentation

REST API Endpoints

Text Generation

POST /generate

Generate text using the loaded model.

Request Body:

{
  "prompt": "string",
  "model_id": "string | null",
  "stream": "boolean",
  "max_length": "integer | null",
  "temperature": "float",
  "top_p": "float"
}

Response:

{
  "response": "string",
  "usage": {
    "prompt_tokens": "integer",
    "completion_tokens": "integer",
    "total_tokens": "integer"
  }
}

Error Responses:

  • 400 Bad Request: Invalid parameters
  • 413 Payload Too Large: Input too long
  • 429 Too Many Requests: Rate limit exceeded
  • 500 Internal Server Error: Model error
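The request and response shapes above can be exercised with a small client-side sketch. It uses only the Python standard library and builds the request without sending it. The base URL `http://localhost:8000`, the helper names, and the default `temperature` and `top_p` values are assumptions made for illustration; they are not defaults documented by LocalLab.

```python
import json
import urllib.request

# Assumption: replace with the address your LocalLab server listens on.
BASE_URL = "http://localhost:8000"


def build_generate_request(prompt, model_id=None, stream=False,
                           max_length=None, temperature=0.7, top_p=0.9,
                           base_url=BASE_URL):
    """Build (but do not send) a POST /generate request.

    The temperature/top_p defaults are illustrative, not server defaults.
    """
    payload = {
        "prompt": prompt,
        "model_id": model_id,      # None is serialized as JSON null
        "stream": stream,
        "max_length": max_length,  # None is serialized as JSON null
        "temperature": temperature,
        "top_p": top_p,
    }
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_generate_response(body):
    """Return (generated_text, usage_dict) from a /generate response body."""
    data = json.loads(body)
    return data["response"], data["usage"]
```

To send the request, pass it to `urllib.request.urlopen(req)` and hand the bytes from `resp.read()` to `parse_generate_response`. This sketch covers the non-streaming case only (`stream` set to false).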

Chat Completion

POST /chat

Generate a chat completion from a list of messages. The request and response shapes are similar to those of OpenAI's chat API.

Request Body:

{
  "messages": [
    {
      "role": "string",
      "content": "string"
    }
  ],
  "model_id": "string | null",
  "stream": "boolean",
  "max_length": "integer | null",
  "temperature": "float",
  "top_p": "float"
}

Response:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "string"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": "integer",
    "completion_tokens": "integer",
    "total_tokens": "integer"
  }
}
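As a rough illustration of the chat shapes above, the following sketch assembles a request body and pulls the assistant reply out of a response. The helper names are hypothetical, the `"user"` role is assumed by analogy with OpenAI's API (the document itself only shows `"assistant"`), and the sampling defaults are illustrative.

```python
import json


def build_chat_payload(messages, model_id=None, stream=False,
                       max_length=None, temperature=0.7, top_p=0.9):
    """Return a dict matching the documented POST /chat request body."""
    for message in messages:
        # Every message must carry both documented fields.
        if not message.keys() >= {"role", "content"}:
            raise ValueError("each message needs 'role' and 'content'")
    return {
        "messages": list(messages),
        "model_id": model_id,
        "stream": stream,
        "max_length": max_length,
        "temperature": temperature,  # illustrative default
        "top_p": top_p,              # illustrative default
    }


def first_reply(body):
    """Return (content, finish_reason) of the first choice in a /chat response."""
    data = json.loads(body)
    choice = data["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]
```

A call such as `build_chat_payload([{"role": "user", "content": "Hi"}])` yields a dict that can be serialized with `json.dumps` and posted to `/chat`.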

Model Management

POST /models/load

Load a specific model.

Request Body:

{
  "model_id": "string"
}

GET /models/current

Get information about the currently loaded model.

GET /models/available

List all available models in the registry.
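The three model endpoints above can be addressed with request builders like the ones below. This is a sketch under assumptions: the base URL and the helper names are hypothetical, and because the response bodies of these endpoints are not documented here, no parsing is attempted.

```python
import json
import urllib.request

# Assumption: replace with the address your LocalLab server listens on.
BASE_URL = "http://localhost:8000"


def load_model_request(model_id, base_url=BASE_URL):
    """Build (but do not send) a POST /models/load request."""
    return urllib.request.Request(
        url=f"{base_url}/models/load",
        data=json.dumps({"model_id": model_id}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_request(path, base_url=BASE_URL):
    """Build a body-less GET request, e.g. for /models/current."""
    return urllib.request.Request(url=f"{base_url}{path}", method="GET")
```

For example, `get_request("/models/available")` targets the registry listing, and `load_model_request("my-model")` targets the loader, where `"my-model"` is a placeholder ID.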

System Information

GET /system/info

Get detailed system information.

GET /health

Check the health status of the server.

Error Handling

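All endpoints return standard HTTP status codes. The common ones are listed below; POST /generate can also return 413 and 429, as noted in its section.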

  • 200: Success
  • 400: Bad Request
  • 404: Not Found
  • 500: Internal Server Error

Error responses include a detail message:

{
  "detail": "Error message describing what went wrong"
}
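One way a client might surface the `detail` message is to turn non-2xx responses into an exception. The `LocalLabError` class and `raise_for_error` function below are hypothetical names for this sketch, not part of LocalLab.

```python
import json


class LocalLabError(Exception):
    """Hypothetical exception carrying the HTTP status and detail message."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def raise_for_error(status, body):
    """Raise LocalLabError for non-2xx statuses; return None otherwise.

    `body` is the response body as a string.
    """
    if 200 <= status < 300:
        return None
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object; fall back to the raw text.
        detail = body
    raise LocalLabError(status, detail)
```

Callers can then branch on `err.status`, for example backing off on 429 and reporting `err.detail` to the user on 400.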

Rate Limiting

  • 60 requests per minute
  • Burst size of 10 requests
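A client can stay under these limits by throttling itself. The sketch below uses a token bucket, which is one common way to read "60 per minute with a burst of 10": the bucket holds at most 10 tokens and refills at one token per second. How the server actually enforces its limit is not stated here, so treat this as a client-side approximation; the `TokenBucket` class is a hypothetical name.

```python
import time


class TokenBucket:
    """Client-side throttle: `burst` tokens max, refilled at rate_per_minute."""

    def __init__(self, rate_per_minute=60, burst=10, clock=time.monotonic):
        self.rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = burst
        self.tokens = float(burst)          # start with a full bucket
        self.clock = clock                  # injectable for testing
        self.last = clock()

    def try_acquire(self):
        """Consume one token if available; return True when a request may go."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Call `try_acquire()` before each request and sleep briefly when it returns `False`. The `clock` parameter exists so the behavior can be checked without real waiting.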

Made with ❤️ by Utkarsh Tiwari
GitHub: UtkarshTheDev | Twitter: @UtkarshTheDev | LinkedIn: utkarshthedev