LangGraph Agent with Ministral 8B

This project implements a LangGraph search agent powered by the Ministral 8B language model.

Overview

  • AI Agent: Utilizes LangGraph and Ministral 8B to build an AI agent that can search the web.
  • API: Provides a REST API for easy integration.
  • Flexible Invocation: Supports both synchronous and asynchronous (queue-based) interactions.
  • Deployment Options: Run locally or deploy to BentoCloud for scalability.

Getting Started

This project is a reference implementation designed to be hackable. Download the source code and use it as a playground to build your own agent APIs:

git clone https://github.com/bentoml/BentoLangGraph.git
cd BentoLangGraph/langgraph-mistral

Run locally

Install dependencies

pip install -r requirements.txt

Set your Hugging Face API token, which is used to download the model:

export HF_TOKEN=<your-api-key>

Spin up the REST API server:

bentoml serve .

Invoke the endpoint

Invoke with the Python API client auto-generated by BentoML:

import bentoml
client = bentoml.SyncHTTPClient("http://localhost:3000")
response = client.invoke("What is the weather in San Francisco today?")
print(response)

Invoke with cURL:

curl -X POST http://localhost:3000/invoke \
   -H 'Content-Type: application/json' \
   -d '{"query": "what is the weather in San Francisco today?"}'

Example Output:

 The weather in San Francisco today is mostly cloudy with a low around 57 degrees and a high of 69 degrees. There is a chance of rain later in the day.

Invoke via Async Queue

Submit a task to the queue:

$ curl -X POST http://localhost:3000/invoke/submit \
   -H 'Content-Type: application/json' \
   -d '{"query": "what is the weather in San Francisco today?"}'

{"task_id":"b1fe7960470740ac9be58dcf740ee587","status":"in_progress"}
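In a script, the response from /invoke/submit can be parsed to capture the task ID that the status and result calls below expect. A minimal sketch using the sample payload above (any HTTP client works; only the JSON parsing is shown here):

```python
import json

# Sample response body from POST /invoke/submit (copied from above).
submit_response = '{"task_id":"b1fe7960470740ac9be58dcf740ee587","status":"in_progress"}'

# Parse out the task ID; later calls pass it as the task_id query parameter.
task = json.loads(submit_response)
task_id = task["task_id"]
status_url = f"http://localhost:3000/invoke/status?task_id={task_id}"
print(status_url)
```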

Check the status of the task:

$ curl -s http://localhost:3000/invoke/status?task_id=$TASK_ID

{"task_id":"40451e21a6834c279d78433c5e1a4083","status":"success",
"created_at":"2024-09-23T05:09:36","executed_at":"2024-09-23T05:09:36"}

Get the result of the task:

$ curl -s http://localhost:3000/invoke/get?task_id=$TASK_ID

{"task_id":"40451e21a6834c279d78433c5e1a4083","status":"success",
"created_at":"2024-09-23T05:09:36","executed_at":"2024-09-23T05:09:36"}
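A client typically polls the status endpoint until the task leaves "in_progress" before fetching the result. The loop below is a sketch of that pattern; `fetch_status` is a hypothetical stand-in for an HTTP GET against /invoke/status, stubbed here with canned responses so the sketch is self-contained:

```python
import time

def poll_until_done(fetch_status, task_id, interval=1.0, max_attempts=30):
    """Call fetch_status(task_id) until the task is no longer in progress."""
    for _ in range(max_attempts):
        status = fetch_status(task_id)
        if status["status"] != "in_progress":
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish in time")

# Stand-in for real HTTP calls: first poll is still running, second succeeded.
_responses = iter([
    {"task_id": "t1", "status": "in_progress"},
    {"task_id": "t1", "status": "success"},
])
result = poll_until_done(lambda tid: next(_responses), "t1", interval=0.0)
print(result["status"])
```

Once the status is "success", a final call to /invoke/get retrieves the result.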

Development

Start a development server that auto-reloads when code changes:

bentoml serve . --reload

Inspect all event streams:

curl -X POST http://localhost:3000/stream \
   -H 'Content-Type: application/json' \
   -d '{"query": "what is the weather in San Francisco today?"}'

{'event': 'on_chain_start', 'data': ...}
{'event': 'on_chain_end', 'data': ...}
{'event': 'on_tool_start', 'data': ...}
{'event': 'on_tool_end', 'data': ...}
...
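Each line of the stream is an event dict with an `event` key (the names follow LangGraph's event-stream convention), so a client can filter for just the stages it cares about. A small sketch over events shaped like the output above, with the data payloads elided:

```python
# Events shaped like the /stream output above (data payloads elided).
events = [
    {"event": "on_chain_start", "data": {}},
    {"event": "on_tool_start", "data": {}},
    {"event": "on_tool_end", "data": {}},
    {"event": "on_chain_end", "data": {}},
]

# Keep only tool-related events, e.g. to log which searches the agent ran.
tool_events = [e for e in events if e["event"].startswith("on_tool")]
print([e["event"] for e in tool_events])
```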

Deploy to BentoCloud

Log in to BentoCloud:

pip install bentoml
bentoml cloud login

Create secret:

bentoml secret create huggingface HF_TOKEN=$HF_TOKEN

Deploy:

bentoml deploy . --name search-agent --secret huggingface

Invoke the endpoint:

DEPLOYED_ENDPOINT=$(bentoml deployment get search-agent -o json | jq -r ".endpoint_urls[0]")

python client.py --query "What's the weather in San Francisco today?" --url $DEPLOYED_ENDPOINT