Today your app passes a concrete vendor model (e.g., openai/gpt-4o-mini) to archgw. That couples code to vendors and versions; swapping models becomes a codewide search/replace. Introduce stable, task-level aliases (e.g., arch.summarize.v1). Your code calls the alias; archgw maps it to an underlying model in config.
Benefits
- Decoupled application code. Instead of hardcoding gpt-4o-mini or claude-3-sonnet throughout your codebase, you reference a single alias. Swap the underlying model once in config, and your entire app updates—no redeploys, no hunt-and-replace.
- Safe promotions. Want to test a new model? Point the alias at a candidate behind the scenes. If your canary tests or metrics hold, flip the pointer to 100% traffic. If not, roll back instantly without touching application code.
- Central governance. Apply SLOs, guardrails, and policies at the alias level. For example, enforce JSON schema validation, timeout rules, or max token caps once, instead of scattering that logic across services.
- Observability by task. Dashboards and traces group naturally by alias (arch.summarize.v1) rather than shifting provider names. That makes it easier to track latency, cost, and error rates by intent, not by vendor string.
- Quota & cost control. Throttle or cap usage per alias. For instance, you can enforce a daily budget on summarization tasks, regardless of which model powers them underneath.
Revised config (with aliases)
```yaml
listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  - name: 4o-mini
    model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY
    default: true

  - name: gpt-o3
    model: openai/o3
    access_key: $OPENAI_API_KEY

model_aliases:
  # Task-level alias -> current best target
  arch.summarize.v1:
    target: 4o-mini
  arch.reason.v1:
    target: gpt-o3

  # (Optional) staged rollout example for your next version
  # arch.summarize.v2:
  #   rollout:
  #     - weight: 80
  #       target: 4o-mini
  #     - weight: 20
  #       target: gpt-o3
  #   fallback:
  #     - target: 4o-mini  # if upstream fails or times out
```
How to use (client-side)
Before: `model: "openai/gpt-4o-mini"`
After: `model: "arch.summarize.v1"`
Now you can change providers/versions by editing this config only: no code changes, no mass string edits.
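Concretely, the only change on the wire is the `model` field of the request body. A minimal sketch (the request-construction code and the `/v1/chat/completions` path are illustrative assumptions; the listener address and `message_format: openai` come from the config above):

```python
# Sketch of the OpenAI-format request body an app would POST through archgw's
# egress listener. The "model" field names the task-level alias instead of a
# concrete vendor model; archgw resolves it via the model_aliases config.
import json

payload = {
    "model": "arch.summarize.v1",  # alias, not "openai/gpt-4o-mini"
    "messages": [
        {"role": "user", "content": "Summarize this release note in two sentences."}
    ],
}

# Illustrative target URL, assuming the listener above and an
# OpenAI-compatible path: http://localhost:12000/v1/chat/completions
print(json.dumps(payload, indent=2))
```

Promoting `arch.summarize.v1` to a different model later leaves this payload untouched; only the `model_aliases` mapping changes.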