
Support Model Aliases #557

@salmanap

Description

Today your app passes a concrete vendor model name (e.g., openai/gpt-4o-mini) to archgw. That couples application code to specific vendors and versions, so swapping models becomes a code-wide search-and-replace. Introduce stable, task-level aliases (e.g., arch.summarize.v1): your code calls the alias, and archgw maps it to an underlying model in config.

Benefits

  • Decoupled application code. Instead of hardcoding gpt-4o-mini or claude-3-sonnet throughout your codebase, you reference a single alias. Swap the underlying model once in config, and your entire app updates—no redeploys, no hunt-and-replace.
  • Safe promotions. Want to test a new model? Point the alias at a candidate behind the scenes. If your canary tests or metrics hold, flip the pointer to 100% traffic. If not, roll back instantly without touching application code.
  • Central governance. Apply SLOs, guardrails, and policies at the alias level. For example, enforce JSON schema validation, timeout rules, or max token caps once, instead of scattering that logic across services.
  • Observability by task. Dashboards and traces group naturally by alias (arch.summarize.v1) rather than shifting provider names. That makes it easier to track latency, cost, and error rates by intent, not by vendor string.
  • Quota & cost control. Throttle or cap usage per alias. For instance, you can enforce a daily budget on summarization tasks, regardless of which model powers them underneath.
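To make the governance and quota bullets concrete, here is one way alias-level rules could look in config. Note this is a hypothetical extension: the policies and budget keys below are illustrative only, not a documented archgw schema — they just show where such rules would attach.

```yaml
model_aliases:
  arch.summarize.v1:
    target: 4o-mini
    # Hypothetical alias-level policy block (illustrative, not a real schema):
    policies:
      timeout: 10s        # enforced once here, not per-service
      max_tokens: 512
    budget:
      daily_usd: 25       # cap summarization spend regardless of backing model
```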

Revised config (with aliases)


listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  - name: 4o-mini
    model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY
    default: true

  - name: gpt-o3
    model: openai/o3
    access_key: $OPENAI_API_KEY

model_aliases:
  # Task-level alias -> current best target
  arch.summarize.v1:
    target: 4o-mini

  arch.reason.v1:
    target: gpt-o3

  # (Optional) staged rollout example for your next version
  # arch.summarize.v2:
  #   rollout:
  #     - weight: 80
  #       target: 4o-mini
  #     - weight: 20
  #       target: gpt-o3
  #   fallback:
  #     - target: 4o-mini     # if upstream fails or times out
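The commented rollout block describes weighted traffic splitting with a fallback target. A minimal Python sketch of that selection logic, assuming weights sum to 100 (function and field names are hypothetical, not archgw internals):

```python
import random

# Mirrors the commented rollout example: 80% of traffic to 4o-mini,
# 20% to gpt-o3, falling back to 4o-mini if the chosen upstream fails.
ROLLOUT = [
    {"weight": 80, "target": "4o-mini"},
    {"weight": 20, "target": "gpt-o3"},
]
FALLBACK = "4o-mini"

def pick_target(rollout, rng=random):
    """Pick an upstream target in proportion to its rollout weight."""
    roll = rng.uniform(0, sum(e["weight"] for e in rollout))
    upto = 0
    for entry in rollout:
        upto += entry["weight"]
        if roll <= upto:
            return entry["target"]
    return rollout[-1]["target"]  # guard against float edge cases

def resolve(rollout, upstream_ok, fallback=FALLBACK):
    """Route to a weighted target; use the fallback if that upstream is down."""
    target = pick_target(rollout)
    return target if upstream_ok(target) else fallback
```

Flipping an alias to 100% traffic is then just collapsing the rollout list to a single entry in config.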

How to use (client-side)

Before: model: "openai/gpt-4o-mini"
After: model: "arch.summarize.v1"

Now you can change providers/versions by editing this config only—no code changes, no mass string edits.
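Concretely, a client only ever mentions the alias. A sketch of such a client in Python against the egress listener above, assuming the standard OpenAI chat-completions wire format on port 12000 (the endpoint path and response shape follow the OpenAI convention implied by message_format: openai; adjust to your deployment):

```python
import json
import urllib.request

# archgw's egress listener from the config above (OpenAI-compatible format).
ARCHGW_URL = "http://localhost:12000/v1/chat/completions"

def summarize_request(text):
    """Build an OpenAI-format chat request addressed to the task alias,
    not to a concrete vendor model."""
    return {
        "model": "arch.summarize.v1",  # alias; archgw resolves the real model
        "messages": [
            {"role": "user", "content": f"Summarize: {text}"},
        ],
    }

if __name__ == "__main__":
    req = urllib.request.Request(
        ARCHGW_URL,
        data=json.dumps(summarize_request("archgw model aliases ...")).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```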
