update-llm-model-list
Audit and update the supported LLM model list in assets.py against litellm's registry (models.litellm.ai). Use when adding new models, pruning outdated ones, or verifying the list is correct.
Best use case
update-llm-model-list is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using update-llm-model-list should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/update-llm-model-list/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# Update LLM Model List
## Overview
The canonical model list lives in `sdk/agenta/sdk/assets.py` → `supported_llm_models`.
It drives the model dropdown in the playground, cost metadata, and the `model_to_provider_mapping`.
The authoritative external source is **`litellm.model_cost`** (2 600+ entries), which mirrors
<https://models.litellm.ai/>.
A pytest guard lives at:
`sdk/oss/tests/pytest/unit/test_supported_llm_models.py`
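The overview says the list drives `model_to_provider_mapping`. One plausible way to derive such a mapping is to invert the provider dict. The sketch below assumes that shape and is not the code in `assets.py`; `build_model_to_provider` and the `*-example-*` model names are hypothetical.

```python
# Hypothetical excerpt; the real dict lives in sdk/agenta/sdk/assets.py.
supported_llm_models = {
    "anthropic": ["anthropic/claude-example-large"],  # placeholder name
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "gemini": ["gemini/gemini-example-pro"],  # placeholder name
}


def build_model_to_provider(models_by_provider):
    """Invert {provider: [models]} into {model: provider}.

    Raises ValueError if a model appears more than once, since a
    model must resolve to exactly one provider.
    """
    mapping = {}
    for provider, models in models_by_provider.items():
        for model in models:
            if model in mapping:
                raise ValueError(
                    f"{model!r} is listed under {mapping[model]!r} and {provider!r}"
                )
            mapping[model] = provider
    return mapping


model_to_provider = build_model_to_provider(supported_llm_models)
```

With this shape, a model added to the wrong provider list resolves to the wrong provider key, so the list and the mapping cannot be reviewed separately.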
---
## Key rules
1. **Every model must exist in `litellm.model_cost`** (direct key, or with provider prefix stripped).
   - `anthropic/claude-*` → litellm stores as `claude-*` (prefix is intentional for routing, stripped for cost lookup)
   - `cohere/command-*` → litellm stores as `command-*`
   - All other providers keep their full prefix (e.g. `gemini/`, `groq/`, `together_ai/`)
2. **Provider key** (`"anthropic"`, `"gemini"`, …) must match the Secrets API enum in
   `api/oss/src/core/secrets/enums.py` (`StandardProviderKind`).
3. **No duplicates** within a provider list.
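Rules 1 and 3 can be sketched in a few lines of Python. This is an illustration only: `cost_keys` stands in for `set(litellm.model_cost)`, and the helper names and `*-example` model names are hypothetical.

```python
def cost_key_candidates(model):
    """Keys to try in the cost table: the full name, then the name
    with everything up to the first "/" removed."""
    candidates = [model]
    if "/" in model:
        candidates.append(model.split("/", 1)[1])
    return candidates


def in_cost_table(model, cost_keys):
    """Rule 1: the model is valid if any candidate key is known."""
    return any(key in cost_keys for key in cost_key_candidates(model))


def duplicates(models):
    """Rule 3: return each model that appears more than once, in order."""
    seen, dupes = set(), []
    for model in models:
        if model in seen and model not in dupes:
            dupes.append(model)
        seen.add(model)
    return dupes


# Stand-in for set(litellm.model_cost); all names are placeholders.
cost_keys = {"claude-example", "command-example", "gemini/gemini-example"}

print(in_cost_table("anthropic/claude-example", cost_keys))
print(in_cost_table("gemini/gemini-example", cost_keys))
print(duplicates(["gpt-4o", "gpt-4o-mini", "gpt-4o"]))
```

The prefixed Anthropic name matches through its stripped form, while the Gemini name matches directly because its cost key keeps the prefix.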
---
## Step 1 — Check which current models are outdated / wrong
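Run this with `uvx`, which fetches litellm on the fly. The script imports `supported_llm_models` from `agenta.sdk.assets`, so either run it where the agenta SDK is importable or paste the dict into the script. Because `2>/dev/null` discards stderr, drop it if the script prints nothing: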
```bash
cat > /tmp/check_agenta_models.py << 'SCRIPT'
# /// script
# requires-python = ">=3.11"
# dependencies = ["litellm"]
# ///
import litellm, sys

# Needs the agenta SDK importable; otherwise paste the supported_llm_models dict here.
from agenta.sdk.assets import supported_llm_models

# Every key litellm has cost metadata for
mc = set(litellm.model_cost.keys())


def exists(m):
    """True if m is a cost key as-is or once its provider prefix is stripped."""
    if m in mc:
        return True
    if "/" in m and m.split("/", 1)[1] in mc:
        return True
    return False


fails = []
for provider, models in supported_llm_models.items():
    for model in models:
        if not exists(model):
            fails.append((provider, model))

total = sum(len(v) for v in supported_llm_models.values())
print(f"Total models checked: {total}")
if fails:
    for p, m in fails:
        print(f" MISSING [{p}] {m}")
    sys.exit(1)
else:
    print("All models valid ✓")
SCRIPT
uvx --with litellm python /tmp/check_agenta_models.py 2>/dev/null
```
Alternatively, run the pytest unit test directly (requires agenta installed):
```bash
pytest sdk/oss/tests/pytest/unit/test_supported_llm_models.py -v
```
---
## Step 2 — Find models missing from Agenta (big-3 audit)
This script finds models in litellm that Agenta doesn't list yet. It filters out noise
(audio, video, embeddings, codex) and tags dated snapshots and `-latest` aliases rather than reporting them as missing:
```bash
cat > /tmp/find_missing.py << 'SCRIPT'
# /// script
# requires-python = ">=3.11"
# dependencies = ["litellm"]
# ///
import litellm, re

AGENTA_ANTHROPIC = set()  # fill from assets.py (bare names, no prefix)
AGENTA_OPENAI = set()  # fill from assets.py
AGENTA_GEMINI = set()  # fill from assets.py (with gemini/ prefix)

mc = set(litellm.model_cost.keys())

# Substrings that mark non-chat or legacy models
NOISE = [
    "audio", "tts", "speech", "whisper", "transcri", "realtime", "diarize",
    "dall-e", "image", "video", "veo", "embed", "moderat", "search",
    "babbage", "davinci", "ada", "instruct", "codex", "computer-use",
    "robotics", "learnlm", "gemma", "live", "v1:0",
]
KEEP = {"gpt-4o", "gpt-4o-mini"}  # never treated as noise
DATED = re.compile(r"-\d{4}-\d{2}-\d{2}$")
EXP = re.compile(r"exp-\d{4}|\d{2}-\d{2}$")


def noise(m):
    if m in KEEP:
        return False
    return any(kw in m.lower() for kw in NOISE)


def dated(m):
    return bool(DATED.search(m)) or bool(EXP.search(m))


def report(label, candidates, known, prefix=""):
    print(f"\n=== {label} ===")
    for m in sorted(candidates):
        bare = m[len(prefix):] if prefix else m
        if bare in known or m in known:
            continue
        if dated(m):
            tag = "[dated/exp]"
        elif m.endswith("-latest"):
            tag = "[alias]"
        else:
            tag = "*** MISSING ***"
        print(f" {m} {tag}")


# Anthropic
report(
    "ANTHROPIC",
    [m for m in mc if m.startswith("claude-") and not noise(m)],
    AGENTA_ANTHROPIC,
)

# OpenAI (no slash, starts with gpt- / o1 / o3 / o4 / chatgpt)
OAI = [
    m for m in mc
    if any(m.startswith(p) for p in ("gpt-", "o1", "o3", "o4", "chatgpt"))
    and "/" not in m and not noise(m)
]
report("OPENAI", OAI, AGENTA_OPENAI)

# Gemini
report(
    "GEMINI",
    [m for m in mc if m.startswith("gemini/") and not noise(m)],
    AGENTA_GEMINI,
    prefix="gemini/",
)
SCRIPT
uvx --with litellm python /tmp/find_missing.py 2>/dev/null
```
**Fill in the `AGENTA_*` sets from the current `assets.py`** before running.
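Instead of copying names by hand, the three sets could be derived from the dict itself. The sketch below assumes `supported_llm_models` maps provider keys to lists of model strings, as the Step 1 script implies; the excerpt, the `strip_prefix` helper, and the `*-example-*` names are hypothetical.

```python
# Hypothetical excerpt; in practice import or paste the real dict
# from sdk/agenta/sdk/assets.py.
supported_llm_models = {
    "anthropic": ["anthropic/claude-example-large"],  # placeholder name
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "gemini": ["gemini/gemini-example-pro"],  # placeholder name
}


def strip_prefix(model, prefix):
    """Drop a leading provider prefix if present, else return the name as-is."""
    return model[len(prefix):] if model.startswith(prefix) else model


# Anthropic: bare names, because litellm keys them as claude-*
AGENTA_ANTHROPIC = {
    strip_prefix(m, "anthropic/") for m in supported_llm_models["anthropic"]
}
# OpenAI: names carry no prefix
AGENTA_OPENAI = set(supported_llm_models["openai"])
# Gemini: keep the gemini/ prefix, matching litellm's keys
AGENTA_GEMINI = set(supported_llm_models["gemini"])
```

Deriving the sets this way keeps the audit in step with `assets.py` after each edit.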
---
## Step 3 — Edit `assets.py`
File: `sdk/agenta/sdk/assets.py`
- Add models inside the correct provider list, newest first.
- For **Gemini 1.5** models (still widely used): add under `"gemini"`.
- For **OpenAI o-series pro tiers** (`o1-pro`, `o3-pro`): add after their base model.
- For **Groq**: always cross-check `litellm.groq_models` — Groq rotates its model catalogue frequently.
- For **DeepInfra / Together AI**: check `litellm.deepinfra_models` / `litellm.together_ai_models` for current names.
### Provider prefix conventions
| Provider key | Agenta prefix | litellm cost key prefix |
|---|---|---|
| `anthropic` | `anthropic/` | `claude-` (no prefix) |
| `cohere` | `cohere/` | `command-` (no prefix) |
| `gemini` | `gemini/` | `gemini/` |
| `groq` | `groq/` | `groq/` |
| `mistral` | `mistral/` | `mistral/` |
| `openai` | _(none)_ | _(none)_ |
| `openrouter` | `openrouter/` | `openrouter/` |
| `perplexityai` | `perplexity/` | `perplexity/` |
| `together_ai` | `together_ai/` | `together_ai/` |
| `deepinfra` | `deepinfra/` | `deepinfra/` |
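The table above can be encoded as data so that a misplaced or mis-prefixed model is caught before the litellm check runs. This is a minimal sketch: the helper names are hypothetical, and treating any `/` in an OpenAI name as an error is an assumption drawn from the "(none)" row.

```python
# Provider key -> prefix Agenta expects on each model string (from the table).
AGENTA_PREFIX = {
    "anthropic": "anthropic/",
    "cohere": "cohere/",
    "gemini": "gemini/",
    "groq": "groq/",
    "mistral": "mistral/",
    "openai": "",
    "openrouter": "openrouter/",
    "perplexityai": "perplexity/",
    "together_ai": "together_ai/",
    "deepinfra": "deepinfra/",
}

# Providers whose prefix is stripped before the litellm cost lookup.
STRIPPED_FOR_COST = {"anthropic", "cohere"}


def has_expected_prefix(provider, model):
    """Check a model string against the Agenta prefix for its provider."""
    prefix = AGENTA_PREFIX[provider]
    if prefix:
        return model.startswith(prefix)
    return "/" not in model  # assumption: unprefixed names contain no slash


def cost_key(provider, model):
    """Return the key expected in litellm's cost table for this model."""
    prefix = AGENTA_PREFIX[provider]
    if provider in STRIPPED_FOR_COST and model.startswith(prefix):
        return model[len(prefix):]
    return model
```

Note that the `perplexityai` provider key pairs with the `perplexity/` prefix, so the prefix cannot simply be derived from the key.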
---
## Step 4 — Run ruff then the test
```bash
# Format + lint
uvx --from ruff==0.14.0 ruff format sdk/agenta/sdk/assets.py
uvx --from ruff==0.14.0 ruff check --fix sdk/agenta/sdk/assets.py
# Validate all models against litellm (agenta must be importable unless the dict was pasted into the script)
uvx --with litellm python /tmp/check_agenta_models.py 2>/dev/null
```
All checks must pass before committing.
---
## Related files
| File | Purpose |
|---|---|
| `sdk/agenta/sdk/assets.py` | Canonical model list + cost metadata builder |
| `sdk/oss/tests/pytest/unit/test_supported_llm_models.py` | Pytest guard (parametrized per model) |
| `api/oss/src/core/secrets/enums.py` | Provider keys — must stay in sync |
| `api/oss/src/resources/evaluators/evaluators.py` | Separate (shorter) model list for evaluator dropdown |