update-llm-model-list

Audit and update the supported LLM model list in assets.py against litellm's registry (models.litellm.ai). Use when adding new models, pruning outdated ones, or verifying the list is correct.

Best use case

update-llm-model-list is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using update-llm-model-list should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/update-llm-model-list/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/update-llm-model-list/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/update-llm-model-list/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How update-llm-model-list Compares

| Feature / Agent | update-llm-model-list | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

SKILL.md Source

# Update LLM Model List

## Overview

The canonical model list lives in `sdk/agenta/sdk/assets.py` → `supported_llm_models`.
It drives the model dropdown in the playground, cost metadata, and the `model_to_provider_mapping`.
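
As a rough illustration of how a single dict can drive a model-to-provider mapping, the sketch below inverts a small stand-in for `supported_llm_models`. The sample entries and the helper name `build_model_to_provider` are assumptions made for this example; the actual construction in `assets.py` may differ.

```python
# Hypothetical stand-in for supported_llm_models; the real dict lives in assets.py.
sample_models = {
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "anthropic": ["anthropic/claude-3-5-sonnet-20241022"],
    "gemini": ["gemini/gemini-1.5-pro"],
}


def build_model_to_provider(models_by_provider):
    """Invert {provider: [models]} into {model: provider}."""
    mapping = {}
    for provider, models in models_by_provider.items():
        for model in models:
            # A model listed under two providers would be ambiguous, so fail loudly.
            if model in mapping:
                raise ValueError(
                    f"{model} listed under {mapping[model]} and {provider}"
                )
            mapping[model] = provider
    return mapping


model_to_provider = build_model_to_provider(sample_models)
print(model_to_provider["gpt-4o"])
```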

The authoritative external source is **`litellm.model_cost`** (2,600+ entries), which mirrors
<https://models.litellm.ai/>.

A pytest guard lives at:
`sdk/oss/tests/pytest/unit/test_supported_llm_models.py`

---

## Key rules

1. **Every model must exist in `litellm.model_cost`** (direct key, or with provider prefix stripped).
   - `anthropic/claude-*` → litellm stores as `claude-*` (prefix is intentional for routing, stripped for cost lookup)
   - `cohere/command-*` → litellm stores as `command-*`
   - All other providers keep their full prefix (e.g. `gemini/`, `groq/`, `together_ai/`)
2. **Provider key** (`"anthropic"`, `"gemini"`, …) must match the Secrets API enum in
   `api/oss/src/core/secrets/enums.py` (`StandardProviderKind`).
3. **No duplicates** within a provider list.
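
The three rules above can be sketched as one small checker. This is a minimal illustration, not the project's actual test: the names `cost_key` and `check_rules` are hypothetical, and the toy data stands in for `set(litellm.model_cost)` and the `StandardProviderKind` values.

```python
from collections import Counter

# Providers whose prefix is stripped for the litellm cost lookup (rule 1).
STRIPPED_PREFIXES = ("anthropic/", "cohere/")


def cost_key(model):
    """Return the key expected in litellm.model_cost for an Agenta model name."""
    for prefix in STRIPPED_PREFIXES:
        if model.startswith(prefix):
            return model[len(prefix):]
    return model


def check_rules(models_by_provider, cost_keys, provider_keys):
    """Return a list of human-readable rule violations (empty if all pass)."""
    problems = []
    for provider, models in models_by_provider.items():
        if provider not in provider_keys:  # rule 2
            problems.append(f"unknown provider key: {provider}")
        for model, count in Counter(models).items():
            if count > 1:  # rule 3
                problems.append(f"duplicate in {provider}: {model}")
            if cost_key(model) not in cost_keys:  # rule 1
                problems.append(f"not in cost registry: {model}")
    return problems


# Toy data only; pass the real registry keys and provider enum values in practice.
toy_costs = {"claude-x", "gemini/gemini-x", "gpt-x"}
toy_models = {
    "anthropic": ["anthropic/claude-x", "anthropic/claude-x"],
    "gemini": ["gemini/gemini-x", "gemini/gemini-y"],
    "acme": ["gpt-x"],
}
toy_problems = check_rules(toy_models, toy_costs, {"anthropic", "gemini", "openai"})
for problem in toy_problems:
    print(problem)
```

Note that this sketch strips the prefix only for `anthropic/` and `cohere/`, which is stricter than a check that strips any prefix before the lookup.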

---

## Step 1 — Check which current models are outdated / wrong

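Run this with `uvx` (no local litellm install needed). The script imports `agenta.sdk.assets`, so either run it where `agenta` is importable or paste the `supported_llm_models` dict into the script in place of the import. The trailing `2>/dev/null` silences warnings but also hides tracebacks, so drop it if the script exits without printing anything: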

```bash
cat > /tmp/check_agenta_models.py << 'SCRIPT'
# /// script
# requires-python = ">=3.11"
# dependencies = ["litellm"]
# ///
import litellm, sys

# Import the list (agenta must be importable), or paste the dict here instead.
from agenta.sdk.assets import supported_llm_models

mc = set(litellm.model_cost.keys())

def exists(m):
    if m in mc: return True
    if "/" in m and m.split("/", 1)[1] in mc: return True
    return False

fails = []
for provider, models in supported_llm_models.items():
    for model in models:
        if not exists(model):
            fails.append((provider, model))

total = sum(len(v) for v in supported_llm_models.values())
print(f"Total models checked: {total}")
if fails:
    for p, m in fails:
        print(f"  MISSING [{p}] {m}")
    sys.exit(1)
else:
    print("All models valid ✓")
SCRIPT
uvx --with litellm python /tmp/check_agenta_models.py 2>/dev/null
```

Alternatively, run the pytest unit test directly (requires agenta installed):

```bash
pytest sdk/oss/tests/pytest/unit/test_supported_llm_models.py -v
```

---

## Step 2 — Find models missing from Agenta (big-3 audit)

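This script lists Anthropic, OpenAI, and Gemini models that litellm knows but Agenta doesn't list yet.
Noise (audio, video, embeddings, codex, and similar) is filtered out; dated snapshots, experimental builds, and `-latest` aliases are still printed but tagged so they are easy to skip: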

```bash
cat > /tmp/find_missing.py << 'SCRIPT'
# /// script
# requires-python = ">=3.11"
# dependencies = ["litellm"]
# ///
import litellm, re

AGENTA_ANTHROPIC = set()   # fill from assets.py (bare names, no prefix)
AGENTA_OPENAI    = set()   # fill from assets.py
AGENTA_GEMINI    = set()   # fill from assets.py (with gemini/ prefix)

mc = set(litellm.model_cost.keys())

NOISE = [
    "audio","tts","speech","whisper","transcri","realtime","diarize",
    "dall-e","image","video","veo","embed","moderat","search",
    "babbage","davinci","ada","instruct","codex","computer-use",
    "robotics","learnlm","gemma","live","v1:0",
]
KEEP = {"gpt-4o","gpt-4o-mini"}
DATED = re.compile(r"-\d{4}-\d{2}-\d{2}$")
EXP   = re.compile(r"exp-\d{4}|\d{2}-\d{2}$")

def noise(m):
    if m in KEEP: return False
    return any(kw in m.lower() for kw in NOISE)

def dated(m):
    return bool(DATED.search(m)) or bool(EXP.search(m))

def report(label, candidates, known, prefix=""):
    print(f"\n=== {label} ===")
    for m in sorted(candidates):
        bare = m[len(prefix):] if prefix else m
        if bare in known or m in known: continue
        tag = "[dated/exp]" if dated(m) else "[alias]" if m.endswith("-latest") else "*** MISSING ***"
        print(f"  {m}  {tag}")

# Anthropic
report("ANTHROPIC", [m for m in mc if m.startswith("claude-") and not noise(m)],
       AGENTA_ANTHROPIC)

# OpenAI (no slash, starts with gpt- / o1 / o3 / o4 / chatgpt)
OAI = [m for m in mc if any(m.startswith(p) for p in ("gpt-","o1","o3","o4","chatgpt"))
       and "/" not in m and not noise(m)]
report("OPENAI", OAI, AGENTA_OPENAI)

# Gemini
report("GEMINI", [m for m in mc if m.startswith("gemini/") and not noise(m)],
       AGENTA_GEMINI, prefix="gemini/")
SCRIPT
uvx --with litellm python /tmp/find_missing.py 2>/dev/null
```

**Fill in the `AGENTA_*` sets from the current `assets.py`** before running.
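
Rather than copying names by hand, the three sets could be derived from the dict itself. The helper below is a sketch under the assumption that the dict is keyed by `"anthropic"`, `"openai"`, and `"gemini"`; the function name `agenta_sets` and the placeholder entries are hypothetical.

```python
def agenta_sets(models_by_provider):
    """Build the three comparison sets from a {provider: [models]} dict."""
    prefix = "anthropic/"
    # Anthropic names are compared bare, so drop the routing prefix.
    anthropic = {
        m[len(prefix):] if m.startswith(prefix) else m
        for m in models_by_provider.get("anthropic", [])
    }
    openai = set(models_by_provider.get("openai", []))
    # Gemini names keep their gemini/ prefix.
    gemini = set(models_by_provider.get("gemini", []))
    return anthropic, openai, gemini


# Placeholder entries; replace with supported_llm_models from assets.py.
sample = {
    "anthropic": ["anthropic/claude-example"],
    "openai": ["gpt-example"],
    "gemini": ["gemini/gemini-example"],
}
AGENTA_ANTHROPIC, AGENTA_OPENAI, AGENTA_GEMINI = agenta_sets(sample)
print(AGENTA_ANTHROPIC, AGENTA_OPENAI, AGENTA_GEMINI)
```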

---

## Step 3 — Edit `assets.py`

File: `sdk/agenta/sdk/assets.py`

- Add models inside the correct provider list, newest first.
- For **Gemini 1.5** models (still widely used): add under `"gemini"`.
- For **OpenAI o-series pro tiers** (`o1-pro`, `o3-pro`): add after their base model.
- For **Groq**: always cross-check `litellm.groq_models` — Groq rotates its model catalogue frequently.
- For **DeepInfra / Together AI**: check `litellm.deepinfra_models` / `litellm.together_ai_models` for current names.
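
The cross-checks in the last two bullets amount to a set comparison. The sketch below uses placeholder names throughout; in practice the inputs would be something like `supported_llm_models["groq"]` and `litellm.groq_models`, and whether the catalogue entries carry the `groq/` prefix is an assumption to verify before comparing.

```python
def stale_and_new(listed, catalogue):
    """Compare an Agenta provider list with a provider catalogue.

    Returns (stale, new): names listed in Agenta but absent from the
    catalogue, and names in the catalogue that Agenta does not list.
    """
    listed_set, catalogue_set = set(listed), set(catalogue)
    return sorted(listed_set - catalogue_set), sorted(catalogue_set - listed_set)


# Placeholder names for illustration only.
listed = ["groq/model-a", "groq/model-b"]
catalogue = ["groq/model-b", "groq/model-c"]
stale, new = stale_and_new(listed, catalogue)
print("stale:", stale)
print("new:", new)
```

Entries reported as stale are candidates for pruning; entries reported as new still need the noise and snapshot judgment from Step 2 before being added.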

### Provider prefix conventions

| Provider key | Agenta prefix | litellm cost key prefix |
|---|---|---|
| `anthropic` | `anthropic/` | `claude-` (no prefix) |
| `cohere` | `cohere/` | `command-` (no prefix) |
| `gemini` | `gemini/` | `gemini/` |
| `groq` | `groq/` | `groq/` |
| `mistral` | `mistral/` | `mistral/` |
| `openai` | _(none)_ | _(none)_ |
| `openrouter` | `openrouter/` | `openrouter/` |
| `perplexityai` | `perplexity/` | `perplexity/` |
| `together_ai` | `together_ai/` | `together_ai/` |
| `deepinfra` | `deepinfra/` | `deepinfra/` |
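
The Agenta-side column of the table can be turned into a quick prefix check. This is an illustrative sketch: the function name `wrong_prefix` is hypothetical, the sample names are placeholders, and treating any slash in an `openai` name as an error is an assumption.

```python
# Agenta-side prefix per provider key, transcribed from the table above.
AGENTA_PREFIX = {
    "anthropic": "anthropic/",
    "cohere": "cohere/",
    "gemini": "gemini/",
    "groq": "groq/",
    "mistral": "mistral/",
    "openai": "",
    "openrouter": "openrouter/",
    "perplexityai": "perplexity/",
    "together_ai": "together_ai/",
    "deepinfra": "deepinfra/",
}


def wrong_prefix(models_by_provider):
    """Return (provider, model) pairs whose name lacks the expected prefix."""
    bad = []
    for provider, models in models_by_provider.items():
        expected = AGENTA_PREFIX.get(provider)
        if expected is None:
            continue  # provider not covered by the table
        for model in models:
            if expected:
                ok = model.startswith(expected)
            else:
                ok = "/" not in model  # assumption: openai names carry no prefix
            if not ok:
                bad.append((provider, model))
    return bad


# Placeholder names for illustration only.
sample = {
    "perplexityai": ["perplexity/model-a", "perplexityai/model-b"],
    "openai": ["gpt-example", "openai/gpt-example"],
}
bad_pairs = wrong_prefix(sample)
print(bad_pairs)
```

The `perplexityai` row is the one most likely to trip this check, since its provider key and its prefix differ.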

---

## Step 4 — Run ruff then the test

```bash
# Format + lint
uvx --from ruff==0.14.0 ruff format sdk/agenta/sdk/assets.py
uvx --from ruff==0.14.0 ruff check --fix sdk/agenta/sdk/assets.py

# Validate all models against litellm (reuses the Step 1 script; no agenta install needed if the dict was pasted in)
uvx --with litellm python /tmp/check_agenta_models.py 2>/dev/null
```

All checks must pass before committing.

---

## Related files

| File | Purpose |
|---|---|
| `sdk/agenta/sdk/assets.py` | Canonical model list + cost metadata builder |
| `sdk/oss/tests/pytest/unit/test_supported_llm_models.py` | Pytest guard (parametrized per model) |
| `api/oss/src/core/secrets/enums.py` | Provider keys — must stay in sync |
| `api/oss/src/resources/evaluators/evaluators.py` | Separate (shorter) model list for evaluator dropdown |
