model-selection
Automatically applies when choosing LLM models and providers. Ensures proper model comparison, provider selection, cost optimization, fallback patterns, and multi-model strategies.
Best use case
model-selection is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Automatically applies when choosing LLM models and providers. Ensures proper model comparison, provider selection, cost optimization, fallback patterns, and multi-model strategies.
Teams using model-selection should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/model-selection-majiayu000/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How model-selection Compares
| Feature / Agent | model-selection | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Automatically applies when choosing LLM models and providers. Ensures proper model comparison, provider selection, cost optimization, fallback patterns, and multi-model strategies.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Model Selection and Provider Management
When selecting LLM models and managing providers, follow these patterns for optimal cost, performance, and reliability.
**Trigger Keywords**: model selection, provider, model comparison, fallback, OpenAI, Anthropic, model routing, cost optimization, model capabilities, provider failover
**Agent Integration**: Used by `ml-system-architect`, `performance-and-cost-engineer-llm`, `llm-app-engineer`
## ✅ Correct Pattern: Model Registry
```python
from typing import Optional, Dict, List
from pydantic import BaseModel, Field
from enum import Enum
class ModelProvider(str, Enum):
"""Supported LLM providers."""
ANTHROPIC = "anthropic"
OPENAI = "openai"
GOOGLE = "google"
LOCAL = "local"
class ModelCapabilities(BaseModel):
"""Model capabilities and constraints."""
max_context_tokens: int
max_output_tokens: int
supports_streaming: bool = True
supports_function_calling: bool = False
supports_vision: bool = False
supports_json_mode: bool = False
class ModelPricing(BaseModel):
"""Model pricing information."""
input_price_per_mtok: float # USD per million tokens
output_price_per_mtok: float
cache_write_price_per_mtok: Optional[float] = None
cache_read_price_per_mtok: Optional[float] = None
class ModelConfig(BaseModel):
"""Complete model configuration."""
id: str
name: str
provider: ModelProvider
capabilities: ModelCapabilities
pricing: ModelPricing
recommended_use_cases: List[str] = Field(default_factory=list)
quality_tier: str # "flagship", "balanced", "fast"
class ModelRegistry:
"""Registry of available models with metadata."""
def __init__(self):
self.models: Dict[str, ModelConfig] = {}
self._register_default_models()
def _register_default_models(self):
"""Register commonly used models."""
# Claude models
self.register(ModelConfig(
id="claude-sonnet-4-20250514",
name="Claude Sonnet 4",
provider=ModelProvider.ANTHROPIC,
capabilities=ModelCapabilities(
max_context_tokens=200_000,
max_output_tokens=8_192,
supports_streaming=True,
supports_vision=True,
supports_json_mode=True
),
pricing=ModelPricing(
input_price_per_mtok=3.00,
output_price_per_mtok=15.00,
cache_write_price_per_mtok=3.75,
cache_read_price_per_mtok=0.30
),
recommended_use_cases=[
"complex reasoning",
"long context",
"code generation"
],
quality_tier="flagship"
))
self.register(ModelConfig(
id="claude-haiku-3-5-20250514",
name="Claude Haiku 3.5",
provider=ModelProvider.ANTHROPIC,
capabilities=ModelCapabilities(
max_context_tokens=200_000,
max_output_tokens=8_192,
supports_streaming=True,
supports_vision=True
),
pricing=ModelPricing(
input_price_per_mtok=0.80,
output_price_per_mtok=4.00
),
recommended_use_cases=[
"high throughput",
"cost-sensitive",
"simple tasks"
],
quality_tier="fast"
))
# OpenAI models
self.register(ModelConfig(
id="gpt-4-turbo",
name="GPT-4 Turbo",
provider=ModelProvider.OPENAI,
capabilities=ModelCapabilities(
max_context_tokens=128_000,
max_output_tokens=4_096,
supports_streaming=True,
supports_function_calling=True,
supports_vision=True,
supports_json_mode=True
),
pricing=ModelPricing(
input_price_per_mtok=10.00,
output_price_per_mtok=30.00
),
recommended_use_cases=[
"function calling",
"complex reasoning",
"structured output"
],
quality_tier="flagship"
))
def register(self, model: ModelConfig):
"""Register a model."""
self.models[model.id] = model
def get(self, model_id: str) -> Optional[ModelConfig]:
"""Get model by ID."""
return self.models.get(model_id)
def find_by_criteria(
self,
max_cost_per_mtok: Optional[float] = None,
min_context_tokens: Optional[int] = None,
requires_streaming: bool = False,
requires_vision: bool = False,
quality_tier: Optional[str] = None,
provider: Optional[ModelProvider] = None
) -> List[ModelConfig]:
"""
Find models matching criteria.
Args:
max_cost_per_mtok: Maximum input cost
min_context_tokens: Minimum context window
requires_streaming: Must support streaming
requires_vision: Must support vision
quality_tier: Quality tier filter
provider: Provider filter
Returns:
List of matching models
"""
matches = []
for model in self.models.values():
# Check cost
if max_cost_per_mtok is not None:
if model.pricing.input_price_per_mtok > max_cost_per_mtok:
continue
# Check context
if min_context_tokens is not None:
if model.capabilities.max_context_tokens < min_context_tokens:
continue
# Check capabilities
if requires_streaming and not model.capabilities.supports_streaming:
continue
if requires_vision and not model.capabilities.supports_vision:
continue
# Check tier
if quality_tier and model.quality_tier != quality_tier:
continue
# Check provider
if provider and model.provider != provider:
continue
matches.append(model)
# Sort by cost (cheapest first)
matches.sort(key=lambda m: m.pricing.input_price_per_mtok)
return matches
# Usage
registry = ModelRegistry()
# Find cost-effective models with vision
models = registry.find_by_criteria(
max_cost_per_mtok=5.00,
requires_vision=True
)
for model in models:
print(f"{model.name}: ${model.pricing.input_price_per_mtok}/MTok")
```
## Model Router
```python
import asyncio
from typing import Callable, Optional
class ModelRouter:
"""Route requests to appropriate models based on task."""
def __init__(self, registry: ModelRegistry):
self.registry = registry
self.routing_rules = []
def add_rule(
self,
name: str,
condition: Callable[[str], bool],
model_id: str,
priority: int = 0
):
"""
Add routing rule.
Args:
name: Rule name
condition: Function that checks if rule applies
model_id: Model to route to
priority: Higher priority rules checked first
"""
self.routing_rules.append({
"name": name,
"condition": condition,
"model_id": model_id,
"priority": priority
})
# Sort by priority
self.routing_rules.sort(key=lambda r: r["priority"], reverse=True)
def route(self, prompt: str) -> str:
"""
Determine which model to use for prompt.
Args:
prompt: Input prompt
Returns:
Model ID to use
"""
for rule in self.routing_rules:
if rule["condition"](prompt):
return rule["model_id"]
# Default fallback
return "claude-sonnet-4-20250514"
# Example routing rules
router = ModelRouter(registry)
# Route simple tasks to fast model
router.add_rule(
name="simple_tasks",
condition=lambda p: len(p) < 100 and "?" in p,
model_id="claude-haiku-3-5-20250514",
priority=1
)
# Route code to Sonnet
router.add_rule(
name="code_tasks",
condition=lambda p: any(kw in p.lower() for kw in ["code", "function", "class"]),
model_id="claude-sonnet-4-20250514",
priority=2
)
# Route long context to Claude
router.add_rule(
name="long_context",
condition=lambda p: len(p) > 50_000,
model_id="claude-sonnet-4-20250514",
priority=3
)
# Use router
prompt = "Write a Python function to sort a list"
model_id = router.route(prompt)
print(f"Using model: {model_id}")
```
## Fallback Chain
```python
from typing import List, Optional
import logging
logger = logging.getLogger(__name__)
class FallbackChain:
"""Implement fallback chain for reliability."""
def __init__(
self,
primary_model: str,
fallback_models: List[str],
registry: ModelRegistry
):
self.primary_model = primary_model
self.fallback_models = fallback_models
self.registry = registry
async def complete_with_fallback(
self,
prompt: str,
clients: Dict[str, any],
**kwargs
) -> Dict[str, any]:
"""
Try primary model, fallback on failure.
Args:
prompt: Input prompt
clients: Dict mapping provider to client
**kwargs: Additional model parameters
Returns:
Dict with response and metadata
"""
models_to_try = [self.primary_model] + self.fallback_models
last_error = None
for model_id in models_to_try:
model_config = self.registry.get(model_id)
if not model_config:
logger.warning(f"Model {model_id} not in registry")
continue
try:
logger.info(f"Attempting request with {model_id}")
# Get client for provider
client = clients.get(model_config.provider.value)
if not client:
logger.warning(f"No client for {model_config.provider}")
continue
# Make request
response = await client.complete(
prompt,
model=model_id,
**kwargs
)
return {
"response": response,
"model_used": model_id,
"fallback_occurred": model_id != self.primary_model
}
except Exception as e:
logger.error(f"Request failed for {model_id}: {e}")
last_error = e
continue
# All models failed
raise Exception(f"All models failed. Last error: {last_error}")
# Usage
fallback_chain = FallbackChain(
primary_model="claude-sonnet-4-20250514",
fallback_models=[
"gpt-4-turbo",
"claude-haiku-3-5-20250514"
],
registry=registry
)
result = await fallback_chain.complete_with_fallback(
prompt="What is Python?",
clients={
"anthropic": anthropic_client,
"openai": openai_client
}
)
print(f"Response from: {result['model_used']}")
```
## Cost Optimization
```python
class CostOptimizer:
"""Optimize costs by selecting appropriate models."""
def __init__(self, registry: ModelRegistry):
self.registry = registry
def estimate_cost(
self,
model_id: str,
input_tokens: int,
output_tokens: int
) -> float:
"""
Estimate cost for request.
Args:
model_id: Model identifier
input_tokens: Input token count
output_tokens: Expected output tokens
Returns:
Estimated cost in USD
"""
model = self.registry.get(model_id)
if not model:
raise ValueError(f"Unknown model: {model_id}")
input_cost = (
input_tokens / 1_000_000
) * model.pricing.input_price_per_mtok
output_cost = (
output_tokens / 1_000_000
) * model.pricing.output_price_per_mtok
return input_cost + output_cost
def find_cheapest_model(
self,
input_tokens: int,
output_tokens: int,
quality_tier: Optional[str] = None,
**criteria
) -> tuple[str, float]:
"""
Find cheapest model meeting criteria.
Args:
input_tokens: Input token count
output_tokens: Expected output tokens
quality_tier: Optional quality requirement
**criteria: Additional model criteria
Returns:
Tuple of (model_id, estimated_cost)
"""
# Find matching models
candidates = self.registry.find_by_criteria(
quality_tier=quality_tier,
**criteria
)
if not candidates:
raise ValueError("No models match criteria")
# Calculate costs
costs = [
(
model.id,
self.estimate_cost(model.id, input_tokens, output_tokens)
)
for model in candidates
]
# Return cheapest
return min(costs, key=lambda x: x[1])
def batch_cost_analysis(
self,
requests: List[Dict[str, int]],
model_ids: List[str]
) -> Dict[str, Dict[str, float]]:
"""
Analyze costs for batch of requests across models.
Args:
requests: List of dicts with input_tokens, output_tokens
model_ids: Models to compare
Returns:
Cost breakdown per model
"""
analysis = {}
for model_id in model_ids:
total_cost = 0.0
total_tokens = 0
for req in requests:
cost = self.estimate_cost(
model_id,
req["input_tokens"],
req["output_tokens"]
)
total_cost += cost
total_tokens += req["input_tokens"] + req["output_tokens"]
analysis[model_id] = {
"total_cost_usd": total_cost,
"avg_cost_per_request": total_cost / len(requests),
"total_tokens": total_tokens,
"cost_per_1k_tokens": (total_cost / total_tokens) * 1000
}
return analysis
# Usage
optimizer = CostOptimizer(registry)
# Find cheapest model for task
model_id, cost = optimizer.find_cheapest_model(
input_tokens=1000,
output_tokens=500,
quality_tier="balanced"
)
print(f"Cheapest model: {model_id} (${cost:.4f})")
# Compare costs across models
requests = [
{"input_tokens": 1000, "output_tokens": 500},
{"input_tokens": 2000, "output_tokens": 1000},
]
cost_analysis = optimizer.batch_cost_analysis(
requests,
model_ids=[
"claude-sonnet-4-20250514",
"claude-haiku-3-5-20250514",
"gpt-4-turbo"
]
)
for model_id, stats in cost_analysis.items():
print(f"{model_id}: ${stats['total_cost_usd']:.4f}")
```
## Multi-Model Ensemble
```python
class ModelEnsemble:
"""Ensemble multiple models for improved results."""
def __init__(
self,
models: List[str],
voting_strategy: str = "majority"
):
self.models = models
self.voting_strategy = voting_strategy
async def complete_ensemble(
self,
prompt: str,
clients: Dict[str, any],
registry: ModelRegistry
) -> str:
"""
Get predictions from multiple models and combine.
Args:
prompt: Input prompt
clients: Provider clients
registry: Model registry
Returns:
Combined prediction
"""
# Get predictions from all models
tasks = []
for model_id in self.models:
model_config = registry.get(model_id)
client = clients.get(model_config.provider.value)
task = client.complete(prompt, model=model_id)
tasks.append(task)
predictions = await asyncio.gather(*tasks)
# Combine predictions
if self.voting_strategy == "majority":
return self._majority_vote(predictions)
elif self.voting_strategy == "longest":
return max(predictions, key=len)
elif self.voting_strategy == "first":
return predictions[0]
else:
raise ValueError(f"Unknown strategy: {self.voting_strategy}")
def _majority_vote(self, predictions: List[str]) -> str:
"""Select most common prediction."""
from collections import Counter
counter = Counter(predictions)
return counter.most_common(1)[0][0]
# Usage
ensemble = ModelEnsemble(
models=[
"claude-sonnet-4-20250514",
"gpt-4-turbo",
"claude-opus-4-20250514"
],
voting_strategy="majority"
)
result = await ensemble.complete_ensemble(
prompt="Is Python case-sensitive? Answer yes or no.",
clients=clients,
registry=registry
)
```
## ❌ Anti-Patterns
```python
# ❌ Hardcoded model ID everywhere
response = client.complete("prompt", model="claude-sonnet-4-20250514")
# ✅ Better: Use model registry and routing
model_id = router.route(prompt)
response = client.complete(prompt, model=model_id)
# ❌ No fallback on failure
try:
response = await primary_client.complete(prompt)
except:
raise # App crashes!
# ✅ Better: Implement fallback chain
result = await fallback_chain.complete_with_fallback(prompt, clients)
# ❌ Always using most expensive model
model = "claude-opus-4-20250514" # Always flagship!
# ✅ Better: Route based on task complexity
model_id = router.route(prompt) # Uses appropriate model
# ❌ No cost tracking
response = await client.complete(prompt) # No idea of cost!
# ✅ Better: Track and optimize costs
cost = optimizer.estimate_cost(model_id, input_tokens, output_tokens)
logger.info(f"Request cost: ${cost:.4f}")
```
## Best Practices Checklist
- ✅ Use model registry to centralize model metadata
- ✅ Implement routing to select appropriate models
- ✅ Set up fallback chains for reliability
- ✅ Track and optimize costs per model
- ✅ Consider quality tier for each use case
- ✅ Monitor model performance metrics
- ✅ Use cheaper models for simple tasks
- ✅ Cache model responses when appropriate
- ✅ Test fallback chains regularly
- ✅ Document model selection rationale
- ✅ Review costs regularly and optimize
- ✅ Keep model metadata up to date
## Auto-Apply
When selecting models:
1. Register models in ModelRegistry with capabilities and pricing
2. Implement ModelRouter for task-based routing
3. Set up FallbackChain for reliability
4. Use CostOptimizer to find cost-effective options
5. Track costs per model and request
6. Document routing logic and fallback strategy
7. Monitor model performance and costs
## Related Skills
- `llm-app-architecture` - For LLM integration
- `evaluation-metrics` - For model comparison
- `monitoring-alerting` - For tracking performance
- `performance-profiling` - For optimization
- `prompting-patterns` - For model-specific promptsRelated Skills
projecoes-read-models
Use para criar projeções como 9BOX, dashboards e visões de leitura otimizadas para decisão.
orcaflex-model-generator
Generate OrcaFlex models from templates using component assembly with lookup tables for vessels, risers, materials, and environments.
multi-model-reviewer
協調多個 AI 模型(ChatGPT、Gemini、Codex、QWEN、Claude)進行三角驗證,確保「Specification == Program == Test」一致性。過濾假警報後輸出報告,大幅減少人工介入時間。
modelscope
Use this skill to generate AI images using ModelScope's Tongyi-MAI/Z-Image-Turbo model. Simply describe the image you want and it will be generated. Supports Chinese and English prompts.
modelry-automation
Automate Modelry tasks via Rube MCP (Composio). Always search tools first for current schemas.
math-modeling
本技能应在用户要求"数学建模"、"建模比赛"、"数模论文"、"数学建模竞赛"、"建模分析"、"建模求解"或提及数学建模相关任务时使用。适用于全国大学生数学建模竞赛(CUMCM)、美国大学生数学建模竞赛(MCM/ICM)等各类数学建模比赛。
ios-foundation-models-diag
Use when debugging Foundation Models issues — context exceeded, guardrail violations, slow generation, availability problems, unsupported language, or unexpected output. Systematic diagnostics with production crisis defense.
fair-data-model-assessment
Assess data models against FAIR principles using RDA-FDMM indicators. Use when: (1) Evaluating vendor-delivered data models for FAIR compliance, (2) Reviewing schemas, ontologies, or data dictionaries before integration, (3) Creating FAIR assessment reports for data governance reviews, (4) Preparing data model documentation for enterprise or regulatory standards, (5) Auditing existing data assets for FAIRness gaps. Covers 41 RDA indicators across Findable, Accessible, Interoperable, Reusable dimensions with maturity scoring (0-4 scale).
data-model
Generate comprehensive data model documentation with ERD, DTOs, and data flow diagrams
data-model-creation
Professional rules for AI-driven data modeling and creation. Use this skill when users need to create and manage MySQL databases, design data models using Mermaid ER diagrams, and implement database schemas.
Build Your Model Serving Skill
Create your model-serving skill from Ollama documentation before learning deployment theory
Build Your Model Merging Skill
No description provided.