when-developing-ml-models-use-ml-expert

Specialized ML model development, training, and deployment workflow

242 stars

Best use case

when-developing-ml-models-use-ml-expert is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Specialized ML model development, training, and deployment workflow

Specialized ML model development, training, and deployment workflow

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "when-developing-ml-models-use-ml-expert" skill to help with this workflow task. Context: Specialized ML model development, training, and deployment workflow

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
Use it when you already have the supporting tools or dependencies needed by the workflow.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/when-developing-ml-models-use-ml-expert/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/dnyoussef/when-developing-ml-models-use-ml-expert/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/when-developing-ml-models-use-ml-expert/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How when-developing-ml-models-use-ml-expert Compares

Feature / Agent	when-developing-ml-models-use-ml-expert	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Specialized ML model development, training, and deployment workflow

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

Cursor vs Codex for AI Workflows

Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.

SKILL.md Source

# ML Expert - Machine Learning Model Development

## Overview

Specialized workflow for ML model development, training, and deployment. Supports various architectures (CNNs, RNNs, Transformers) with distributed training capabilities.

## When to Use

- Developing new ML models
- Training neural networks
- Model optimization
- Production deployment
- Transfer learning
- Fine-tuning existing models

## Phase 1: Data Preparation (10 min)

### Objective
Clean, preprocess, and prepare training data

### Agent: ML-Developer

**Step 1.1: Load and Analyze Data**
```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Load data
data = pd.read_csv('dataset.csv')

# Analyze
analysis = {
    'shape': data.shape,
    'columns': data.columns.tolist(),
    'dtypes': data.dtypes.to_dict(),
    'missing': data.isnull().sum().to_dict(),
    'stats': data.describe().to_dict()
}

# Store analysis
await memory.store('ml-expert/data-analysis', analysis)
```

**Step 1.2: Data Cleaning**
```python
# Handle missing values
data = data.fillna(data.mean())

# Remove duplicates
data = data.drop_duplicates()

# Handle outliers
from scipy import stats
z_scores = np.abs(stats.zscore(data.select_dtypes(include=[np.number])))
data = data[(z_scores < 3).all(axis=1)]

# Encode categorical variables
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for col in data.select_dtypes(include=['object']).columns:
    data[col] = le.fit_transform(data[col])
```

**Step 1.3: Split Data**
```python
# Split into train/val/test
X = data.drop('target', axis=1)
y = data['target']

X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Normalize
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)

# Save preprocessed data
np.save('data/X_train.npy', X_train)
np.save('data/X_val.npy', X_val)
np.save('data/X_test.npy', X_test)
```

### Validation Criteria
- [ ] Data loaded successfully
- [ ] Missing values handled
- [ ] Train/val/test split created
- [ ] Normalization applied

## Phase 2: Model Selection (10 min)

### Objective
Choose and configure model architecture

### Agent: Researcher

**Step 2.1: Analyze Task Type**
```python
task_analysis = {
    'type': 'classification|regression|clustering',
    'complexity': 'low|medium|high',
    'dataSize': len(data),
    'features': X.shape[1],
    'classes': len(np.unique(y)) if classification else None
}

# Recommend architecture
if task_analysis['type'] == 'classification' and task_analysis['features'] > 100:
    recommended_architecture = 'deep_neural_network'
elif task_analysis['type'] == 'regression' and task_analysis['dataSize'] < 10000:
    recommended_architecture = 'random_forest'
# ... more logic
```

**Step 2.2: Define Architecture**
```python
import tensorflow as tf

def create_model(architecture='dnn', input_shape, num_classes):
    if architecture == 'dnn':
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(256, activation='relu', input_shape=input_shape),
            tf.keras.layers.Dropout(0.3),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dropout(0.3),
            tf.keras.layers.Dense(64, activation='relu'),
            tf.keras.layers.Dense(num_classes, activation='softmax')
        ])
    elif architecture == 'cnn':
        model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=input_shape),
            tf.keras.layers.MaxPooling2D((2,2)),
            tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
            tf.keras.layers.MaxPooling2D((2,2)),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dense(num_classes, activation='softmax')
        ])
    # ... more architectures

    return model

model = create_model('dnn', input_shape=(X_train.shape[1],), num_classes=len(np.unique(y)))
model.summary()
```

**Step 2.3: Configure Training**
```python
training_config = {
    'optimizer': tf.keras.optimizers.Adam(learning_rate=0.001),
    'loss': 'sparse_categorical_crossentropy',
    'metrics': ['accuracy'],
    'batch_size': 32,
    'epochs': 100,
    'callbacks': [
        tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5),
        tf.keras.callbacks.ModelCheckpoint('best_model.h5', save_best_only=True)
    ]
}

model.compile(
    optimizer=training_config['optimizer'],
    loss=training_config['loss'],
    metrics=training_config['metrics']
)
```

### Validation Criteria
- [ ] Task analyzed
- [ ] Architecture selected
- [ ] Model configured
- [ ] Training parameters set

## Phase 3: Train Model (20 min)

### Objective
Execute training with monitoring

### Agent: ML-Developer

**Step 3.1: Start Training**
```python
# Train model
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    batch_size=training_config['batch_size'],
    epochs=training_config['epochs'],
    callbacks=training_config['callbacks'],
    verbose=1
)

# Save training history
import json
with open('training_history.json', 'w') as f:
    json.dump({
        'loss': history.history['loss'],
        'val_loss': history.history['val_loss'],
        'accuracy': history.history['accuracy'],
        'val_accuracy': history.history['val_accuracy']
    }, f)
```

**Step 3.2: Monitor Training**
```python
import matplotlib.pyplot as plt

# Plot training curves
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# Loss
ax1.plot(history.history['loss'], label='Train Loss')
ax1.plot(history.history['val_loss'], label='Val Loss')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.legend()
ax1.grid(True)

# Accuracy
ax2.plot(history.history['accuracy'], label='Train Accuracy')
ax2.plot(history.history['val_accuracy'], label='Val Accuracy')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.legend()
ax2.grid(True)

plt.savefig('training_curves.png')
```

**Step 3.3: Distributed Training (Optional)**
```python
# Using Flow-Nexus for distributed training
from flow_nexus import DistributedTrainer

trainer = DistributedTrainer({
    'cluster_id': 'ml-training-cluster',
    'num_nodes': 4,
    'strategy': 'data_parallel'
})

# Train across multiple nodes
trainer.fit(
    model=model,
    train_data=(X_train, y_train),
    val_data=(X_val, y_val),
    config=training_config
)
```

### Validation Criteria
- [ ] Training completed
- [ ] No NaN losses
- [ ] Validation metrics improving
- [ ] Model checkpoints saved

## Phase 4: Validate Performance (10 min)

### Objective
Evaluate model on test set

### Agent: Tester

**Step 4.1: Evaluate on Test Set**
```python
# Load best model
best_model = tf.keras.models.load_model('best_model.h5')

# Evaluate
test_loss, test_accuracy = best_model.evaluate(X_test, y_test, verbose=0)

# Predictions
y_pred = best_model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)

# Detailed metrics
from sklearn.metrics import classification_report, confusion_matrix

metrics = {
    'test_loss': float(test_loss),
    'test_accuracy': float(test_accuracy),
    'classification_report': classification_report(y_test, y_pred_classes, output_dict=True),
    'confusion_matrix': confusion_matrix(y_test, y_pred_classes).tolist()
}

await memory.store('ml-expert/metrics', metrics)
```

**Step 4.2: Generate Evaluation Report**
```python
report = f"""
# Model Evaluation Report

## Performance Metrics
- Test Loss: {test_loss:.4f}
- Test Accuracy: {test_accuracy:.4f}

## Classification Report
{classification_report(y_test, y_pred_classes)}

## Model Summary
- Architecture: {recommended_architecture}
- Parameters: {model.count_params()}
- Training Time: {training_time} seconds

## Training History
- Best Val Loss: {min(history.history['val_loss']):.4f}
- Best Val Accuracy: {max(history.history['val_accuracy']):.4f}
- Epochs Trained: {len(history.history['loss'])}
"""

with open('evaluation_report.md', 'w') as f:
    f.write(report)
```

### Validation Criteria
- [ ] Test accuracy > target threshold
- [ ] No overfitting detected
- [ ] Metrics documented
- [ ] Report generated

## Phase 5: Deploy to Production (15 min)

### Objective
Package model for deployment

### Agent: ML-Developer

**Step 5.1: Export Model**
```python
# Save in multiple formats
model.save('model.h5')  # Keras format
model.save('model_savedmodel')  # TensorFlow SavedModel
# Convert to TFLite for mobile
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# Save preprocessing pipeline
import joblib
joblib.dump(scaler, 'scaler.pkl')
```

**Step 5.2: Create Deployment Package**
```python
import shutil
import os

# Create deployment directory
os.makedirs('deployment', exist_ok=True)

# Copy necessary files
shutil.copy('model.h5', 'deployment/')
shutil.copy('scaler.pkl', 'deployment/')
shutil.copy('evaluation_report.md', 'deployment/')

# Create inference script
inference_script = '''
import tensorflow as tf
import joblib
import numpy as np

class ModelInference:
    def __init__(self, model_path, scaler_path):
        self.model = tf.keras.models.load_model(model_path)
        self.scaler = joblib.load(scaler_path)

    def predict(self, input_data):
        # Preprocess
        scaled = self.scaler.transform(input_data)
        # Predict
        predictions = self.model.predict(scaled)
        return np.argmax(predictions, axis=1)

# Usage
inference = ModelInference('model.h5', 'scaler.pkl')
result = inference.predict(new_data)
'''

with open('deployment/inference.py', 'w') as f:
    f.write(inference_script)
```

**Step 5.3: Generate Documentation**
```markdown
# Model Deployment Guide

## Files
- `model.h5`: Trained Keras model
- `scaler.pkl`: Preprocessing scaler
- `inference.py`: Inference script

## Usage
\`\`\`python
from inference import ModelInference

model = ModelInference('model.h5', 'scaler.pkl')
predictions = model.predict(new_data)
\`\`\`

## Performance
- Latency: < 50ms per prediction
- Accuracy: ${test_accuracy}
- Model Size: ${model_size}MB

## Requirements
- tensorflow>=2.0
- scikit-learn
- numpy
```

### Validation Criteria
- [ ] Model exported successfully
- [ ] Deployment package created
- [ ] Inference script tested
- [ ] Documentation complete

## Success Metrics

- Test accuracy meets target
- Training converged
- No overfitting
- Production-ready deployment

## Skill Completion

Outputs:
1. **model.h5**: Trained model file
2. **evaluation_report.md**: Performance metrics
3. **deployment/**: Production package
4. **training_history.json**: Training logs

Complete when model deployed and validated.

Related Skills

typescript-expert

242

from aiskillstore/marketplace

TypeScript and JavaScript expert with deep knowledge of type-level programming, performance optimization, monorepo management, migration strategies, and modern tooling. Use PROACTIVELY for any TypeScript/JavaScript issues including complex type gymnastics, build performance, debugging, and architectural decisions. If a specialized expert is a better fit, I will recommend switching and stop.

threat-modeling-expert

242

from aiskillstore/marketplace

Expert in threat modeling methodologies, security architecture review, and risk assessment. Masters STRIDE, PASTA, attack trees, and security requirement extraction. Use for security architecture reviews, threat identification, and secure-by-design planning.

service-mesh-expert

242

from aiskillstore/marketplace

Expert service mesh architect specializing in Istio, Linkerd, and cloud-native networking patterns. Masters traffic management, security policies, observability integration, and multi-cluster mesh con

pydantic-models-py

242

from aiskillstore/marketplace

Create Pydantic models following the multi-model pattern with Base, Create, Update, Response, and InDB variants. Use when defining API request/response schemas, database models, or data validation in Python applications using Pydantic v2.

prisma-expert

242

from aiskillstore/marketplace

Prisma ORM expert for schema design, migrations, query optimization, relations modeling, and database operations. Use PROACTIVELY for Prisma schema issues, migration problems, query performance, relation design, or database connection issues.

nosql-expert

242

from aiskillstore/marketplace

Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems.

nestjs-expert

242

from aiskillstore/marketplace

Nest.js framework expert specializing in module architecture, dependency injection, middleware, guards, interceptors, testing with Jest/Supertest, TypeORM/Mongoose integration, and Passport.js authentication. Use PROACTIVELY for any Nest.js application issues including architecture decisions, testing strategies, performance optimization, or debugging complex dependency injection problems. If a specialized expert is a better fit, I will recommend switching and stop.

n8n-mcp-tools-expert

242

from aiskillstore/marketplace

Expert guide for using n8n-mcp MCP tools effectively. Use when searching for nodes, validating configurations, accessing templates, managing workflows, or using any n8n-mcp tool. Provides tool selection guidance, parameter formats, and common patterns.

mermaid-expert

242

from aiskillstore/marketplace

Create Mermaid diagrams for flowcharts, sequences, ERDs, and architectures. Masters syntax for all diagram types and styling. Use PROACTIVELY for visual documentation, system diagrams, or process flows.

laravel-expert

242

from aiskillstore/marketplace

Senior Laravel Engineer role for production-grade, maintainable, and idiomatic Laravel solutions. Focuses on clean architecture, security, performance, and modern standards (Laravel 10/11+).

kotlin-coroutines-expert

242

from aiskillstore/marketplace

Expert patterns for Kotlin Coroutines and Flow, covering structured concurrency, error handling, and testing.

computer-vision-expert

242

from aiskillstore/marketplace

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.