when-developing-ml-models-use-ml-expert

Specialized ML model development, training, and deployment workflow

25 stars

Best use case

when-developing-ml-models-use-ml-expert is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Specialized ML model development, training, and deployment workflow

Teams using when-developing-ml-models-use-ml-expert should expect more consistent output, faster repeated execution, less prompt rewriting, and better workflow continuity with your supporting tools.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.
  • You already have the supporting tools or dependencies needed by this skill.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/when-developing-ml-models-use-ml-expert/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/aiskillstore/marketplace/dnyoussef/when-developing-ml-models-use-ml-expert/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/when-developing-ml-models-use-ml-expert/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How when-developing-ml-models-use-ml-expert Compares

| Feature / Agent | when-developing-ml-models-use-ml-expert | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Specialized ML model development, training, and deployment workflow

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# ML Expert - Machine Learning Model Development

## Overview

Specialized workflow for ML model development, training, and deployment. Supports various architectures (CNNs, RNNs, Transformers) with distributed training capabilities.

## When to Use

- Developing new ML models
- Training neural networks
- Model optimization
- Production deployment
- Transfer learning
- Fine-tuning existing models

## Phase 1: Data Preparation (10 min)

### Objective
Clean, preprocess, and prepare training data

### Agent: ML-Developer

**Step 1.1: Load and Analyze Data**
```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Load data
data = pd.read_csv('dataset.csv')

# Analyze
analysis = {
    'shape': data.shape,
    'columns': data.columns.tolist(),
    'dtypes': data.dtypes.to_dict(),
    'missing': data.isnull().sum().to_dict(),
    'stats': data.describe().to_dict()
}

# Store analysis in agent memory (assumes the agent runtime provides an async `memory` API)
await memory.store('ml-expert/data-analysis', analysis)
```
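The absolute `missing` counts gathered above are often easier to act on as percentages of total rows; a small helper along these lines (illustrative, not part of the skill itself):

```python
def missing_percent(missing_counts, n_rows):
    """Convert absolute missing-value counts per column to percentages of rows."""
    return {col: 100.0 * n / n_rows for col, n in missing_counts.items()}

# e.g. feed it analysis['missing'] and data.shape[0]
missing_percent({'age': 5, 'income': 0}, 200)
```

Columns with a high missing percentage may warrant dropping rather than imputing.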

**Step 1.2: Data Cleaning**
```python
# Handle missing values (numeric columns only; categorical NaNs need separate handling)
data = data.fillna(data.mean(numeric_only=True))

# Remove duplicates
data = data.drop_duplicates()

# Handle outliers
from scipy import stats
z_scores = np.abs(stats.zscore(data.select_dtypes(include=[np.number])))
data = data[(z_scores < 3).all(axis=1)]

# Encode categorical variables (one fitted encoder per column, kept for reuse at inference)
from sklearn.preprocessing import LabelEncoder
encoders = {}
for col in data.select_dtypes(include=['object']).columns:
    encoders[col] = LabelEncoder()
    data[col] = encoders[col].fit_transform(data[col])
```

**Step 1.3: Split Data**
```python
# Split into train/val/test
X = data.drop('target', axis=1)
y = data['target']

X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Normalize
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)

# Save preprocessed data (features and targets)
np.save('data/X_train.npy', X_train)
np.save('data/X_val.npy', X_val)
np.save('data/X_test.npy', X_test)
np.save('data/y_train.npy', y_train)
np.save('data/y_val.npy', y_val)
np.save('data/y_test.npy', y_test)
```
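The two-stage split above yields a 70/15/15 train/val/test ratio; a quick arithmetic check of the proportions (sample count assumed, rounding aside):

```python
n = 1000
n_temp = round(n * 0.3)   # held out by the first split (test_size=0.3)
n_train = n - n_temp      # 70% remains for training
n_val = n_temp // 2       # second split: test_size=0.5 of the holdout
n_test = n_temp - n_val   # the other half of the holdout

print(n_train, n_val, n_test)  # 700 150 150
```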

### Validation Criteria
- [ ] Data loaded successfully
- [ ] Missing values handled
- [ ] Train/val/test split created
- [ ] Normalization applied

## Phase 2: Model Selection (10 min)

### Objective
Choose and configure model architecture

### Agent: Researcher

**Step 2.1: Analyze Task Type**
```python
task_type = 'classification'  # one of: classification | regression | clustering
task_analysis = {
    'type': task_type,
    'complexity': 'medium',  # low | medium | high
    'dataSize': len(data),
    'features': X.shape[1],
    'classes': len(np.unique(y)) if task_type == 'classification' else None
}

# Recommend architecture
if task_analysis['type'] == 'classification' and task_analysis['features'] > 100:
    recommended_architecture = 'deep_neural_network'
elif task_analysis['type'] == 'regression' and task_analysis['dataSize'] < 10000:
    recommended_architecture = 'random_forest'
# ... more logic
```
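The selection logic sketched above can be completed into a small pure function; the thresholds and the fallback choice are illustrative, not prescribed by the skill:

```python
def recommend_architecture(task_type, n_features, n_samples):
    """Heuristic architecture pick mirroring the rules sketched above."""
    if task_type == 'classification' and n_features > 100:
        return 'deep_neural_network'
    if task_type == 'regression' and n_samples < 10000:
        return 'random_forest'
    return 'gradient_boosting'  # illustrative default for tabular data
```

Keeping the rules in one function makes them easy to audit and extend as more cases are added.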

**Step 2.2: Define Architecture**
```python
import tensorflow as tf

def create_model(input_shape, num_classes, architecture='dnn'):
    if architecture == 'dnn':
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(256, activation='relu', input_shape=input_shape),
            tf.keras.layers.Dropout(0.3),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dropout(0.3),
            tf.keras.layers.Dense(64, activation='relu'),
            tf.keras.layers.Dense(num_classes, activation='softmax')
        ])
    elif architecture == 'cnn':
        model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=input_shape),
            tf.keras.layers.MaxPooling2D((2,2)),
            tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
            tf.keras.layers.MaxPooling2D((2,2)),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dense(num_classes, activation='softmax')
        ])
    # ... more architectures can be added as elif branches
    else:
        raise ValueError(f'Unknown architecture: {architecture}')

    return model

model = create_model(input_shape=(X_train.shape[1],), num_classes=len(np.unique(y)), architecture='dnn')
model.summary()
```

**Step 2.3: Configure Training**
```python
training_config = {
    'optimizer': tf.keras.optimizers.Adam(learning_rate=0.001),
    'loss': 'sparse_categorical_crossentropy',
    'metrics': ['accuracy'],
    'batch_size': 32,
    'epochs': 100,
    'callbacks': [
        tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5),
        tf.keras.callbacks.ModelCheckpoint('best_model.h5', save_best_only=True)
    ]
}

model.compile(
    optimizer=training_config['optimizer'],
    loss=training_config['loss'],
    metrics=training_config['metrics']
)
```
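With `factor=0.5`, each plateau detected by ReduceLROnPlateau halves the learning rate; the resulting decay sequence can be sketched as (helper is illustrative):

```python
def lr_after_reductions(initial_lr, n_reductions, factor=0.5, min_lr=0.0):
    """Learning rate after n plateau-triggered reductions, floored at min_lr."""
    lr = initial_lr
    for _ in range(n_reductions):
        lr = max(lr * factor, min_lr)
    return lr
```

Starting from 0.001, two plateaus bring the rate to 0.00025; setting `min_lr` prevents the rate from shrinking indefinitely.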

### Validation Criteria
- [ ] Task analyzed
- [ ] Architecture selected
- [ ] Model configured
- [ ] Training parameters set

## Phase 3: Train Model (20 min)

### Objective
Execute training with monitoring

### Agent: ML-Developer

**Step 3.1: Start Training**
```python
# Train model (record wall-clock time for the evaluation report)
import time
start_time = time.time()
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    batch_size=training_config['batch_size'],
    epochs=training_config['epochs'],
    callbacks=training_config['callbacks'],
    verbose=1
)
training_time = time.time() - start_time

# Save training history
import json
with open('training_history.json', 'w') as f:
    json.dump({
        'loss': history.history['loss'],
        'val_loss': history.history['val_loss'],
        'accuracy': history.history['accuracy'],
        'val_accuracy': history.history['val_accuracy']
    }, f)
```

**Step 3.2: Monitor Training**
```python
import matplotlib.pyplot as plt

# Plot training curves
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# Loss
ax1.plot(history.history['loss'], label='Train Loss')
ax1.plot(history.history['val_loss'], label='Val Loss')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.legend()
ax1.grid(True)

# Accuracy
ax2.plot(history.history['accuracy'], label='Train Accuracy')
ax2.plot(history.history['val_accuracy'], label='Val Accuracy')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.legend()
ax2.grid(True)

plt.savefig('training_curves.png')
```
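The curves above can also be summarized numerically: a persistent gap between train and validation accuracy over the final epochs is a simple overfitting signal (the threshold is a rule of thumb, not part of the skill):

```python
def overfitting_gap(history, window=5):
    """Mean (train - val) accuracy gap over the last `window` epochs."""
    acc = history['accuracy'][-window:]
    val = history['val_accuracy'][-window:]
    return sum(a - v for a, v in zip(acc, val)) / len(acc)

# A gap persistently above ~0.05 suggests overfitting (illustrative threshold).
```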

**Step 3.3: Distributed Training (Optional)**
```python
# Using Flow-Nexus for distributed training
# (illustrative API; adjust to the actual flow_nexus interface available in your environment)
from flow_nexus import DistributedTrainer

trainer = DistributedTrainer({
    'cluster_id': 'ml-training-cluster',
    'num_nodes': 4,
    'strategy': 'data_parallel'
})

# Train across multiple nodes
trainer.fit(
    model=model,
    train_data=(X_train, y_train),
    val_data=(X_val, y_val),
    config=training_config
)
```

### Validation Criteria
- [ ] Training completed
- [ ] No NaN losses
- [ ] Validation metrics improving
- [ ] Model checkpoints saved

## Phase 4: Validate Performance (10 min)

### Objective
Evaluate model on test set

### Agent: Tester

**Step 4.1: Evaluate on Test Set**
```python
# Load best model
best_model = tf.keras.models.load_model('best_model.h5')

# Evaluate
test_loss, test_accuracy = best_model.evaluate(X_test, y_test, verbose=0)

# Predictions
y_pred = best_model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)

# Detailed metrics
from sklearn.metrics import classification_report, confusion_matrix

metrics = {
    'test_loss': float(test_loss),
    'test_accuracy': float(test_accuracy),
    'classification_report': classification_report(y_test, y_pred_classes, output_dict=True),
    'confusion_matrix': confusion_matrix(y_test, y_pred_classes).tolist()
}

# Store metrics in agent memory (assumes the agent runtime provides an async `memory` API)
await memory.store('ml-expert/metrics', metrics)
```
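As a sanity check, the reported test accuracy should match the confusion-matrix diagonal divided by the total count; a small helper for confusion matrices stored as nested lists:

```python
def accuracy_from_confusion(cm):
    """Accuracy recomputed from a confusion matrix given as nested lists."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total
```

A mismatch with `test_accuracy` would indicate a label-alignment bug upstream.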

**Step 4.2: Generate Evaluation Report**
```python
# Assumes `recommended_architecture` (Phase 2) and a recorded `training_time` are in scope
report = f"""
# Model Evaluation Report

## Performance Metrics
- Test Loss: {test_loss:.4f}
- Test Accuracy: {test_accuracy:.4f}

## Classification Report
{classification_report(y_test, y_pred_classes)}

## Model Summary
- Architecture: {recommended_architecture}
- Parameters: {model.count_params()}
- Training Time: {training_time} seconds

## Training History
- Best Val Loss: {min(history.history['val_loss']):.4f}
- Best Val Accuracy: {max(history.history['val_accuracy']):.4f}
- Epochs Trained: {len(history.history['loss'])}
"""

with open('evaluation_report.md', 'w') as f:
    f.write(report)
```

### Validation Criteria
- [ ] Test accuracy > target threshold
- [ ] No overfitting detected
- [ ] Metrics documented
- [ ] Report generated

## Phase 5: Deploy to Production (15 min)

### Objective
Package model for deployment

### Agent: ML-Developer

**Step 5.1: Export Model**
```python
# Save in multiple formats
model.save('model.h5')  # legacy Keras HDF5 format
model.save('model_savedmodel')  # TensorFlow SavedModel directory (TF 2.x)
# Convert to TFLite for mobile
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# Save preprocessing pipeline
import joblib
joblib.dump(scaler, 'scaler.pkl')
```

**Step 5.2: Create Deployment Package**
```python
import shutil
import os

# Create deployment directory
os.makedirs('deployment', exist_ok=True)

# Copy necessary files
shutil.copy('model.h5', 'deployment/')
shutil.copy('scaler.pkl', 'deployment/')
shutil.copy('evaluation_report.md', 'deployment/')

# Create inference script
inference_script = '''
import tensorflow as tf
import joblib
import numpy as np

class ModelInference:
    def __init__(self, model_path, scaler_path):
        self.model = tf.keras.models.load_model(model_path)
        self.scaler = joblib.load(scaler_path)

    def predict(self, input_data):
        # Preprocess
        scaled = self.scaler.transform(input_data)
        # Predict
        predictions = self.model.predict(scaled)
        return np.argmax(predictions, axis=1)

# Usage
inference = ModelInference('model.h5', 'scaler.pkl')
result = inference.predict(new_data)
'''

with open('deployment/inference.py', 'w') as f:
    f.write(inference_script)
```

**Step 5.3: Generate Documentation**
```markdown
# Model Deployment Guide

## Files
- `model.h5`: Trained Keras model
- `scaler.pkl`: Preprocessing scaler
- `inference.py`: Inference script

## Usage
\`\`\`python
from inference import ModelInference

model = ModelInference('model.h5', 'scaler.pkl')
predictions = model.predict(new_data)
\`\`\`

## Performance
- Latency: < 50ms per prediction
- Accuracy: ${test_accuracy}
- Model Size: ${model_size}MB

## Requirements
- tensorflow>=2.0
- scikit-learn
- numpy
```
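The `model_size` placeholder in the guide above can be filled in from the exported artifact on disk; a minimal helper (the path is an assumption based on the export step):

```python
import os

def file_size_mb(path):
    """Size of a file on disk in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)

# e.g. file_size_mb('deployment/model.h5')
```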

### Validation Criteria
- [ ] Model exported successfully
- [ ] Deployment package created
- [ ] Inference script tested
- [ ] Documentation complete

## Success Metrics

- Test accuracy meets target
- Training converged
- No overfitting
- Production-ready deployment

## Skill Completion

Outputs:
1. **model.h5**: Trained model file
2. **evaluation_report.md**: Performance metrics
3. **deployment/**: Production package
4. **training_history.json**: Training logs

Complete when model deployed and validated.

Related Skills

vertex-infra-expert (25 stars, from ComeOnOliver/skillshub)
Terraform infrastructure specialist for Vertex AI services and Gemini deployments. Provisions Model Garden, endpoints, vector search, pipelines, and enterprise AI infrastructure. Triggers: "vertex ai terraform", "gemini deployment terraform", "model garden infrastructure", "vertex ai endpoints"

validator-expert (25 stars, from ComeOnOliver/skillshub)
Validate production readiness of Vertex AI Agent Engine deployments across security, monitoring, performance, compliance, and best practices. Generates weighted scores (0-100%) with actionable remediation plans. Use when asked to validate a deployment, run a production readiness check, audit security posture, or verify compliance for Vertex AI agents. Trigger with "validate deployment", "production readiness", "security audit", "compliance check", "is this agent ready for prod", "check my ADK agent", "review before deploy", or "production readiness check". Make sure to use this skill whenever validating ADK agents for Agent Engine.

adapting-transfer-learning-models (25 stars, from ComeOnOliver/skillshub)
This skill automates the adaptation of pre-trained machine learning models using transfer learning techniques. It is triggered when the user requests assistance with fine-tuning a model, adapting a pre-trained model to a new dataset, or performing transfer learning. It analyzes the user's requirements, generates code for adapting the model, includes data validation and error handling, provides performance metrics, and saves artifacts with documentation. Use this skill when you need to leverage existing models for new tasks or datasets, optimizing for performance and efficiency.

training-machine-learning-models (25 stars, from ComeOnOliver/skillshub)
Build and train machine learning models with automated workflows. Analyzes datasets, selects model types (classification, regression), configures parameters, trains with cross-validation, and saves model artifacts. Use when asked to "train model" or "evalua... Trigger with relevant phrases based on skill purpose.

evaluating-machine-learning-models (25 stars, from ComeOnOliver/skillshub)
This skill allows Claude to evaluate machine learning models using a comprehensive suite of metrics. It should be used when the user requests model performance analysis, validation, or testing. Claude can use this skill to assess model accuracy, precision, recall, F1-score, and other relevant metrics. Trigger this skill when the user mentions "evaluate model", "model performance", "testing metrics", "validation results", or requests a comprehensive "model evaluation".

deploying-machine-learning-models (25 stars, from ComeOnOliver/skillshub)
This skill enables Claude to deploy machine learning models to production environments. It automates the deployment workflow, implements best practices for serving models, optimizes performance, and handles potential errors. Use this skill when the user requests to deploy a model, serve a model via an API, or put a trained model into a production environment. The skill is triggered by requests containing terms like "deploy model," "productionize model," "serve model," or "model deployment."

genkit-production-expert (25 stars, from ComeOnOliver/skillshub)
Build production Firebase Genkit applications including RAG systems, multi-step flows, and tool calling for Node.js/Python/Go. Deploy to Firebase Functions or Cloud Run with AI monitoring. Use when asked to "create genkit flow" or "implement RAG". Trigger with relevant phrases based on skill purpose.

genkit-infra-expert (25 stars, from ComeOnOliver/skillshub)
Terraform infrastructure specialist for deploying Genkit applications to production. Provisions Firebase Functions, Cloud Run services, GKE clusters, monitoring, and CI/CD for Genkit AI workflows. Triggers: "deploy genkit terraform", "genkit infrastructure", "firebase functions terraform", "cloud run genkit"

gcp-examples-expert (25 stars, from ComeOnOliver/skillshub)
Generate production-ready Google Cloud code examples from official repositories including ADK samples, Genkit templates, Vertex AI notebooks, and Gemini patterns. Use when asked to "show ADK example" or "provide GCP starter kit". Trigger with relevant phrases based on skill purpose.

explaining-machine-learning-models (25 stars, from ComeOnOliver/skillshub)
This skill enables the AI assistant to provide interpretability and explainability for machine learning models. It is triggered when the user requests explanations for model predictions, insights into feature importance, or help understanding model behavior... Use when appropriate context detected. Trigger with relevant phrases based on skill purpose.

optimizing-deep-learning-models (25 stars, from ComeOnOliver/skillshub)
This skill optimizes deep learning models using various techniques. It is triggered when the user requests improvements to model performance, such as increasing accuracy, reducing training time, or minimizing resource consumption. The skill leverages advanced optimization algorithms like Adam, SGD, and learning rate scheduling. It analyzes the existing model architecture, training data, and performance metrics to identify areas for enhancement. The skill then automatically applies appropriate optimization strategies and generates optimized code. Use this skill when the user mentions "optimize deep learning model", "improve model accuracy", "reduce training time", or "optimize learning rate".

building-classification-models (25 stars, from ComeOnOliver/skillshub)
This skill enables Claude to construct and evaluate classification models using provided datasets or specifications. It leverages the classification-model-builder plugin to automate model creation, optimization, and reporting. Use this skill when the user requests to "build a classifier", "create a classification model", "train a classification model", or needs help with supervised learning tasks involving labeled data. The skill ensures best practices are followed, including data validation, error handling, and performance metric reporting.