# ClearML — Open-Source ML Operations

## Overview

ClearML — Open-Source ML Operations is best used when you need a repeatable AI agent workflow instead of a one-off prompt. Teams using it should expect more consistent output, faster repeated execution, and less prompt rewriting.

## When to use this skill

- You want a reusable workflow that can be run more than once with consistent structure.

## When not to use this skill

- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.

## Installation

### Claude Code / Cursor / Codex

```bash
curl -o ~/.claude/skills/clearml/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/clearml/SKILL.md"
```

## Manual Installation

1. Download SKILL.md from GitHub.
2. Place it at .claude/skills/clearml/SKILL.md inside your project.
3. Restart your AI agent — it will auto-discover the skill.
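
If you prefer to run the steps yourself, here is a shell equivalent (a minimal sketch, assuming a project-local .claude directory):

```bash
# Shell equivalent of the manual steps above
mkdir -p .claude/skills/clearml
curl -o .claude/skills/clearml/SKILL.md \
  "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/clearml/SKILL.md"
```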

## How ClearML — Open-Source ML Operations Compares

| Feature / Agent | ClearML — Open-Source ML Operations | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

## Frequently Asked Questions

### What does this skill do?

It gives the agent ready-made patterns for setting up ClearML: experiment tracking, pipeline orchestration, dataset versioning, remote execution, and hyperparameter optimization (see the SKILL.md source below).

### Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

## SKILL.md Source

# ClearML — Open-Source ML Operations


## Overview


ClearML is the open-source MLOps platform for experiment tracking, pipeline orchestration, data management, and model deployment. It helps developers set up ML experiment tracking with minimal code, build reproducible pipelines, and manage the full ML lifecycle from training to serving.


## Instructions

### Experiment Tracking (Two Lines of Code)

```python
# train.py — Automatic experiment tracking
from clearml import Task

# Just these two lines auto-capture everything:
# - Git repo, branch, and diff
# - All installed packages
# - CLI arguments
# - stdout/stderr
# - Framework metrics (PyTorch, TensorFlow, scikit-learn)
task = Task.init(project_name="NLP", task_name="sentiment-classifier-v2")

# All print statements, matplotlib plots, and framework metrics
# are automatically captured — zero additional code needed

import torch
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=3)

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    evaluation_strategy="epoch",
    logging_steps=50,
    # ClearML auto-captures all these parameters
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,   # assumes train/eval datasets were prepared earlier (not shown)
    eval_dataset=eval_dataset,
)

# Training metrics automatically logged to ClearML dashboard
trainer.train()

# Explicitly log additional data
task.get_logger().report_scalar("custom", "metric", value=0.95, iteration=100)
task.upload_artifact("model_weights", artifact_object="./results/pytorch_model.bin")
```
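
Everything a run captures can be read back through the same SDK. A minimal sketch, assuming the project and task names from the example above:

```python
# inspect_run.py - fetch a finished task and read back what was captured
from clearml import Task

t = Task.get_task(project_name="NLP", task_name="sentiment-classifier-v2")
print(t.get_last_scalar_metrics())   # nested dict: {title: {series: {"last": value, ...}}}
print(t.get_parameters())            # captured hyperparameters / CLI arguments
```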

### Pipeline Orchestration

```python
# pipeline.py — ML pipeline with ClearML
from clearml import PipelineController, Task

pipe = PipelineController(
    name="Training Pipeline",
    project="NLP",
    version="1.0",
)

# Step 1: Data preprocessing
pipe.add_step(
    name="preprocess",
    base_task_project="NLP",
    base_task_name="data-preprocess",       # Reference an existing task template
    parameter_override={
        "General/dataset_version": "v2.1",
        "General/max_samples": 50000,
    },
)

# Step 2: Training (depends on preprocessing)
pipe.add_step(
    name="train",
    parents=["preprocess"],
    base_task_project="NLP",
    base_task_name="train-model",
    parameter_override={
        "General/epochs": 5,
        "General/learning_rate": "${preprocess.learning_rate}",  # Reference parent output
    },
)

# Step 3: Evaluation
pipe.add_step(
    name="evaluate",
    parents=["train"],
    base_task_project="NLP",
    base_task_name="evaluate-model",
)

# Step 4: Deploy only if metrics meet the threshold.
# A pre_execute_callback that returns False skips the step.
def deploy_gate(pipeline, node, params):
    eval_node = pipeline.get_pipeline_dag()["evaluate"]
    eval_task = Task.get_task(task_id=eval_node.executed)
    metrics = eval_task.get_last_scalar_metrics()
    # Only deploy if accuracy > 0.9 (metric title/series assumed to match the evaluate task)
    return metrics["eval"]["accuracy"]["last"] > 0.9

pipe.add_step(
    name="deploy",
    parents=["evaluate"],
    base_task_project="NLP",
    base_task_name="deploy-model",
    pre_execute_callback=deploy_gate,
)

# Run the pipeline
pipe.start()
```
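
Steps can also be plain Python functions instead of pre-existing task templates. A minimal sketch using `add_function_step` (the function bodies and return names are illustrative):

```python
# pipeline_functions.py - pipeline steps defined as functions (illustrative sketch)
from clearml import PipelineController

def preprocess(dataset_version: str) -> str:
    # Runs as its own ClearML task when the pipeline executes
    print(f"preprocessing {dataset_version}")
    return "/tmp/processed"

def train(data_path: str) -> float:
    print(f"training on {data_path}")
    return 0.93

pipe = PipelineController(name="Function Pipeline", project="NLP", version="1.0")
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs={"dataset_version": "v2.1"},
    function_return=["data_path"],
)
pipe.add_function_step(
    name="train",
    parents=["preprocess"],
    function=train,
    function_kwargs={"data_path": "${preprocess.data_path}"},  # wire in a parent's return value
    function_return=["accuracy"],
)
pipe.start_locally(run_pipeline_steps_locally=True)  # or pipe.start() to run on agents
```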

### Data Management

```python
# data_versioning.py — Version and manage datasets
from clearml import Dataset

# Create a versioned dataset
dataset = Dataset.create(
    dataset_name="customer-reviews-v2",
    dataset_project="NLP",
    description="Customer reviews with sentiment labels, cleaned and deduplicated",
)

# Add files
dataset.add_files(path="./data/reviews.parquet")
dataset.add_files(path="./data/labels.csv")

# Upload and finalize (creates immutable version)
dataset.upload()
dataset.finalize()
print(f"Dataset ID: {dataset.id}")

# Use the dataset in training
dataset = Dataset.get(
    dataset_name="customer-reviews-v2",
    dataset_project="NLP",
)
local_path = dataset.get_local_copy()    # Downloads and caches locally
# local_path now points to a directory with reviews.parquet and labels.csv

# Create a new version (inherits from parent)
new_version = Dataset.create(
    dataset_name="customer-reviews-v3",
    dataset_project="NLP",
    parent_datasets=[dataset.id],         # Inherits files from v2
)
new_version.add_files("./data/new_reviews.parquet")  # Add new data
new_version.remove_files("labels.csv")               # Remove the v2 labels file inherited from the parent
new_version.upload()
new_version.finalize()
```
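
The same versioning flow is available from the command line via the `clearml-data` CLI. A minimal sketch mirroring the dataset created above:

```bash
# CLI equivalent of the dataset-creation flow above
clearml-data create --project NLP --name customer-reviews-v2
clearml-data add --files ./data/reviews.parquet ./data/labels.csv
clearml-data close   # uploads and finalizes the version
```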

### Remote Execution (ClearML Agent)

```python
# Run any task on remote machines with ClearML Agent
from clearml import Task

task = Task.init(project_name="NLP", task_name="train-large-model")

# This task was created locally, but we can clone and run it remotely
task.execute_remotely(queue_name="gpu-queue")

# Everything after this line runs on the remote machine
# ClearML Agent handles:
# - Setting up the environment (pip install, git clone)
# - Downloading datasets
# - Running the code
# - Uploading results and artifacts
```

```bash
# Start a ClearML Agent on a GPU machine
clearml-agent daemon --queue gpu-queue --gpus 0

# Or with Docker isolation
clearml-agent daemon --queue gpu-queue --docker --gpus all
```
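
An already-tracked task can also be cloned and enqueued programmatically instead of re-running the script. A minimal sketch, assuming the task and queue from the examples above:

```python
# rerun_remote.py - clone a tracked task and send the copy to a queue
from clearml import Task

template = Task.get_task(project_name="NLP", task_name="train-large-model")
cloned = Task.clone(source_task=template, name="train-large-model (rerun)")
Task.enqueue(cloned, queue_name="gpu-queue")   # an agent serving gpu-queue picks it up
```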

### Hyperparameter Optimization

```python
# hpo.py — Automated hyperparameter search
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformParameterRange, DiscreteParameterRange
from clearml.automation.hpbandster import OptimizerBOHB   # requires: pip install hpbandster

# The search itself is tracked as a controller task
task = Task.init(project_name="NLP", task_name="hpo-controller", task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id="<template-task-id>",     # Task to optimize
    hyper_parameters=[
        UniformParameterRange("General/learning_rate", min_value=1e-5, max_value=1e-3),
        UniformParameterRange("General/weight_decay", min_value=0, max_value=0.1),
        DiscreteParameterRange("General/batch_size", values=[16, 32, 64]),
        DiscreteParameterRange("General/epochs", values=[3, 5, 10]),
    ],
    objective_metric_title="eval",
    objective_metric_series="f1",
    objective_metric_sign="max",            # Maximize F1 score
    max_number_of_concurrent_tasks=4,
    optimizer_class=OptimizerBOHB,          # Bayesian optimization
    execution_queue="gpu-queue",
    total_max_jobs=50,
)

optimizer.start()
optimizer.wait()

# Get the best configuration
best = optimizer.get_top_experiments(top_k=1)[0]
metrics = best.get_last_scalar_metrics()
print(f"Best F1: {metrics['eval']['f1']['last']}")
print(f"Best params: {best.get_parameters()}")
```

## Installation

```bash
# Python SDK
pip install clearml

# Configure (interactive — sets API credentials)
clearml-init

# Self-hosted server (Docker Compose, using the compose file from the clearml-server repo)
docker compose -f docker-compose.yml up -d
# Dashboard at http://localhost:8080

# Or use ClearML Cloud (free tier available)
# https://app.clear.ml
```
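
`clearml-init` writes credentials to `~/clearml.conf`. A minimal sketch of the relevant section (keys are placeholders; point the server URLs at your own host when self-hosting):

```
api {
    web_server: https://app.clear.ml
    api_server: https://api.clear.ml
    files_server: https://files.clear.ml
    credentials {
        "access_key" = "<your-access-key>"
        "secret_key" = "<your-secret-key>"
    }
}
```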


## Examples


### Example 1: Setting up an evaluation pipeline for a RAG application

**User request:**

```
I have a RAG chatbot that answers questions from our docs. Set up ClearML to evaluate answer quality.
```

The agent creates an evaluation suite with appropriate metrics (faithfulness, relevance, answer correctness), configures test datasets from real user questions, runs baseline evaluations, and sets up CI integration so evaluations run on every prompt or retrieval change.

### Example 2: Comparing model performance across prompts

**User request:**

```
We're testing GPT-4o vs Claude on our customer support prompts. Set up a comparison with ClearML.
```

The agent creates a structured experiment with the existing prompt set, configures both model providers, defines scoring criteria specific to customer support (accuracy, tone, completeness), runs the comparison, and generates a summary report with statistical significance indicators.


## Guidelines

1. **Two lines to start** — `Task.init()` auto-captures everything; add explicit logging only for custom metrics
2. **Use dataset versioning** — Version your training data alongside code; reproducibility requires both
3. **Remote execution for GPU work** — Develop locally, run on GPU machines with `execute_remotely()`; no SSH needed
4. **Pipeline for reproducibility** — Define training pipelines as code; each run is fully reproducible with tracked inputs/outputs
5. **Queue-based execution** — Use queues to route tasks to appropriate hardware (CPU queue, GPU queue, high-memory queue)
6. **HPO with Bayesian optimization** — Use BOHB optimizer for efficient hyperparameter search; better than grid/random search
7. **Self-host for privacy** — Run the ClearML server on your own infrastructure; all data stays in your network
8. **Compare experiments in dashboard** — Use the web UI to overlay training curves, compare hyperparameters, and identify winners

## Related Skills

All from ComeOnOliver/skillshub:

- **tracking-resource-usage**: Track and optimize resource usage across the application stack, including CPU, memory, disk, and network I/O. Use when identifying bottlenecks or optimizing costs. Trigger with phrases like "track resource usage", "monitor CPU and memory", or "optimize resource allocation".
- **openapi-spec-generator**: Auto-activating skill for API development. Triggers on "openapi spec generator". Part of the API Development skill category.
- **open-graph-creator**: Auto-activating skill for frontend development. Triggers on "open graph creator". Part of the Frontend Development skill category.
- **gpu-resource-optimizer**: Auto-activating skill for ML deployment. Triggers on "gpu resource optimizer". Part of the ML Deployment skill category.
- **firestore-operations-manager**: Manage Firebase/Firestore operations including CRUD, queries, batch processing, and index/rule guidance. Use when you need to create/update/query Firestore documents, run batch writes, troubleshoot missing indexes, or plan migrations. Trigger with phrases like "firestore operations", "create firestore document", "batch write", "missing index", or "fix firestore query".
- **provider-resources**: Implement Terraform Provider resources and data sources using the Plugin Framework. Use when developing CRUD operations, schema design, state management, and acceptance testing for provider resources.
- **typespec-api-operations**: Add GET, POST, PATCH, and DELETE operations to a TypeSpec API plugin with proper routing, parameters, and adaptive cards.
- **openapi-to-application-code**: Generate a complete, production-ready application from an OpenAPI specification.
- **azure-resource-health-diagnose**: Analyze Azure resource health, diagnose issues from logs and telemetry, and create a remediation plan for identified problems.
- **aspnet-minimal-api-openapi**: Create ASP.NET Minimal API endpoints with proper OpenAPI documentation.
- **opencode-learn**: Extracts actionable knowledge from external sources and enhances existing skills using a 4-tier novelty framework. Use PROACTIVELY when a user says "/learn <source>", provides documentation URLs, code examples, or explicitly asks to extract patterns from a repository or marketplace.
- **OpenAI Whisper API (curl)**: Transcribe an audio file via OpenAI's `/v1/audio/transcriptions` endpoint.