hugging-face-trackio

Track ML experiments with Trackio using Python logging, alerts, and CLI metric retrieval.

31,392 stars
Complexity: medium

About this skill

Hugging Face Trackio is an experiment tracking library for logging and visualizing ML training metrics. It syncs to Hugging Face Spaces, which host real-time dashboards for those metrics. With this skill, an agent can use Trackio's Python API to log metrics during training and to fire alerts from training code for diagnostic conditions (e.g., loss spikes, NaN gradients), then use the CLI to retrieve metrics and alerts during or after training. This lets an agent monitor its own training runs and react to critical events without constant human supervision.

Best use case

An AI agent needing to self-monitor the training progress of its own models; an agent managing a suite of ML experiments and requiring automated alerts for anomalies or specific performance thresholds; an agent tasked with generating reports on past experiment outcomes and needing programmatic access to historical data.

The AI agent will successfully log training metrics, receive automated alerts based on predefined conditions, and be able to retrieve comprehensive experiment data. Users can expect improved visibility into ML model training, proactive issue detection, and streamlined data analysis through Trackio's integration with Hugging Face Spaces dashboards.

Practical example

Example input

```json
{
  "tool_code": "trackio.log({'validation_accuracy': 0.92, 'step': 15})"
}
```

_Agent Instruction: "Configure an alert to fire if the training loss exceeds 0.5 for more than 3 consecutive steps."_

```json
{
  "tool_code": "trackio.alert(title='Training loss above 0.5', text='training_loss exceeded 0.5 for more than 3 consecutive steps', level=trackio.AlertLevel.ERROR)"
}
```

_Agent Instruction: "Retrieve the average F1-score for my latest experiment."_

```json
{
  "tool_code": "trackio get metric --project my-project --run my_latest_model_run --metric f1_score --json"
}
```

Example output

```json
{
  "status": "success",
  "message": "Metric 'validation_accuracy' logged successfully to Trackio for step 15."
}
```

```json
{
  "status": "success",
  "message": "Alert for 'training_loss' configured successfully."
}
```

```json
{
  "metrics": [
    {
      "experiment_id": "my_latest_model_run",
      "metric_name": "f1_score",
      "value": 0.885,
      "aggregate_type": "average"
    }
  ],
  "status": "success"
}
```
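
The alert instruction in the example input ("more than 3 consecutive steps") implies some state kept inside the training loop, since `trackio.alert()` fires an alert at the moment it is called. Below is a minimal sketch of one way to track such a streak. The `StreakMonitor` class and its field names are hypothetical helpers written for this illustration, not part of Trackio.

```python
from dataclasses import dataclass


@dataclass
class StreakMonitor:
    """Hypothetical helper: counts consecutive values above a threshold."""

    threshold: float = 0.5
    max_streak: int = 3  # signal once the streak is longer than this
    streak: int = 0

    def update(self, value: float) -> bool:
        """Record one value; return True while the streak exceeds max_streak."""
        if value > self.threshold:
            self.streak += 1
        else:
            self.streak = 0  # any value at or below the threshold resets the count
        return self.streak > self.max_streak


monitor = StreakMonitor(threshold=0.5, max_streak=3)
losses = [0.9, 0.7, 0.4, 0.6, 0.6, 0.6, 0.6, 0.3]
fired_at = [step for step, loss in enumerate(losses) if monitor.update(loss)]
print(fired_at)  # prints [6]
```

In a training script, the `if monitor.update(loss):` branch would be the place to call `trackio.alert(title=..., level=trackio.AlertLevel.ERROR)`.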

When to use this skill

  • Use this skill when an AI agent is involved in the training, fine-tuning, or deployment of Machine Learning models and requires robust experiment tracking. It's ideal when real-time visibility into metrics, automated alerting for critical events, and structured retrieval of experiment data are essential.

When not to use this skill

  • Do not use this skill for tasks unrelated to Machine Learning experiment tracking, or when a simpler logging mechanism is sufficient. It is also unsuitable if the AI agent has no access to the necessary Python environment or, for remote dashboards, to Hugging Face Spaces.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/hugging-face-trackio/SKILL.md --create-dirs "https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/plugins/antigravity-awesome-skills-claude/skills/hugging-face-trackio/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/hugging-face-trackio/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How hugging-face-trackio Compares

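| Feature / Agent | hugging-face-trackio | Standard Approach |
|-----------------|----------------------|-------------------|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | medium | N/A |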

Frequently Asked Questions

What does this skill do?

Track ML experiments with Trackio using Python logging, alerts, and CLI metric retrieval.

Which AI agents support this skill?

This skill is designed for Claude.

How difficult is it to install?

The installation complexity is rated as medium. You can find the installation instructions above.

Where can I find the source code?

The source code is hosted on GitHub in the sickn33/antigravity-awesome-skills repository.

SKILL.md Source

# Trackio - Experiment Tracking for ML Training

Trackio is an experiment tracking library for logging and visualizing ML training metrics. It syncs to Hugging Face Spaces for real-time monitoring dashboards.

## Three Interfaces

| Task | Interface | Reference |
|------|-----------|-----------|
| **Logging metrics** during training | Python API | [references/logging_metrics.md](references/logging_metrics.md) |
| **Firing alerts** for training diagnostics | Python API | [references/alerts.md](references/alerts.md) |
| **Retrieving metrics & alerts** after/during training | CLI | [references/retrieving_metrics.md](references/retrieving_metrics.md) |

## When to Use Each

### Python API → Logging

Use `import trackio` in your training scripts to log metrics:

- Initialize tracking with `trackio.init()`
- Log metrics with `trackio.log()` or use TRL's `report_to="trackio"`
- Finalize with `trackio.finish()`

**Key concept**: For remote/cloud training, pass `space_id` — metrics sync to a Space dashboard so they persist after the instance terminates.

→ See [references/logging_metrics.md](references/logging_metrics.md) for setup, TRL integration, and configuration options.
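
The init/log/finish lifecycle above can be wrapped so that `finish()` runs even if training raises. This is a minimal sketch, not Trackio's own pattern: `track_run` and `RecordingTracker` are hypothetical names, and the tracker is passed in as an argument so the example runs without Trackio installed. Only the `init(project=..., space_id=...)`, `log(dict)`, and `finish()` calls shown in this document are assumed.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def track_run(
    tracker: Any,
    project: str,
    metrics_stream: Iterable[Dict[str, float]],
    space_id: Optional[str] = None,
) -> int:
    """Log every metrics dict from the stream and always call finish()."""
    init_kwargs: Dict[str, str] = {"project": project}
    if space_id is not None:
        init_kwargs["space_id"] = space_id  # only passed for remote/cloud runs
    tracker.init(**init_kwargs)
    logged = 0
    try:
        for metrics in metrics_stream:
            tracker.log(metrics)
            logged += 1
    finally:
        tracker.finish()  # runs even if the loop above raises
    return logged


class RecordingTracker:
    """Stand-in exposing the same three calls; records them instead of logging."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Any]] = []

    def init(self, **kwargs: Any) -> None:
        self.calls.append(("init", kwargs))

    def log(self, metrics: Dict[str, float]) -> None:
        self.calls.append(("log", metrics))

    def finish(self) -> None:
        self.calls.append(("finish", None))


tracker = RecordingTracker()
count = track_run(tracker, "my-project", [{"loss": 0.1}, {"loss": 0.09}])
print(count, [name for name, _ in tracker.calls])
# prints 2 ['init', 'log', 'log', 'finish']
```

With Trackio installed, the module itself can be passed as the tracker, e.g. `track_run(trackio, "my-project", stream, space_id="username/trackio")`.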

### Python API → Alerts

Insert `trackio.alert()` calls in training code to flag important events — like inserting print statements for debugging, but structured and queryable:

- `trackio.alert(title="...", level=trackio.AlertLevel.WARN)` — fire an alert
- Three severity levels: `INFO`, `WARN`, `ERROR`
- Alerts are printed to terminal, stored in the database, shown in the dashboard, and optionally sent to webhooks (Slack/Discord)

**Key concept for LLM agents**: Alerts are the primary mechanism for autonomous experiment iteration. An agent should insert alerts into training code for diagnostic conditions (loss spikes, NaN gradients, low accuracy, training stalls). Since alerts are printed to the terminal, an agent that is watching the training script's output will see them automatically. For background or detached runs, the agent can poll via CLI instead.

→ See [references/alerts.md](references/alerts.md) for the full alerts API, webhook setup, and autonomous agent workflows.
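
As an illustration of the diagnostic conditions mentioned above, the sketch below separates deciding whether to alert from firing the alert. Everything in it is an assumption made for this example: the `AlertSpec` tuple, the `diagnose` function, and the threshold defaults are hypothetical and not part of Trackio. Only the three level names (`INFO`, `WARN`, `ERROR`) come from the text.

```python
import math
from typing import List, NamedTuple, Sequence


class AlertSpec(NamedTuple):
    """Plain description of an alert; level is "INFO", "WARN", or "ERROR"."""

    title: str
    text: str
    level: str


def diagnose(
    losses: Sequence[float],
    spike_factor: float = 3.0,
    stall_window: int = 5,
    stall_tol: float = 1e-4,
) -> List[AlertSpec]:
    """Inspect the loss history and describe any alerts worth firing."""
    if not losses:
        return []
    latest = losses[-1]
    step = len(losses) - 1
    if not math.isfinite(latest):
        # NaN or infinity: nothing else is worth checking.
        return [AlertSpec("Non-finite loss", f"Loss is {latest} at step {step}", "ERROR")]

    specs: List[AlertSpec] = []
    history = list(losses[:-1])
    if history:
        baseline = sum(history) / len(history)
        if latest > spike_factor * baseline:
            specs.append(
                AlertSpec("Loss spike", f"Loss {latest:.4f} vs. mean {baseline:.4f}", "WARN")
            )
    recent = list(losses[-stall_window:])
    if len(recent) == stall_window and max(recent) - min(recent) < stall_tol:
        specs.append(
            AlertSpec("Training stall", f"Loss flat over the last {stall_window} steps", "WARN")
        )
    return specs


print([s.title for s in diagnose([1.0, 0.9, 0.8, 4.0])])   # prints ['Loss spike']
print([s.title for s in diagnose([0.5] * 6)])              # prints ['Training stall']
print([s.level for s in diagnose([1.0, float("nan")])])    # prints ['ERROR']
```

In a training loop, each returned spec could be forwarded with `trackio.alert(title=spec.title, text=spec.text, level=getattr(trackio.AlertLevel, spec.level))`.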

### CLI → Retrieving

Use the `trackio` command to query logged metrics and alerts:

- `trackio list projects/runs/metrics` — discover what's available
- `trackio get project/run/metric` — retrieve summaries and values
- `trackio list alerts --project <name> --json` — retrieve alerts
- `trackio show` — launch the dashboard
- `trackio sync` — sync to HF Space

**Key concept**: Add `--json` for programmatic output suitable for automation and LLM agents.

→ See [references/retrieving_metrics.md](references/retrieving_metrics.md) for all commands, workflows, and JSON output formats.
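
For automation, the CLI commands above can be driven from Python. The sketch below is one possible wrapper, assuming that a command run with `--json` prints a single JSON document to standard output. `build_command` and `run_json` are hypothetical helper names, and the payload returned by `fake_runner` is made up for the demonstration; it does not reflect Trackio's actual output schema.

```python
import json
import subprocess
from typing import Any, Callable, List, Optional, Sequence


def build_command(args: Sequence[str]) -> List[str]:
    """Prefix the CLI name and make sure --json is present exactly once."""
    command = ["trackio", *args]
    if "--json" not in command:
        command.append("--json")
    return command


def run_json(
    args: Sequence[str],
    runner: Optional[Callable[[List[str]], str]] = None,
) -> Any:
    """Run a trackio subcommand and parse its standard output as JSON."""
    command = build_command(args)
    if runner is None:
        # Real execution path: requires the trackio CLI on PATH.
        completed = subprocess.run(command, capture_output=True, text=True, check=True)
        stdout = completed.stdout
    else:
        stdout = runner(command)  # injected for testing or dry runs
    return json.loads(stdout)


def fake_runner(command: List[str]) -> str:
    # Made-up payload for demonstration only; not Trackio's output format.
    return json.dumps({"ran": command})


result = run_json(["list", "projects"], runner=fake_runner)
print(result["ran"])  # prints ['trackio', 'list', 'projects', '--json']
```

Calling `run_json(["get", "metric", "--project", "my-project", "--run", "my-run", "--metric", "loss"])` without a runner would invoke the real CLI.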

## Minimal Logging Setup

```python
import trackio

trackio.init(project="my-project", space_id="username/trackio")
trackio.log({"loss": 0.1, "accuracy": 0.9})
trackio.log({"loss": 0.09, "accuracy": 0.91})
trackio.finish()
```

## Minimal Retrieval

```bash
trackio list projects --json
trackio get metric --project my-project --run my-run --metric loss --json
```

## Autonomous ML Experiment Workflow

When running experiments autonomously as an LLM agent, the recommended workflow is:

1. **Set up training with alerts** — insert `trackio.alert()` calls for diagnostic conditions
2. **Launch training** — run the script in the background
3. **Poll for alerts** — use `trackio list alerts --project <name> --json --since <timestamp>` to check for new alerts
4. **Read metrics** — use `trackio get metric ...` to inspect specific values
5. **Iterate** — based on alerts and metrics, stop the run, adjust hyperparameters, and launch a new run

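```python
import random

import trackio

# Placeholders so the example runs end to end; replace with real training code.
num_steps = 1000


def train_step() -> float:
    """Stand-in for one optimization step; returns that step's loss."""
    return random.random()


trackio.init(project="my-project", config={"lr": 1e-4})

for step in range(num_steps):
    loss = train_step()
    trackio.log({"loss": loss, "step": step})

    # Loss still high well into training: likely divergence.
    if step > 100 and loss > 5.0:
        trackio.alert(
            title="Loss divergence",
            text=f"Loss {loss:.4f} still high after {step} steps",
            level=trackio.AlertLevel.ERROR,
        )
    # Loss effectively zero: often a sign that training has collapsed.
    if step > 0 and abs(loss) < 1e-8:
        trackio.alert(
            title="Vanishing loss",
            text="Loss near zero: possible gradient collapse",
            level=trackio.AlertLevel.WARN,
        )

trackio.finish()
```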

Then poll from a separate terminal/process:

```bash
trackio list alerts --project my-project --json --since "2025-01-01T00:00:00"
```
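
The polling step can also be scripted. The following is a rough sketch of such a loop under several assumptions: `alerts_command`, `fetch_alerts`, and `poll` are hypothetical helpers, the `--since` value is formatted like the example timestamp above, and treating that timestamp as UTC is a guess that should be checked against the CLI reference. The fetch function is injected so the loop can be exercised without Trackio; the alert payload in the demonstration is made up.

```python
import json
import subprocess
import time
from datetime import datetime, timezone
from typing import Any, Callable, List, Optional


def alerts_command(project: str, since: Optional[str] = None) -> List[str]:
    """Build the `trackio list alerts` command shown above."""
    command = ["trackio", "list", "alerts", "--project", project, "--json"]
    if since is not None:
        command += ["--since", since]
    return command


def fetch_alerts(project: str, since: Optional[str] = None) -> Any:
    """Run the CLI (must be on PATH) and parse its JSON output."""
    completed = subprocess.run(
        alerts_command(project, since), capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


def utc_now() -> str:
    # Assumption: timestamps are UTC, in the same format as the example above.
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def poll(
    fetch: Callable[[Optional[str]], Any],
    should_stop: Callable[[Any], bool],
    interval_s: float = 30.0,
    max_polls: int = 10,
) -> Optional[Any]:
    """Fetch repeatedly; return the first payload for which should_stop is true."""
    since: Optional[str] = None
    for attempt in range(max_polls):
        started = utc_now()  # taken before fetching so no alert falls in a gap
        payload = fetch(since)
        if should_stop(payload):
            return payload
        since = started
        if attempt < max_polls - 1:
            time.sleep(interval_s)
    return None


# Demonstration with a fake fetch function and a made-up alert payload.
responses = iter([[], [], [{"title": "Loss divergence"}]])
seen_since: List[Optional[str]] = []


def fake_fetch(since: Optional[str]) -> Any:
    seen_since.append(since)
    return next(responses)


result = poll(fake_fetch, should_stop=bool, interval_s=0.0)
print(result)  # prints [{'title': 'Loss divergence'}]
```

Against a real project, the fetch argument would be something like `lambda since: fetch_alerts("my-project", since)`, with `should_stop` written for whatever fields the actual JSON output contains.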

Related Skills

  • hugging-face-jobs (31,392 stars, from sickn33/antigravity-awesome-skills): Run workloads on Hugging Face Jobs with managed CPUs, GPUs, TPUs, secrets, and Hub persistence. Tags: Machine Learning, Claude.

  • hugging-face-cli (31,392 stars, from sickn33/antigravity-awesome-skills): Use the Hugging Face Hub CLI (`hf`) to download, upload, and manage models, datasets, and Spaces. Tags: Machine Learning, Claude.

  • hugging-face-vision-trainer (31,392 stars, from sickn33/antigravity-awesome-skills): Train or fine-tune vision models on Hugging Face Jobs for detection, classification, and SAM or SAM2 segmentation. Tags: Computer Vision, Claude.

  • hugging-face-tool-builder (31,392 stars, from sickn33/antigravity-awesome-skills): Your purpose is now to create reusable command line scripts and utilities for using the Hugging Face API, allowing chaining, piping, and intermediate processing where helpful. You can access the API directly, as well as use the hf command line tool. Tags: Developer Tools, Claude.

  • hugging-face-papers (31,392 stars, from sickn33/antigravity-awesome-skills): Read and analyze Hugging Face paper pages or arXiv papers with markdown and papers API metadata. Tags: Text Analysis, Claude.

  • hugging-face-paper-publisher (31,392 stars, from sickn33/antigravity-awesome-skills): Publish and manage research papers on Hugging Face Hub. Supports creating paper pages, linking papers to models/datasets, claiming authorship, and generating professional markdown-based research articles. Tags: AI Research Publishing, Claude.

  • hugging-face-model-trainer (31,392 stars, from sickn33/antigravity-awesome-skills): Train or fine-tune TRL language models on Hugging Face Jobs, including SFT, DPO, GRPO, and GGUF export. Tags: AI Development & Self-Improvement, Claude.

  • hugging-face-evaluation (31,392 stars, from sickn33/antigravity-awesome-skills): Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format. Tags: Model Management, Claude.

  • hugging-face-datasets (31,392 stars, from sickn33/antigravity-awesome-skills): Create and manage datasets on Hugging Face Hub. Supports initializing repos, defining configs/system prompts, streaming row updates, and SQL-based dataset querying/transformation. Designed to work alongside HF MCP server for comprehensive dataset workflows. Tags: Data Management, Claude.

  • hugging-face-dataset-viewer (31,392 stars, from sickn33/antigravity-awesome-skills): Query Hugging Face datasets through the Dataset Viewer API for splits, rows, search, filters, and parquet links. Tags: Data Access & Exploration, Claude.

  • hugging-face-community-evals (31,392 stars, from sickn33/antigravity-awesome-skills): Run local evaluations for Hugging Face Hub models with inspect-ai or lighteval. Tags: Model Evaluation & MLOps, Claude.

  • hugging-face-gradio (31,355 stars, from sickn33/antigravity-awesome-skills): Build or edit Gradio apps, layouts, components, and chat interfaces in Python. Tags: ML Tools, Claude.