tinker

Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.

148 stars

Best use case

tinker is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.

Teams using tinker should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/tinker/SKILL.md --create-dirs "https://raw.githubusercontent.com/sundial-org/skills/main/skills/tinker/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/tinker/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How tinker Compares

Feature / AgenttinkerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Tinker API - LLM Fine-Tuning

## Overview

Tinker is a training API for large language models from Thinking Machines Lab. It provides:

- **Supervised Fine-Tuning (SFT)**: Train models on instruction/completion pairs
- **Reinforcement Learning (RL)**: PPO and policy gradient losses; cookbook patterns include GRPO-like group rollouts/advantage centering
- **Vision-Language Models**: VLM support via Qwen3-VL
- **LoRA Training**: Efficient parameter-efficient fine-tuning

Two abstraction levels:
- **Tinker Cookbook**: High-level patterns with automatic training loops
- **Low-Level API**: Manual control for custom training logic

## Quick Reference

| Topic | Reference |
|-------|-----------|
| Setup & Core Concepts | [Getting Started](references/getting-started.md) |
| API Classes & Types | [API Reference](references/api-reference.md) |
| Supervised Learning | [Supervised Learning](references/supervised-learning.md) |
| RL Training | [Reinforcement Learning](references/reinforcement-learning.md) |
| Loss Functions | [Loss Functions](references/loss-functions.md) |
| Chat Templates | [Rendering](references/rendering.md) |
| Models & LoRA | [Models & LoRA](references/models-and-lora.md) |
| Example Scripts | [Recipes](references/recipes.md) |

## Installation

```bash
pip install tinker tinker-cookbook
export TINKER_API_KEY=your_api_key_here
```

## Minimal Example

```python
import numpy as np
import tinker
from tinker import types

# Create clients
service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-30B-A3B", rank=32
)
tokenizer = training_client.get_tokenizer()

# Prepare data
prompt = "English: hello\nPig Latin:"
completion = " ello-hay\n"
prompt_tokens = tokenizer.encode(prompt, add_special_tokens=True)
completion_tokens = tokenizer.encode(completion, add_special_tokens=False)
tokens = prompt_tokens + completion_tokens
weights = np.array(([0] * len(prompt_tokens)) + ([1] * len(completion_tokens)), dtype=np.float32)
target_tokens = np.array(tokens[1:], dtype=np.int64)

datum = types.Datum(
    model_input=types.ModelInput.from_ints(tokens=tokens[:-1]),
    loss_fn_inputs={
        "target_tokens": target_tokens,
        "weights": weights[1:]
    }
)

# Train
fwdbwd = training_client.forward_backward([datum], "cross_entropy")
optim = training_client.optim_step(types.AdamParams(learning_rate=1e-4))
fwdbwd.result(); optim.result()

# Sample
sampling_client = training_client.save_weights_and_get_sampling_client(name="v1")
result = sampling_client.sample(
    prompt=types.ModelInput.from_ints(tokens=tokenizer.encode("English: world\nPig Latin:", add_special_tokens=True)),
    sampling_params=types.SamplingParams(max_tokens=20),
    num_samples=1
).result()
print(tokenizer.decode(result.sequences[0].tokens))
```

## Common Imports

```python
# Low-level API
import tinker
from tinker import types
from tinker.types import Datum, ModelInput, TensorData, AdamParams, SamplingParams

# Cookbook (high-level)
import chz
import asyncio
from tinker_cookbook.supervised import train
from tinker_cookbook.supervised.types import ChatDatasetBuilder, ChatDatasetBuilderCommonConfig
from tinker_cookbook.supervised.data import (
    SupervisedDatasetFromHFDataset,
    StreamingSupervisedDatasetFromHFDataset,
    FromConversationFileBuilder,
    conversation_to_datum,
)
from tinker_cookbook.renderers import get_renderer, TrainOnWhat
from tinker_cookbook.model_info import get_recommended_renderer_name
from tinker_cookbook.tokenizer_utils import get_tokenizer
```

## When to Use What

| Scenario | Approach |
|----------|----------|
| Standard SFT with HF/JSONL data | Cookbook `ChatDatasetBuilder` + `tinker_cookbook.supervised.train.main()` |
| Custom preprocessing | Custom `SupervisedDataset` class |
| Large datasets (>1M) | `StreamingSupervisedDatasetFromHFDataset` |
| RL / GRPO | Cookbook RL patterns |
| Research / custom loops | Low-level `forward_backward()` + `optim_step()` |
| Vision-language | Qwen3-VL + `ImageChunk` |

## External Resources

- Documentation: https://tinker-docs.thinkingmachines.ai/
- Cookbook Repo: https://github.com/thinking-machines-lab/tinker-cookbook
- Console: https://tinker-console.thinkingmachines.ai

Related Skills

tinker-training-cost

148
from sundial-org/skills

Calculate training costs for Tinker fine-tuning jobs. Use when estimating costs for Tinker LLM training, counting tokens in datasets, or comparing Tinker model training prices. Tokenizes datasets using the correct model tokenizer and provides accurate cost estimates.

cs448b-visualization

148
from sundial-org/skills

Data visualization design based on Stanford CS448B. Use for: (1) choosing chart types, (2) selecting visual encodings, (3) critiquing visualizations, (4) building D3.js visualizations, (5) designing interactions/animations, (6) choosing colors, (7) visualizing networks, (8) visualizing text. Covers Bertin, Mackinlay, Cleveland & McGill.

training-data-curation

148
from sundial-org/skills

Guidelines for creating high-quality datasets for LLM post-training (SFT/DPO/RLHF). Use when preparing data for fine-tuning, evaluating data quality, or designing data collection strategies.

skill

148
from sundial-org/skills

Find, install, create, improve, and publish AI agent skills through the Sundial ecosystem. Use when the user wants to find or search for skills, install a skill, create a new skill, improve or evaluate an existing skill, or publish a skill to Sundial Hub. Trigger phrases include "find a skill", "install skill", "create a skill", "make a skill", "improve this skill", "evaluate skill", "publish skill", "push skill", "search for skills".

skill-to-card

148
from sundial-org/skills

End-to-end workflow that creates a skill from a description and attached files, publishes it to Sundial as a private skill, generates a trading card (front + back with QR code), and sends it to a printer. Use when the user wants to create a skill and get a printed trading card, or says "skill to card", "create and print a skill card", "make me a skill with a card".

project-referee

148
from sundial-org/skills

Critiques ML conference papers with reviewer-style feedback. Use when users want to anticipate reviewer concerns, identify weaknesses, check claim-evidence gaps, or find missing citations.

neuro-symbolic-reasoning

148
from sundial-org/skills

Neuro-symbolic AI combining LLMs with symbolic solvers. Use when exploring neuro-symbolic approaches (ideation, no code) or implementing solver integrations (code).

icml-reviewer

148
from sundial-org/skills

Paper reviewer that evaluates machine learning research projects following official ICML reviewer guidelines. Provides comprehensive reviews with actionable feedback across all key dimensions: claims/evidence, relation to prior work, originality, significance, clarity, and reproducibility. Also provides formative feedback on incomplete drafts, proposals, and research code repositories. MANDATORY TRIGGERS: review paper, ICML review, paper review, evaluate paper, research paper feedback, ML paper review, conference review, academic review, paper critique, NeurIPS review, ICLR review, project proposal, research proposal, paper draft, early feedback, incomplete paper, work in progress, WIP review, review repo, review codebase, research project review

cs-research-methodology

148
from sundial-org/skills

Conduct a literature review and develop a CS research proposal. Use when asked to review a research area, find gaps in existing work, and propose a novel research contribution. The output is a research proposal identifying an assumption to challenge (the "bit flip") and how to validate it.

commit-splitter

148
from sundial-org/skills

Split large sets of uncommitted changes into logical, well-organized commits. Use when the user has many uncommitted changes and wants structured commits, or proactively suggest when detecting a large diff that would benefit from splitting.

codex

148
from sundial-org/skills

Run OpenAI's Codex CLI agent in non-interactive mode using `codex exec`. Use when delegating coding tasks to Codex, running Codex in scripts/automation, or when needing a second agent to work on a task in parallel.

ai-co-scientist

148
from sundial-org/skills

Transform Claude Code into an AI Scientist that orchestrates research workflows using tree-based hypothesis exploration. Triggers on "research project", "scientific experiment", "run experiments", "AI scientist", "tree search experimentation", "systematic study".