Best use case
Replicate is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using Replicate should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/replicate/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
How Replicate Compares
| Feature / Agent | Replicate | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It packages Replicate's API workflows — running models, streaming LLM output, async predictions, webhooks, fine-tuning, and Cog deployment — into a reusable skill that your AI agent can auto-discover and apply consistently.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Replicate
## Installation
```bash
# Install Python client
pip install replicate
# Set API token
export REPLICATE_API_TOKEN="r8_xxxxxxxxxxxx"
```
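The client reads `REPLICATE_API_TOKEN` from the environment. A small fail-fast helper (a sketch, not part of the client; the `r8_` prefix matches the token format shown above) can surface a missing token before the first API call:

```python
# check_token.py — Fail fast if the API token is missing or malformed
import os

def get_replicate_token() -> str:
    """Return the Replicate API token, raising a clear error if it is not set."""
    token = os.environ.get("REPLICATE_API_TOKEN", "")
    if not token.startswith("r8_"):
        raise RuntimeError("REPLICATE_API_TOKEN is missing or malformed (expected r8_ prefix)")
    return token
```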
## Run a Model
```python
# run_model.py — Run a model and get the output
import replicate

# Run Stable Diffusion XL
output = replicate.run(
    "stability-ai/sdxl:7762fd07cf82c948538e41f63f77d685e02b063e37e496e96eefd46c929f9bdc",
    input={
        "prompt": "A futuristic cityscape at sunset, digital art",
        "negative_prompt": "blurry, low quality",
        "width": 1024,
        "height": 1024,
        "num_outputs": 1,
    },
)

# Output is a list of URLs
for url in output:
    print(url)
```
## Run Language Models
```python
# run_llm.py — Run open-source LLMs via Replicate
import replicate

# Run Llama with streaming
for event in replicate.stream(
    "meta/llama-2-70b-chat",
    input={
        "prompt": "Explain machine learning to a 5-year-old",
        "system_prompt": "You are a friendly teacher.",
        "max_new_tokens": 500,
        "temperature": 0.7,
    },
):
    print(str(event), end="", flush=True)
```
## Async Predictions
```python
# async_prediction.py — Submit a prediction and poll for results
import replicate
import time

# Create prediction without waiting
prediction = replicate.predictions.create(
    model="stability-ai/sdxl",
    input={"prompt": "A cat in space"},
)
print(f"Prediction ID: {prediction.id}")
print(f"Status: {prediction.status}")

# Poll for completion
while prediction.status not in ("succeeded", "failed", "canceled"):
    time.sleep(2)
    prediction.reload()
    print(f"Status: {prediction.status}")

if prediction.status == "succeeded":
    print(f"Output: {prediction.output}")
```
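The loop above polls forever if a prediction hangs. A bounded variant, sketched with injected callables so it stays easy to test (`reload` and `get_status` stand in for `prediction.reload` and reading `prediction.status`):

```python
# wait_bounded.py — Poll with a timeout instead of looping indefinitely (sketch)
import time

def wait_for_prediction(reload, get_status, timeout_s=300.0, poll_s=2.0):
    """Poll until the prediction reaches a terminal state or the timeout expires."""
    terminal = ("succeeded", "failed", "canceled")
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction still {status!r} after {timeout_s}s")
        time.sleep(poll_s)
        reload()
```

For a real prediction this would be called as `wait_for_prediction(prediction.reload, lambda: prediction.status)`.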
## Webhooks
```python
# webhook_prediction.py — Get notified when a prediction completes via webhook
import replicate

prediction = replicate.predictions.create(
    model="stability-ai/sdxl",
    input={"prompt": "A mountain landscape"},
    webhook="https://myapp.com/api/replicate-webhook",
    webhook_events_filter=["completed"],
)
print(f"Prediction started: {prediction.id}")
```
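Replicate signs webhook deliveries so receivers can reject forged requests; its documentation describes a Svix-style scheme (`webhook-id`, `webhook-timestamp`, and `webhook-signature` headers, with a `whsec_`-prefixed signing secret). A sketch of the verification, assuming that scheme:

```python
# verify_webhook.py — Verify a webhook's HMAC signature (sketch, Svix-style scheme)
import base64
import hashlib
import hmac

def verify_replicate_webhook(secret: str, webhook_id: str, timestamp: str,
                             body: str, signature_header: str) -> bool:
    """Recompute the HMAC over `id.timestamp.body` and compare against the header."""
    key = base64.b64decode(secret.split("_", 1)[1])  # strip the whsec_ prefix
    signed = f"{webhook_id}.{timestamp}.{body}".encode()
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest()).decode()
    # The header may hold several space-separated "v1,<sig>" entries
    received = [part.split(",", 1)[1] for part in signature_header.split() if "," in part]
    return any(hmac.compare_digest(expected, sig) for sig in received)
```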
## Fine-Tuning
```python
# fine_tune.py — Fine-tune SDXL on custom images
import replicate

# Create a fine-tune training
training = replicate.trainings.create(
    version="stability-ai/sdxl:7762fd07cf82c948538e41f63f77d685e02b063e37e496e96eefd46c929f9bdc",
    input={
        "input_images": "https://my-bucket.s3.amazonaws.com/training-images.zip",
        "token_string": "TOK",
        "caption_prefix": "a photo of TOK, ",
        "max_train_steps": 1000,
        "use_face_detection_instead": False,
    },
    destination="my-username/my-sdxl-model",
)
print(f"Training ID: {training.id}")
print(f"Status: {training.status}")

# Check training status; `output` is only populated once training succeeds
training.reload()
print(f"Status: {training.status}")
if training.status == "succeeded":
    print(f"Model version: {training.output.get('version')}")
```
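Once training succeeds, the new model is addressed as `destination:version` and prompted with the trained token. Two small helpers illustrate this (the helpers themselves are illustrative, not part of the client; the `a photo of TOK, ` prefix mirrors the `caption_prefix` used above):

```python
# use_finetune.py — Build the reference and prompt for a fine-tuned model (sketch)
def trained_ref(destination: str, version_id: str) -> str:
    """Build the owner/name:version reference accepted by replicate.run."""
    return f"{destination}:{version_id}"

def tok_prompt(subject: str, token_string: str = "TOK") -> str:
    """Prompts for the fine-tuned model should mention the trained token."""
    return f"a photo of {token_string}, {subject}"
```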
## Deploy Custom Models with Cog
```yaml
# cog.yaml — Define model environment for Cog packaging
build:
  python_version: "3.11"
  python_packages:
    - torch==2.1.0
    - transformers==4.36.0
  gpu: true
predict: "predict.py:Predictor"
```
```python
# predict.py — Cog predictor class for custom model deployment
from cog import BasePredictor, Input
from transformers import pipeline


class Predictor(BasePredictor):
    def setup(self):
        """Load model into memory during container startup"""
        self.pipe = pipeline("text-generation", model="./model", device=0)

    def predict(
        self,
        prompt: str = Input(description="Input text prompt"),
        max_tokens: int = Input(description="Max tokens to generate", default=100, ge=1, le=1000),
        temperature: float = Input(description="Sampling temperature", default=0.7, ge=0, le=2),
    ) -> str:
        """Run a single prediction"""
        output = self.pipe(prompt, max_new_tokens=max_tokens, temperature=temperature)
        return output[0]["generated_text"]
```
```bash
# Build and push a custom model with Cog
pip install cog
# Test locally
cog predict -i prompt="Hello world"
# Push to Replicate
cog login
cog push r8.im/my-username/my-model
```
## Node.js Client
```typescript
// replicate_node.ts — Use Replicate from Node.js
import Replicate from "replicate";

const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

const output = await replicate.run("stability-ai/sdxl", {
  input: {
    prompt: "A watercolor painting of a robot",
    width: 1024,
    height: 1024,
  },
});
console.log(output);
```
## Key Concepts
- **Version pinning**: Models are versioned by SHA — pin versions for reproducibility
- **Cold starts**: First request to a model may take 10-60s to boot; subsequent calls are faster
- **Streaming**: Use `replicate.stream()` for real-time token output from language models
- **Cog**: Open-source tool to package ML models as Docker containers for Replicate
- **Webhooks**: Avoid polling by receiving HTTP callbacks when predictions complete
- **Pricing**: Pay per second of compute; GPU type depends on the model
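The cold-start note above suggests retrying with backoff rather than failing on the first slow boot. A minimal sketch that wraps any zero-argument callable (e.g. `lambda: replicate.run(...)`); this is an illustration, not a client feature:

```python
# retry_run.py — Retry a model call with exponential backoff (sketch)
import time

def run_with_retry(run, attempts=3, base_delay=5.0):
    """Call `run`, retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return run()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```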