llm-response-comparator Skill


Best use case

llm-response-comparator Skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using llm-response-comparator Skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/llm-response-comparator/SKILL.md --create-dirs "https://raw.githubusercontent.com/heldernoid/agentic-build-templates/main/projects/ai-llm-tools/llm-response-comparator/skills/llm-response-comparator/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/llm-response-comparator/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How llm-response-comparator Skill Compares

| Feature / Agent | llm-response-comparator Skill | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

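It documents llm-response-comparator, a tool you run locally that sends one prompt to several LLM providers concurrently, streams each response over SSE, and lets you rate the responses and pick a winner.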

Where can I find the source code?

The source code is on GitHub in the heldernoid/agentic-build-templates repository, under projects/ai-llm-tools/llm-response-comparator.

SKILL.md Source

# llm-response-comparator Skill

## Quick Start

```bash
git clone <repo>
cd llm-response-comparator
cp .env.example .env
# edit .env -- set ENCRYPTION_KEY and optionally LLM provider base URLs
pnpm install
pnpm dev
# Web UI: http://localhost:5173
# API:    http://localhost:3000
```

## Project Tree

```
llm-response-comparator/
  packages/
    server/
      src/
        db/
          schema.ts          # better-sqlite3 setup, WAL mode, foreign keys
          migrations.ts      # CREATE TABLE statements
        llm/
          llmClient.ts       # unified interface for all providers
          providers/
            anthropic.ts
            openai.ts
            ollama.ts
            mistral.ts
            cohere.ts
          stream.ts          # EventEmitter-based SSE event bus
        routes/
          comparisons.ts     # POST /api/comparisons, GET /api/comparisons/:id
          responses.ts       # GET /api/responses/:id, PATCH (rating)
          models.ts          # GET/POST/PATCH/DELETE /api/models
          providers.ts       # GET/POST/DELETE /api/providers (key management)
          stream.ts          # GET /api/comparisons/:id/stream (SSE)
        middleware/
          auth.ts            # lrc_ API key validation
          error.ts
        index.ts
    web/
      src/
        components/
          ComparisonGrid.tsx
          ResponseCard.tsx   # pending/streaming/complete/error states
          WinnerBanner.tsx
          DiffView.tsx
          StarRating.tsx
          ModelSelector.tsx
        pages/
          NewComparison.tsx
          RunningComparison.tsx
          ResultDetail.tsx
          History.tsx
          Models.tsx
          Settings.tsx
          Analytics.tsx
        stores/
          comparisonStore.ts # Zustand
          settingsStore.ts
    shared/
      src/
        types.ts
  .env.example
  pnpm-workspace.yaml
```

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `PORT` | no | `3000` | Express listen port |
| `DATABASE_PATH` | no | `./data/lrc.db` | SQLite file path |
| `ENCRYPTION_KEY` | yes | -- | 32-byte hex key for AES-256-GCM provider key encryption |
| `STREAM_TIMEOUT_MS` | no | `30000` | Abort SSE stream after N ms |
| `REQUIRE_AUTH` | no | `1` | Set to `0` to disable API key auth (dev only) |

Generate an encryption key:

```bash
node -e "console.log(require('node:crypto').randomBytes(32).toString('hex'))"
```
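The table above can be read as a small configuration contract. The following is a minimal sketch of how a server might load it, assuming a hypothetical `loadConfig` helper and `ServerConfig` shape (neither name comes from the project). It applies the documented defaults and rejects an `ENCRYPTION_KEY` that is not 32 bytes of hex.

```typescript
// Hypothetical config shape; field names are illustrative, not the project's.
interface ServerConfig {
  port: number;
  databasePath: string;
  encryptionKey: Buffer;
  streamTimeoutMs: number;
  requireAuth: boolean;
}

// Build a config from an environment-like object, using the defaults
// listed in the Environment Variables table.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const keyHex = env.ENCRYPTION_KEY ?? '';
  // 32 bytes encoded as hex is exactly 64 hex characters.
  if (!/^[0-9a-fA-F]{64}$/.test(keyHex)) {
    throw new Error('ENCRYPTION_KEY must be 64 hex characters (32 bytes)');
  }
  return {
    port: Number(env.PORT ?? '3000'),
    databasePath: env.DATABASE_PATH ?? './data/lrc.db',
    encryptionKey: Buffer.from(keyHex, 'hex'),
    streamTimeoutMs: Number(env.STREAM_TIMEOUT_MS ?? '30000'),
    // Auth stays on unless REQUIRE_AUTH is explicitly "0".
    requireAuth: (env.REQUIRE_AUTH ?? '1') !== '0',
  };
}
```

In a real entry point this would presumably be called once as `loadConfig(process.env)`, so that a malformed key fails at startup rather than on the first encryption.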

## REST API

### Create a comparison

```bash
curl -X POST http://localhost:3000/api/comparisons \
  -H "Authorization: Bearer lrc_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Fibonacci function quality",
    "prompt": "Write an efficient Fibonacci function in Python...",
    "systemMessage": "You are an expert software engineer.",
    "models": [
      {"provider": "anthropic", "modelId": "claude-3-5-sonnet-20241022"},
      {"provider": "anthropic", "modelId": "claude-3-5-haiku-20241022"},
      {"provider": "openai",    "modelId": "gpt-4o"},
      {"provider": "openai",    "modelId": "gpt-4o-mini"}
    ],
    "temperature": 0.7,
    "maxTokens": 1024
  }'
```

Response:

```json
{
  "comparisonId": "cmp_a1b2c3d4",
  "status": "running",
  "responseIds": [
    "resp_x1", "resp_x2", "resp_x3", "resp_x4"
  ]
}
```

### Stream comparison progress (SSE)

```bash
curl -N http://localhost:3000/api/comparisons/cmp_a1b2c3d4/stream \
  -H "Authorization: Bearer lrc_your_key"
```

SSE event types:

| Event | Data |
|---|---|
| `chunk` | `{responseId, delta}` -- streamed token delta |
| `done` | `{responseId, inputTokens, outputTokens, latencyMs}` |
| `error` | `{responseId, error}` -- provider error for one model |
| `comparison_complete` | `{comparisonId, status}` -- all models finished |
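
A client has to turn the raw stream into these four event types. The sketch below assumes the server uses standard SSE framing (`event:` and `data:` lines, frames separated by a blank line) with a JSON payload in `data:`; the source does not show the wire format, so treat that as an assumption. `parseSseFrame` and `applyEvent` are hypothetical helper names.

```typescript
// Payload shapes taken from the event table; the framing is an assumption.
type StreamEvent =
  | { event: 'chunk'; data: { responseId: string; delta: string } }
  | { event: 'done'; data: { responseId: string; inputTokens: number; outputTokens: number; latencyMs: number } }
  | { event: 'error'; data: { responseId: string; error: string } }
  | { event: 'comparison_complete'; data: { comparisonId: string; status: string } };

const KNOWN_EVENTS: readonly string[] = ['chunk', 'done', 'error', 'comparison_complete'];

// Parse one SSE frame (the text between blank lines) into a typed event.
// Returns null for comments, keep-alives, and unknown event names.
function parseSseFrame(frame: string): StreamEvent | null {
  let event = '';
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line.startsWith('event:')) event = line.slice(6).trim();
    else if (line.startsWith('data:')) dataLines.push(line.slice(5).trimStart());
  }
  if (!KNOWN_EVENTS.includes(event) || dataLines.length === 0) return null;
  return { event, data: JSON.parse(dataLines.join('\n')) } as StreamEvent;
}

// Accumulate streamed text per response; other event types are ignored here.
function applyEvent(buffers: Map<string, string>, ev: StreamEvent): void {
  if (ev.event === 'chunk') {
    const previous = buffers.get(ev.data.responseId) ?? '';
    buffers.set(ev.data.responseId, previous + ev.data.delta);
  }
}
```

Keeping one buffer per `responseId` matters because chunks from different models interleave on the same stream.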

### Get comparison result

```bash
curl http://localhost:3000/api/comparisons/cmp_a1b2c3d4 \
  -H "Authorization: Bearer lrc_your_key"
```

### Rate a response

```bash
curl -X PATCH http://localhost:3000/api/responses/resp_x1 \
  -H "Authorization: Bearer lrc_your_key" \
  -H "Content-Type: application/json" \
  -d '{"rating": 5, "ratingComment": "Best docstring quality"}'
```

### Set winner

```bash
curl -X PATCH http://localhost:3000/api/comparisons/cmp_a1b2c3d4 \
  -H "Authorization: Bearer lrc_your_key" \
  -H "Content-Type: application/json" \
  -d '{"winnerResponseId": "resp_x1"}'
```

### List comparison history

```bash
curl "http://localhost:3000/api/comparisons?limit=20&offset=0&status=complete" \
  -H "Authorization: Bearer lrc_your_key"
```

### Manage provider keys

```bash
# Add or update a key
curl -X POST http://localhost:3000/api/providers \
  -H "Authorization: Bearer lrc_your_key" \
  -H "Content-Type: application/json" \
  -d '{"provider": "anthropic", "apiKey": "sk-ant-..."}'

# Test connection
curl http://localhost:3000/api/providers/anthropic/test \
  -H "Authorization: Bearer lrc_your_key"

# Remove key
curl -X DELETE http://localhost:3000/api/providers/anthropic \
  -H "Authorization: Bearer lrc_your_key"
```

### Model catalog

```bash
# List all models
curl http://localhost:3000/api/models \
  -H "Authorization: Bearer lrc_your_key"

# Add custom model
curl -X POST http://localhost:3000/api/models \
  -H "Authorization: Bearer lrc_your_key" \
  -H "Content-Type: application/json" \
  -d '{"provider": "ollama", "modelId": "gemma:2b", "displayName": "Gemma 2B"}'

# Enable/disable
curl -X PATCH http://localhost:3000/api/models/some-id \
  -H "Authorization: Bearer lrc_your_key" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
```

## Core TypeScript Interfaces

```typescript
// packages/shared/src/types.ts

export type Provider = 'openai' | 'anthropic' | 'ollama' | 'mistral' | 'cohere';

export interface ModelTarget {
  provider: Provider;
  modelId: string;
  displayName?: string;
}

export interface CreateComparisonRequest {
  name?: string;
  prompt: string;
  systemMessage?: string;
  models: ModelTarget[];
  temperature?: number;   // default 0.7
  maxTokens?: number;     // default 1024
  tags?: string[];
}

export type ResponseStatus = 'pending' | 'streaming' | 'complete' | 'error';
export type ComparisonStatus = 'running' | 'complete' | 'partial' | 'error';

export interface RunResult {
  responseId: string;
  provider: Provider;
  modelId: string;
  content: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  status: ResponseStatus;
  error?: string;
}
```
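
Because `CreateComparisonRequest` arrives as untrusted JSON, a route would likely check it before starting any provider calls. This is a minimal sketch with a hypothetical `validateCreateComparison` helper; the specific rules (non-empty prompt, at least one model, positive integer `maxTokens`) are assumptions inferred from the interface, not documented server behaviour.

```typescript
// Provider names mirror the Provider union above.
const PROVIDERS: readonly string[] = ['openai', 'anthropic', 'ollama', 'mistral', 'cohere'];

// Returns a list of problems; an empty list means the body looks usable.
function validateCreateComparison(body: unknown): string[] {
  if (typeof body !== 'object' || body === null) return ['body must be a JSON object'];
  const b = body as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof b.prompt !== 'string' || b.prompt.trim() === '') {
    errors.push('prompt is required');
  }
  if (!Array.isArray(b.models) || b.models.length === 0) {
    errors.push('models must be a non-empty array');
  } else {
    b.models.forEach((m: unknown, i: number) => {
      const t = (typeof m === 'object' && m !== null ? m : {}) as Record<string, unknown>;
      if (typeof t.provider !== 'string' || !PROVIDERS.includes(t.provider)) {
        errors.push(`models[${i}].provider is not a known provider`);
      }
      if (typeof t.modelId !== 'string' || t.modelId === '') {
        errors.push(`models[${i}].modelId is required`);
      }
    });
  }
  // Optional fields are only checked when present.
  if (b.temperature !== undefined && (typeof b.temperature !== 'number' || !Number.isFinite(b.temperature))) {
    errors.push('temperature must be a number');
  }
  if (b.maxTokens !== undefined && (!Number.isInteger(b.maxTokens) || (b.maxTokens as number) <= 0)) {
    errors.push('maxTokens must be a positive integer');
  }
  return errors;
}
```

Returning every problem at once, rather than throwing on the first, lets the API report a complete list in a single error response.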

## Concurrency Pattern

All models run concurrently using `Promise.allSettled`. Never await models sequentially.

```typescript
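// packages/server/src/llm/llmClient.ts
import type { EventEmitter } from 'node:events';

// ModelTarget and RunResult are declared in packages/shared/src/types.ts;
// LLMRequest and runOne are assumed to be defined or imported elsewhere.

export async function runAll(
  targets: ModelTarget[],
  request: LLMRequest,
  emitter: EventEmitter
): Promise<RunResult[]> {
  // Start every request up front so the models run concurrently; a failure
  // becomes an error-state result instead of rejecting the whole batch.
  const tasks = targets.map((target) =>
    runOne(target, request, emitter).catch((err) => ({
      ...target,
      status: 'error' as const,
      error: err instanceof Error ? err.message : String(err),
      content: '',
      inputTokens: 0,
      outputTokens: 0,
      latencyMs: 0,
    }))
  );
  const results = await Promise.allSettled(tasks);
  // Each task catches its own error, so a rejection is unexpected here.
  // Spreading an Error copies no message, so report the reason explicitly.
  return results.map((r, i) =>
    r.status === 'fulfilled'
      ? r.value
      : { ...targets[i], status: 'error' as const, error: String(r.reason),
          content: '', inputTokens: 0, outputTokens: 0, latencyMs: 0 }
  );
}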
```
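
To see why `Promise.allSettled` suits this pattern, here is a small stand-alone sketch. `fakeModel` and `settleDemo` are hypothetical stand-ins for provider calls, not part of the project. One task fails, yet the other two still report their results, and all three timers overlap because every promise is created before anything is awaited.

```typescript
// Stand-in for a provider call: settles after `ms` milliseconds.
function fakeModel(name: string, ms: number, fail = false): Promise<string> {
  return new Promise((resolve, reject) => {
    setTimeout(() => (fail ? reject(new Error(`${name} failed`)) : resolve(`${name} ok`)), ms);
  });
}

async function settleDemo(): Promise<string[]> {
  // All three promises exist before the await, so they run concurrently.
  const settled = await Promise.allSettled([
    fakeModel('a', 20),
    fakeModel('b', 20, true),
    fakeModel('c', 20),
  ]);
  // Results come back in input order, regardless of which settled first.
  return settled.map((r) =>
    r.status === 'fulfilled' ? r.value : `error: ${(r.reason as Error).message}`
  );
}
```

With `Promise.all`, the rejection from `b` would have discarded the results from `a` and `c`; `allSettled` keeps them, which matches the `partial` comparison status in the shared types.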

## Provider Key Encryption

Provider keys are encrypted at rest using AES-256-GCM. The encryption key comes from the `ENCRYPTION_KEY` environment variable.

```typescript
// packages/server/src/db/crypto.ts
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const ALG = 'aes-256-gcm';

export function encrypt(plaintext: string, keyHex: string): string {
  const key = Buffer.from(keyHex, 'hex');
  const iv = randomBytes(12);
  const cipher = createCipheriv(ALG, key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  // format: iv(12) + tag(16) + ciphertext -- base64 encoded
  return Buffer.concat([iv, tag, encrypted]).toString('base64');
}

export function decrypt(encoded: string, keyHex: string): string {
  const key = Buffer.from(keyHex, 'hex');
  const buf = Buffer.from(encoded, 'base64');
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const encrypted = buf.subarray(28);
  const decipher = createDecipheriv(ALG, key, iv);
  decipher.setAuthTag(tag);
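  // Join the raw bytes first, then decode once, so the result is always a UTF-8 string.
  return Buffer.concat([decipher.update(encrypted), decipher.final()]).toString('utf8');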
}
```
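
GCM is authenticated encryption, so a stored blob that has been modified, or that is opened with a different key, is rejected instead of yielding garbage. The sketch below demonstrates this. `seal`, `unseal`, and `isRejected` are hypothetical names that compactly restate the `iv(12) + tag(16) + ciphertext` layout so the block runs on its own; the key value is a placeholder.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Same byte layout as above: 12-byte IV, 16-byte auth tag, then ciphertext.
function seal(plaintext: string, key: Buffer): Buffer {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

function unseal(blob: Buffer, key: Buffer): string {
  const decipher = createDecipheriv('aes-256-gcm', key, blob.subarray(0, 12));
  decipher.setAuthTag(blob.subarray(12, 28));
  // final() throws if the tag does not match the key and ciphertext.
  return Buffer.concat([decipher.update(blob.subarray(28)), decipher.final()]).toString('utf8');
}

// True when decryption is refused (wrong key or modified bytes).
function isRejected(blob: Buffer, key: Buffer): boolean {
  try {
    unseal(blob, key);
    return false;
  } catch {
    return true;
  }
}

const demoKey = randomBytes(32);
const sealed = seal('sk-ant-example', demoKey);

// Copy the blob and flip one bit in the last ciphertext byte.
const tampered = Buffer.from(sealed);
tampered[tampered.length - 1] ^= 0x01;
```

Here `unseal(sealed, demoKey)` returns the original string, while `isRejected(tampered, demoKey)` and `isRejected(sealed, randomBytes(32))` both report a rejection. One practical consequence: changing `ENCRYPTION_KEY` makes previously stored provider keys undecryptable.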

## API Key Format

Application API keys (for auth to the local REST API) use the `lrc_` prefix.

```typescript
// Generate a new API key
import { randomBytes, createHash } from 'node:crypto';

const raw = 'lrc_' + randomBytes(32).toString('hex');
const hashed = createHash('sha256').update(raw).digest('hex');
// Store hashed in DB, return raw once to user
```
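
Since only the SHA-256 hash is stored, validating a request means hashing the presented key and comparing it with the stored hash. The following is a minimal sketch of what `middleware/auth.ts` might do; `extractApiKey` and `matchesStoredHash` are hypothetical names, and the database lookup is left out.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Pull the raw key out of an "Authorization: Bearer lrc_..." header.
// The pattern matches the generator above: "lrc_" plus 64 lowercase hex chars.
function extractApiKey(header: string | undefined): string | null {
  const match = /^Bearer (lrc_[0-9a-f]{64})$/.exec(header ?? '');
  return match ? match[1] : null;
}

// Compare the hash of the presented key with a stored hex hash in constant time.
function matchesStoredHash(rawKey: string, storedHashHex: string): boolean {
  const presented = createHash('sha256').update(rawKey).digest();
  const stored = Buffer.from(storedHashHex, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return stored.length === presented.length && timingSafeEqual(presented, stored);
}
```

Because `api_keys.key_hash` is unique, a real middleware could also look the hash up directly with an indexed query; the constant-time comparison matters mainly when hashes are compared in application code.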

## SQLite Schema

```sql
CREATE TABLE comparisons (
  id              TEXT PRIMARY KEY,
  name            TEXT,
  prompt          TEXT NOT NULL,
  system_message  TEXT,
  temperature     REAL NOT NULL DEFAULT 0.7,
  max_tokens      INTEGER NOT NULL DEFAULT 1024,
  status          TEXT NOT NULL DEFAULT 'running',
  winner_response_id TEXT REFERENCES responses(id),
  tags            TEXT,   -- JSON array string
  created_at      TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE responses (
  id              TEXT PRIMARY KEY,
  comparison_id   TEXT NOT NULL REFERENCES comparisons(id) ON DELETE CASCADE,
  provider        TEXT NOT NULL,
  model_id        TEXT NOT NULL,
  content         TEXT,
  status          TEXT NOT NULL DEFAULT 'pending',
  input_tokens    INTEGER,
  output_tokens   INTEGER,
  latency_ms      INTEGER,
  rating          INTEGER CHECK(rating BETWEEN 1 AND 5),
  rating_comment  TEXT,
  error           TEXT,
  created_at      TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE provider_keys (
  provider        TEXT PRIMARY KEY,
  encrypted_key   TEXT NOT NULL,
  base_url        TEXT,
  updated_at      TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE model_catalog (
  id              TEXT PRIMARY KEY,
  provider        TEXT NOT NULL,
  model_id        TEXT NOT NULL,
  display_name    TEXT NOT NULL,
  context_window  INTEGER,
  enabled         INTEGER NOT NULL DEFAULT 1,
  is_custom       INTEGER NOT NULL DEFAULT 0,
  UNIQUE(provider, model_id)
);

CREATE TABLE api_keys (
  id              TEXT PRIMARY KEY,
  key_hash        TEXT NOT NULL UNIQUE,
  label           TEXT,
  last_used_at    TEXT,
  created_at      TEXT NOT NULL DEFAULT (datetime('now'))
);
```
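
The schema uses snake_case columns and nullable fields, while the shared interfaces use camelCase and non-null numbers, so some mapping layer is needed between the two. This is a sketch of one way to do it; `ResponseRow`, `RunResultView`, `rowToRunResult`, and `parseTags` are hypothetical names, and the choice to turn `NULL` counters into `0` is an assumption.

```typescript
// Row shape for the `responses` table; nullable columns are typed `| null`.
interface ResponseRow {
  id: string;
  comparison_id: string;
  provider: string;
  model_id: string;
  content: string | null;
  status: string;
  input_tokens: number | null;
  output_tokens: number | null;
  latency_ms: number | null;
  rating: number | null;
  rating_comment: string | null;
  error: string | null;
  created_at: string;
}

// Mirrors the fields of RunResult from the shared types.
interface RunResultView {
  responseId: string;
  provider: string;
  modelId: string;
  content: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  status: string;
  error?: string;
}

function rowToRunResult(row: ResponseRow): RunResultView {
  return {
    responseId: row.id,
    provider: row.provider,
    modelId: row.model_id,
    content: row.content ?? '',
    inputTokens: row.input_tokens ?? 0,
    outputTokens: row.output_tokens ?? 0,
    latencyMs: row.latency_ms ?? 0,
    status: row.status,
    // Only include `error` when the column is set.
    ...(row.error !== null ? { error: row.error } : {}),
  };
}

// comparisons.tags holds a JSON array string; fall back to [] on bad data.
function parseTags(raw: string | null): string[] {
  if (!raw) return [];
  try {
    const value: unknown = JSON.parse(raw);
    return Array.isArray(value) ? value.filter((t): t is string => typeof t === 'string') : [];
  } catch {
    return [];
  }
}
```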

## Docker

```bash
docker run -d \
  --name lrc \
  -p 3000:3000 \
  -e ENCRYPTION_KEY=your_32_byte_hex_key \
  -v "$(pwd)/data:/app/data" \
  lrc:latest
```

docker-compose.yml:

```yaml
services:
  lrc:
    build: .
    ports:
      - "3000:3000"
    environment:
      ENCRYPTION_KEY: ${ENCRYPTION_KEY}
      DATABASE_PATH: /data/lrc.db
      STREAM_TIMEOUT_MS: 30000
    volumes:
      - ./data:/data
    restart: unless-stopped
```

## Backup and Restore

```bash
# Backup (stop the server first: in WAL mode, recent writes may still be in lrc.db-wal)
cp ./data/lrc.db ./data/lrc.db.bak

# Restore
cp ./data/lrc.db.bak ./data/lrc.db

# Export all comparisons via API
curl http://localhost:3000/api/comparisons/export \
  -H "Authorization: Bearer lrc_your_key" \
  > comparisons_export.json
```

## Troubleshooting

| Symptom | Likely Cause | Fix |
|---|---|---|
| `401 Unauthorized` | Missing or wrong API key | Check `Authorization: Bearer lrc_...` header |
| Provider returns 401 | Wrong provider API key | Update key via Settings or `POST /api/providers` |
| Stream never fires `comparison_complete` | One model hung | Increase `STREAM_TIMEOUT_MS` or check provider status |
| `SQLITE_CANTOPEN` | Database directory missing | `mkdir -p ./data` |
| All responses show `error` | No provider keys configured | Add keys in Settings |
| Ollama models not listed | Ollama not running | Start with `ollama serve` |
| Latency very high for Ollama | Model not loaded | Run `ollama run <model>` once to load |
