log-analyzer

Upload plain-text log files (nginx, Apache, JSON Lines, syslog), parse them into SQLite, query entries with filters, visualize request and error rates over time, and fire threshold-based alerts.

by heldernoid

Best use case

log-analyzer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using log-analyzer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/log-analyzer/SKILL.md --create-dirs "https://raw.githubusercontent.com/heldernoid/agentic-build-templates/main/projects/data-analytics/log-analyzer/skills/log-analyzer/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/log-analyzer/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How log-analyzer Compares

Feature / Agent	log-analyzer	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Upload plain-text log files (nginx, Apache, JSON Lines, syslog), parse them into SQLite, query entries with filters, visualize request and error rates over time, and fire threshold-based alerts.

Where can I find the source code?

The source code is in the heldernoid/agentic-build-templates repository on GitHub, under projects/data-analytics/log-analyzer.

SKILL.md Source

# log-analyzer Skill

## Overview

log-analyzer is a self-hosted single-process application. Files are uploaded via multipart POST, detected/parsed in a streaming background pipeline, and stored as rows in the `log_entries` SQLite table. The React client fetches entries with filter parameters, renders time-series charts via recharts, and shows stats aggregations. A `node-cron` scheduler evaluates alert rules every minute and inserts `alert_events` rows when thresholds are breached.

## TypeScript Interfaces

```typescript
interface LogFile {
  id: number;
  filename: string;       // original uploaded filename
  stored_name: string;    // nanoid(12) filename on disk
  format: 'nginx' | 'apache' | 'json-lines' | 'syslog';
  size_bytes: number;
  entry_count: number;
  error_count: number;    // unparseable lines
  status: 'parsing' | 'ready' | 'error';
  error_message: string | null;
  uploaded_at: string;    // ISO 8601
  parsed_at: string | null;
}

interface LogEntry {
  id: number;
  file_id: number;
  line_num: number;
  ts: string;             // ISO 8601
  level: 'ERROR' | 'WARN' | 'INFO' | 'DEBUG' | null;
  method: string | null;  // HTTP method
  path: string | null;
  status: number | null;
  bytes: number | null;
  ip: string | null;
  user_agent: string | null;
  message: string | null;
  raw: string;            // original log line
}

interface TimeSeriesBucket {
  bucket: string;         // ISO 8601 truncated to bucket size
  count: number;
  errors: number;         // 5xx count
  bytes_total: number;
}

interface StatusDist {
  status: number;
  count: number;
}

interface TopPath {
  path: string;
  count: number;
  error_count: number;
  avg_bytes: number;
}

interface AlertRule {
  id: number;
  name: string;
  file_id: number | null;
  metric: 'error_rate' | 'request_rate' | 'status_count' | 'keyword';
  operator: 'gt' | 'lt' | 'gte' | 'lte';
  threshold: number;
  window_minutes: number;
  cooldown_minutes: number;
  enabled: number;        // 0 or 1
  last_fired_at: string | null;
  created_at: string;
}

interface AlertEvent {
  id: number;
  rule_id: number;
  fired_at: string;
  metric_value: number;
  message: string;
}
```

## API Reference

### Log Files

**List all files**
```bash
curl http://localhost:3000/api/log-files
```
Response: `{ files: LogFile[] }`

**Upload a file**
```bash
curl -X POST http://localhost:3000/api/log-files \
  -F "file=@nginx-prod-2026-03-20.log"
```
Response: `{ file: LogFile }` with `status: 'parsing'`

Format auto-detection runs on the first 10 non-empty lines. To override it, pass an explicit `format` field:
```bash
curl -X POST http://localhost:3000/api/log-files \
  -F "file=@access.log" \
  -F "format=nginx"
```
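
The detection rules themselves are not documented here. As a rough illustration only, the sketch below samples the first 10 non-empty lines and picks the format whose test matches the most lines. The `MATCHERS` table, its regular expressions, and the `detectFormat` name are all assumptions, not the project's actual code. Note that this sketch cannot tell nginx and Apache combined logs apart (it reports `nginx` for both); the explicit `format` field covers that case.

```typescript
type LogFormat = 'nginx' | 'apache' | 'json-lines' | 'syslog';

// Hypothetical per-format line tests; the real detector may use different rules.
const MATCHERS: Array<{ format: LogFormat; test: (line: string) => boolean }> = [
  {
    format: 'json-lines',
    test: (line) => {
      if (!line.trimStart().startsWith('{')) return false;
      try {
        JSON.parse(line);
        return true;
      } catch {
        return false;
      }
    },
  },
  // Common/combined access log: ip ident user [timestamp] "request" status ...
  { format: 'nginx', test: (line) => /^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} /.test(line) },
  // Traditional syslog prefix: "Mar 20 03:14:07 host ..."
  { format: 'syslog', test: (line) => /^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} /.test(line) },
];

// Returns the format matching the most sampled lines, or null if nothing matches.
function detectFormat(lines: string[], sampleSize = 10): LogFormat | null {
  const sample = lines.filter((l) => l.trim() !== '').slice(0, sampleSize);
  let best: { format: LogFormat; hits: number } | null = null;
  for (const m of MATCHERS) {
    const hits = sample.filter((l) => m.test(l)).length;
    if (hits > 0 && (best === null || hits > best.hits)) {
      best = { format: m.format, hits };
    }
  }
  return best ? best.format : null;
}
```

A caller would typically fall back to asking for an explicit `format` when the function returns `null`.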

**Poll parse status**
```bash
curl http://localhost:3000/api/log-files/1/status
```
Response: `{ status: 'parsing' | 'ready' | 'error', entry_count: number, error_count: number }`
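
Because upload returns immediately with `status: 'parsing'`, a client usually polls until the status changes. The helper below is a minimal sketch of that loop. `waitForParse`, its parameters, and the retry limits are hypothetical names and values chosen for illustration; the status fetcher is injected so the loop stays independent of any HTTP client.

```typescript
type ParseStatus = {
  status: 'parsing' | 'ready' | 'error';
  entry_count: number;
  error_count: number;
};

// Calls fetchStatus repeatedly until the file leaves the 'parsing' state.
async function waitForParse(
  fetchStatus: () => Promise<ParseStatus>,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<ParseStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await fetchStatus();
    if (current.status !== 'parsing') return current;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for parse to finish');
}

// Example wiring against the documented endpoint (assumes a runtime with global fetch):
// const result = await waitForParse(async () => {
//   const res = await fetch('http://localhost:3000/api/log-files/1/status');
//   return (await res.json()) as ParseStatus;
// });
```

The returned object may still have `status: 'error'`, so callers should check it before querying entries.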

**Delete a file**
```bash
curl -X DELETE http://localhost:3000/api/log-files/1
```
Deletes the file on disk, all log entries (ON DELETE CASCADE), and the metadata row.

### Log Entries

**List entries with filters**
```bash
# Basic paginated fetch
curl "http://localhost:3000/api/log-files/1/entries?limit=500&offset=0"

# Filter by level and status
curl "http://localhost:3000/api/log-files/1/entries?level=ERROR&level=WARN&status_gte=400&status_lte=599"

# Time range
curl "http://localhost:3000/api/log-files/1/entries?ts_from=2026-03-20T03:00:00Z&ts_to=2026-03-20T04:00:00Z"

# Path contains + IP
curl "http://localhost:3000/api/log-files/1/entries?path_contains=auth&ip=10.0.1.45"

# Keyword search
curl "http://localhost:3000/api/log-files/1/entries?q=OutOfMemoryError"
```

Query parameters:
- `limit` (int, default 500, max 5000)
- `offset` (int, default 0)
- `level` (repeatable: ERROR, WARN, INFO, DEBUG)
- `method` (GET, POST, PUT, DELETE, PATCH)
- `status_gte`, `status_lte` (int)
- `ts_from`, `ts_to` (ISO 8601)
- `ip` (exact string match)
- `path_contains` (LIKE %value%)
- `q` (keyword: LIKE on message + raw)

Response: `{ entries: LogEntry[], total: number, offset: number, limit: number }`
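
For programmatic clients, the query string can be assembled with `URLSearchParams`, which handles the repeatable `level` parameter through `append`. This is a sketch only: `EntryFilters` and `buildEntriesUrl` are illustrative names, not part of the project.

```typescript
interface EntryFilters {
  limit?: number;
  offset?: number;
  level?: Array<'ERROR' | 'WARN' | 'INFO' | 'DEBUG'>;
  method?: string;
  status_gte?: number;
  status_lte?: number;
  ts_from?: string;
  ts_to?: string;
  ip?: string;
  path_contains?: string;
  q?: string;
}

// Builds the entries URL; array values become repeated query parameters.
function buildEntriesUrl(baseUrl: string, fileId: number, filters: EntryFilters = {}): string {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(filters)) {
    if (value === undefined) continue;
    if (Array.isArray(value)) {
      for (const item of value) params.append(key, String(item));
    } else {
      params.append(key, String(value));
    }
  }
  const qs = params.toString();
  return `${baseUrl}/api/log-files/${fileId}/entries${qs ? `?${qs}` : ''}`;
}

const url = buildEntriesUrl('http://localhost:3000', 1, {
  level: ['ERROR', 'WARN'],
  status_gte: 400,
});
// url is "http://localhost:3000/api/log-files/1/entries?level=ERROR&level=WARN&status_gte=400"
```

`URLSearchParams` percent-encodes characters such as `:` in timestamps, which is equivalent to the unencoded form used in the curl examples.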

### Stats Aggregations

**Time series**
```bash
curl "http://localhost:3000/api/log-files/1/stats/timeseries?bucket=5m&ts_from=2026-03-20T00:00:00Z&ts_to=2026-03-21T00:00:00Z"
```
Response: `{ buckets: TimeSeriesBucket[] }`

Bucket values: `1m`, `5m`, `15m`, `1h`, `6h`, `1d`
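
The server-side bucketing logic is not shown in this document. One plausible reading of "ISO 8601 truncated to bucket size" is flooring each timestamp to a multiple of the bucket width measured from the Unix epoch, as sketched below. `BUCKET_MS` and `truncateToBucket` are hypothetical names, and epoch alignment is an assumption.

```typescript
type Bucket = '1m' | '5m' | '15m' | '1h' | '6h' | '1d';

// Bucket widths in milliseconds.
const BUCKET_MS: Record<Bucket, number> = {
  '1m': 60 * 1000,
  '5m': 5 * 60 * 1000,
  '15m': 15 * 60 * 1000,
  '1h': 60 * 60 * 1000,
  '6h': 6 * 60 * 60 * 1000,
  '1d': 24 * 60 * 60 * 1000,
};

// Floors an ISO 8601 timestamp to the start of its bucket (UTC, epoch-aligned).
function truncateToBucket(ts: string, bucket: Bucket): string {
  const ms = Date.parse(ts);
  if (Number.isNaN(ms)) throw new Error(`Invalid timestamp: ${ts}`);
  const size = BUCKET_MS[bucket];
  return new Date(Math.floor(ms / size) * size).toISOString();
}

const start = truncateToBucket('2026-03-20T03:07:42Z', '5m');
// start is "2026-03-20T03:05:00.000Z"
```

With this alignment, `6h` buckets start at 00:00, 06:00, 12:00, and 18:00 UTC.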

**Status distribution**
```bash
curl http://localhost:3000/api/log-files/1/stats/status-dist
```
Response: `{ distribution: StatusDist[] }`

**Top paths**
```bash
curl "http://localhost:3000/api/log-files/1/stats/top-paths?limit=10"
```
Response: `{ paths: TopPath[] }`

**Top IPs**
```bash
curl "http://localhost:3000/api/log-files/1/stats/top-ips?limit=10"
```
Response: `{ ips: Array<{ ip: string; count: number; error_count: number }> }`

**Level distribution**
```bash
curl http://localhost:3000/api/log-files/1/stats/level-dist
```
Response: `{ distribution: Array<{ level: string | null; count: number }> }`

### Alert Rules

**List rules**
```bash
curl http://localhost:3000/api/alert-rules
```
Response: `{ rules: AlertRule[] }`

**Create rule**
```bash
curl -X POST http://localhost:3000/api/alert-rules \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High 5xx error rate",
    "file_id": null,
    "metric": "error_rate",
    "operator": "gt",
    "threshold": 30,
    "window_minutes": 5,
    "cooldown_minutes": 15,
    "enabled": 1
  }'
```

**Update rule**
```bash
curl -X PATCH http://localhost:3000/api/alert-rules/1 \
  -H "Content-Type: application/json" \
  -d '{"threshold": 50, "enabled": 0}'
```

**Test rule (evaluate without firing)**
```bash
curl -X POST http://localhost:3000/api/alert-rules/1/test
```
Response: `{ metric_value: number, would_fire: boolean, rule: AlertRule }`

**Delete rule**
```bash
curl -X DELETE http://localhost:3000/api/alert-rules/1
```

### Alert Events

**List events**
```bash
curl http://localhost:3000/api/alert-events
curl "http://localhost:3000/api/alert-events?rule_id=1&limit=50"
```
Response: `{ events: AlertEvent[] }`

**Clear all events**
```bash
curl -X DELETE http://localhost:3000/api/alert-events
```

### Settings

**Get settings**
```bash
curl http://localhost:3000/api/settings
```
Response: `{ settings: { row_cap: string; alert_poll_interval: string; max_upload_mb: string; default_time_bucket: string } }`

**Update settings**
```bash
curl -X PATCH http://localhost:3000/api/settings \
  -H "Content-Type: application/json" \
  -d '{"row_cap": "1000", "default_time_bucket": "15m"}'
```

## Ingestion Pipeline

The ingestion pipeline runs in a detached async IIFE after the upload route returns the initial `parsing` response:

```typescript
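import { createReadStream } from 'node:fs';
import * as readline from 'node:readline';

// getDb(), getParser(), and the LogFormat type are project helpers defined elsewhere.
async function ingestFile(fileId: number, filePath: string, format: LogFormat) {
  const db = getDb();
  const update = db.prepare(`UPDATE log_files SET status=?, entry_count=?, error_count=?, parsed_at=?, error_message=? WHERE id=?`);
  const insert = db.prepare(`INSERT INTO log_entries (file_id, line_num, ts, level, method, path, status, bytes, ip, user_agent, message, raw) VALUES (?,?,?,?,?,?,?,?,?,?,?,?)`);
  // Each batch is written inside one transaction to keep inserts fast.
  const batchInsert = db.transaction((batch: LogEntry[]) => {
    for (const e of batch) insert.run(e.file_id, e.line_num, e.ts, e.level, e.method, e.path, e.status, e.bytes, e.ip, e.user_agent, e.message, e.raw);
  });

  let lineNum = 0;
  let entryCount = 0;
  let errorCount = 0; // lines the parser could not handle
  let batch: LogEntry[] = [];

  try {
    const parser = getParser(format);
    // crlfDelay: Infinity treats \r\n as a single line break.
    const rl = readline.createInterface({ input: createReadStream(filePath), crlfDelay: Infinity });

    for await (const line of rl) {
      lineNum++;
      const entry = parser.parse(line, lineNum);
      if (entry) {
        batch.push({ ...entry, file_id: fileId });
        entryCount++;
      } else {
        errorCount++;
      }
      // Flush every 1000 entries to bound memory use.
      if (batch.length >= 1000) {
        batchInsert(batch);
        batch = [];
      }
    }
    if (batch.length > 0) batchInsert(batch); // flush the final partial batch

    update.run('ready', entryCount, errorCount, new Date().toISOString(), null, fileId);
  } catch (err) {
    // Record the failure so the file does not stay in 'parsing' indefinitely.
    const message = err instanceof Error ? err.message : String(err);
    update.run('error', entryCount, errorCount, null, message, fileId);
  }
}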
```
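
The parsers returned by `getParser` are not shown. To illustrate the `parse(line, lineNum)` contract used above (an entry on success, `null` for an unparseable line), here is a sketch of a parser for the common "combined" access log layout. The regular expression, the `ParsedEntry` type, and `parseCombinedLine` are assumptions for illustration; the project's parsers may extract fields differently, and setting `level` and `message` to `null` is a choice made for this sketch.

```typescript
interface ParsedEntry {
  line_num: number;
  ts: string; // ISO 8601, UTC
  level: null;
  method: string | null;
  path: string | null;
  status: number | null;
  bytes: number | null;
  ip: string | null;
  user_agent: string | null;
  message: string | null;
  raw: string;
}

// ip ident user [dd/Mon/yyyy:HH:MM:SS +zzzz] "METHOD path proto" status bytes "referer" "agent"
const COMBINED =
  /^(\S+) \S+ \S+ \[(\d{2})\/([A-Za-z]{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)(?: "[^"]*" "([^"]*)")?/;
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

function parseCombinedLine(line: string, lineNum: number): ParsedEntry | null {
  const m = COMBINED.exec(line);
  if (!m) return null;
  const [, ip, dd, mon, yyyy, hh, mi, ss, sign, tzh, tzm, method, path, status, bytes, ua] = m;
  const month = MONTHS.indexOf(mon);
  if (month < 0) return null;
  // Treat the logged wall-clock time as UTC, then subtract the logged offset.
  const wallClockMs = Date.UTC(Number(yyyy), month, Number(dd), Number(hh), Number(mi), Number(ss));
  const offsetMs = (sign === '-' ? -1 : 1) * (Number(tzh) * 60 + Number(tzm)) * 60 * 1000;
  return {
    line_num: lineNum,
    ts: new Date(wallClockMs - offsetMs).toISOString(),
    level: null,
    method,
    path,
    status: Number(status),
    bytes: bytes === '-' ? null : Number(bytes),
    ip,
    user_agent: ua ?? null,
    message: null,
    raw: line,
  };
}
```

Returning `null` rather than throwing lets the ingestion loop count the line in `error_count` and continue.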

## Alert Evaluation

The scheduler runs every 60 seconds by default (configurable via `ALERT_POLL_INTERVAL`). For each enabled rule:

1. Check cooldown: if the rule last fired less than `cooldown_minutes` ago (per `last_fired_at`), skip it.
2. Compute the metric value using `evaluateRule(db, rule)`.
3. Compare `metric_value` against `threshold` using `operator`.
4. If the threshold is breached: insert an `alert_events` row and update `last_fired_at`.

```typescript
function compare(value: number, operator: string, threshold: number): boolean {
  switch (operator) {
    case 'gt': return value > threshold;
    case 'lt': return value < threshold;
    case 'gte': return value >= threshold;
    case 'lte': return value <= threshold;
    default: return false;
  }
}
```
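
Steps 1 and 3 can be combined into a single pure decision function, sketched below. `RuleLike`, `inCooldown`, `breaches`, and `shouldFire` are hypothetical names; the metric value is passed in because `evaluateRule` is not shown in this document, and treating a rule as out of cooldown once exactly `cooldown_minutes` have elapsed is an assumption.

```typescript
interface RuleLike {
  operator: 'gt' | 'lt' | 'gte' | 'lte';
  threshold: number;
  cooldown_minutes: number;
  last_fired_at: string | null; // ISO 8601
}

// True while fewer than cooldown_minutes have passed since the last firing.
function inCooldown(rule: RuleLike, now: Date): boolean {
  if (rule.last_fired_at === null) return false;
  const elapsedMs = now.getTime() - Date.parse(rule.last_fired_at);
  return elapsedMs < rule.cooldown_minutes * 60 * 1000;
}

// Same comparison semantics as the compare() function above.
function breaches(value: number, operator: RuleLike['operator'], threshold: number): boolean {
  switch (operator) {
    case 'gt': return value > threshold;
    case 'lt': return value < threshold;
    case 'gte': return value >= threshold;
    case 'lte': return value <= threshold;
  }
}

// Decides whether a rule should fire for an already-computed metric value.
function shouldFire(rule: RuleLike, metricValue: number, now: Date = new Date()): boolean {
  if (inCooldown(rule, now)) return false;
  return breaches(metricValue, rule.operator, rule.threshold);
}
```

Keeping the decision free of database access makes it straightforward to unit test with fixed timestamps.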

## Error Codes

| HTTP | Code | Meaning |
|------|------|---------|
| 400 | BAD_REQUEST | Missing required field |
| 404 | FILE_NOT_FOUND | No log file with that id |
| 404 | RULE_NOT_FOUND | No alert rule with that id |
| 409 | FILE_NOT_READY | File is still being parsed |
| 413 | FILE_TOO_LARGE | Upload exceeds max_upload_mb |
| 422 | VALIDATION_ERROR | Zod schema failure |
| 500 | INTERNAL_ERROR | Unexpected server error |

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `3000` | HTTP port |
| `DATA_DIR` | `./data` | Storage for app.db and uploaded files |
| `NODE_ENV` | `development` | `development` or `production` |
| `ALERT_POLL_INTERVAL` | `60` | Alert poll interval in seconds |
| `MAX_UPLOAD_MB` | `500` | Max upload size in MB |
| `DEFAULT_ROW_CAP` | `500` | Default entries per page |
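
A startup routine might read these variables into a typed object with the defaults from the table. The sketch below shows one way to do that. `AppConfig`, `intFromEnv`, and `loadConfig` are illustrative names, and the validation rules (positive integers only, unknown `NODE_ENV` values falling back to `development`) are assumptions rather than documented behavior.

```typescript
interface AppConfig {
  port: number;
  dataDir: string;
  nodeEnv: 'development' | 'production';
  alertPollIntervalSec: number;
  maxUploadMb: number;
  defaultRowCap: number;
}

type Env = Record<string, string | undefined>;

// Reads a positive integer from env, falling back to the documented default.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return n;
}

function loadConfig(env: Env): AppConfig {
  return {
    port: intFromEnv(env, 'PORT', 3000),
    dataDir: env.DATA_DIR || './data',
    nodeEnv: env.NODE_ENV === 'production' ? 'production' : 'development',
    alertPollIntervalSec: intFromEnv(env, 'ALERT_POLL_INTERVAL', 60),
    maxUploadMb: intFromEnv(env, 'MAX_UPLOAD_MB', 500),
    defaultRowCap: intFromEnv(env, 'DEFAULT_ROW_CAP', 500),
  };
}

// In a Node.js process this would typically be called as loadConfig(process.env).
```

Failing fast on a malformed value surfaces configuration mistakes at startup instead of at the first upload or alert poll.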

## Docker

```yaml
services:
  app:
    build: .
    ports:
      - "3000:3000"
    volumes:
      - log_data:/app/data
    environment:
      NODE_ENV: production
      DATA_DIR: /app/data

volumes:
  log_data:
```
