ln-514-test-log-analyzer

Analyzes application logs: classifies errors, checks log quality, maps stack traces to source. Use when logs need review after test runs or during development.

310 stars

Best use case

ln-514-test-log-analyzer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Analyzes application logs: classifies errors, checks log quality, maps stack traces to source. Use when logs need review after test runs or during development.

Teams using ln-514-test-log-analyzer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ln-514-test-log-analyzer/SKILL.md --create-dirs "https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/main/skills-catalog/ln-514-test-log-analyzer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ln-514-test-log-analyzer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ln-514-test-log-analyzer Compares

Feature / Agentln-514-test-log-analyzerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Analyzes application logs: classifies errors, checks log quality, maps stack traces to source. Use when logs need review after test runs or during development.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

> **Paths:** File paths (`shared/`, `references/`, `../ln-*`) are relative to skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for repo root. If `shared/` is missing, fetch files via WebFetch from `https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/master/skills/{path}`.

# Test Log Analyzer

**Type:** L3 Worker
**Category:** 5XX Quality

Two-layer analysis of application logs. Node.js script handles collection and quantitative analysis; AI handles classification, quality assessment, and fix recommendations.

## Inputs

No required inputs. Runs in current project directory, auto-detects log sources.

Optional `args` — caller instructions (natural language): time window, expected errors, test context. Example: `"review logs for last 30min, auth 401 errors expected from negative tests"`.

## Purpose & Scope
- Analyze application logs (after test runs, during development, or on demand)
- Classify errors into 4 categories: Real Bug, Test Artifact, Expected Behavior, Operational Warning
- Assess log quality: noisiness, completeness, level correctness, format, structured logging
- Map stack traces to source files; provide fix recommendations
- Report findings for quality verdict (only Real Bugs block)
- **No status changes or task creation** — report only

## When to Use
- Analyze application logs in any project (default: last 1h)
- After test runs to classify errors and assess log quality
- Can be invoked with context instructions: `Skill(skill: "ln-514-test-log-analyzer", args: "review last 30min, 401 errors expected")`

## Workflow

### Phase 0: Parse Instructions

If `args` provided — extract: time window (default: 1h), expected errors list, test context.
If no `args` — use defaults (last 1h, no expected errors).

### Phase 1: Log Source Detection and Script Execution

Read target project files if they exist: `docs/project/infrastructure.md`, `docs/project/runbook.md`

1) Check if `scripts/analyze_test_logs.mjs` exists in target project. If missing, copy from `references/analyze_test_logs.mjs`.
2) Detect log source mode (auto-detection priority: docker → file → loki):

| Mode | Detection | Source |
|------|-----------|--------|
| `docker` | `docker compose ps` returns running containers | `docker compose logs --since {window}` |
| `file` | `.log` files exist, or `tests/manual/results/` has output | File paths from infrastructure.md or `*.log` glob |
| `loki` | `LOKI_URL` env var or `tools_config.md` observability section | Loki HTTP query_range API |

3) Run script: `node scripts/analyze_test_logs.mjs --mode {detected} [options]`
4) If no log sources found → return `NO_LOG_SOURCES` status, skip to Phase 5.

**Level-based error detection (CRITICAL):**
When constructing Loki queries or grep commands to scan for errors, ALWAYS filter by the **parsed level field**, NOT by text matching the word "error" in the full log line. Logger names like `uvicorn.error` contain "error" as part of the name but log at INFO level — text matching produces false positives.

| Log format | Correct error filter | Wrong filter |
|------------|---------------------|--------------|
| Pipe-delimited (`ts \| LEVEL \| ...`) | `\| ERROR` or `\| CRITICAL` (match level field position) | `grep -i error` (matches logger names) |
| Loki structured | `{service_name="X"} \| level="ERROR"` or `\| pattern` extraction | `{service_name="X"} \|= "error"` |
| Key=value (`level=ERROR msg=...`) | `level=ERROR` or `level=FATAL` | `\|~ "(?i)error"` |
| Docker logs (local) | `grep -E '\| ERROR \| \| CRITICAL '` | `grep -iE 'error\|exception'` |

The `analyze_test_logs.mjs` script handles this correctly via structured regex parsers. These rules apply to **ad-hoc Loki/grep queries** constructed during analysis.

### Phase 2: 4-Category Error Classification

Classify each error group from script JSON output:

| Category | Action | Criteria |
|----------|--------|----------|
| **Real Bug** | Fix | Unexpected crash, data loss, broken pipeline |
| **Test Artifact** | Skip | From test scripts, deliberate error-path validation |
| **Expected Behavior** | Skip | Rate limiting, input validation, auth failures from invalid tokens |
| **Operational Warning** | Monitor | Clock drift, resource pressure, temporary unavailability |

**Test artifact detection heuristics:**
- Test name contains: `invalid`, `error`, `fail`, `reject`, `unauthorized`, `forbidden`, `not_found`, `bad_request`, `timeout`
- Test asserts non-2xx status codes (4xx, 5xx)
- Test uses `pytest.raises`, `expect(...).rejects`, `assertThrows`, `should.throw`
- Errors correlate with test execution timestamps from regression test output
- Patterns matching `tests/manual/` scripts

**Error taxonomy per** `references/error_taxonomy.md` **(9 categories: CRASH, TIMEOUT, AUTH, DB, NETWORK, VALIDATION, CONFIG, RESOURCE, DEPRECATION).**

### Phase 3: Log Quality Assessment

**MANDATORY READ:** Load `references/error_taxonomy.md` (per-level criteria table + level correctness reference)

**Step 1: Detect configured log level.** Check in order:
1. `LOG_LEVEL` / `LOGLEVEL` env var (`.env`, `docker-compose.yml`, `infrastructure.md`)
2. Framework config: Python `logging.conf` / Django `LOGGING` / Node `LOG_LEVEL`
3. Default: assume `INFO` if not detected

Configured level determines WHICH levels appear in logs, but each level has its own noise threshold regardless.

**Step 2: Assess 6 quality dimensions:**

| Dimension | What to Check | Signal |
|-----------|---------------|--------|
| **Noisiness** | Per-level noise thresholds from `error_taxonomy.md` section 4: TRACE (zero in prod), DEBUG (>50% monopoly), INFO (>30%), WARNING (>1% of total), ERROR (>0.1% of total) | `NOISY: {level} template "{msg}" at {ratio}%` |
| **Completeness & Traceability** | Critical operations missing log entries + traceability gaps (see table below) | `MISSING: No log for {operation}` / `TRACEABILITY_GAP: {type} in {file}:{line}` |
| **Level correctness** | Per-level criteria from `error_taxonomy.md` section 4: content, anti-patterns, library rule | `WRONG_LEVEL: should be {level}` |
| **Structured logging** | Missing trace_id/request_id/user context; unstructured plaintext | `UNSTRUCTURED: lacks {field}` |
| **Sensitivity** | PII/secrets/tokens/passwords in log messages | `SENSITIVE: {type} exposure` |
| **Context richness** | Errors without actionable context (order_id, user_id, operation) | `LOW_CONTEXT: lacks context` |

**Traceability gap detection** — scan source code for operations without INFO-level logging:

| Operation Type | Expected Log | Where to Add |
|---------------|-------------|--------------|
| Incoming request handling | Request received + response status | Entry/exit of route handler |
| External API call | Request sent + response status + duration | Before/after HTTP client call |
| DB write (INSERT/UPDATE/DELETE) | Operation + affected entity + count | Before/after ORM/query call |
| Auth decision | Result (allow/deny) + reason | After auth check |
| State transition | Old state → new state + trigger | At transition point |
| Background job | Start + complete/fail + duration | Entry/exit of job handler |
| File/resource operation | Open/close + path + size | At I/O operation |

**Log Format Quality** (10-criterion checklist per `references/log_analysis_output_format.md`):

| # | Criterion | Check |
|---|-----------|-------|
| 1 | Dual format | JSON in prod, readable in dev |
| 2 | Timestamp | Consistent, timezone-aware |
| 3 | Level field | Present, uppercase |
| 4 | Trace/Correlation ID | Present in every entry, async-safe |
| 5 | Service name | Identifies source service |
| 6 | Source location | module:line + function |
| 7 | Extra context | Structured fields, not string interpolation |
| 8 | PII redaction | Passwords, API keys, emails handled |
| 9 | Noise suppression | Duplicate filters, third-party suppressed |
| 10 | Parseability | Dev: pipe-delimited; prod: valid JSON per line |

Score: passed criteria / 10.

### Phase 4: Stack Trace Mapping + Fix Recommendations

For each Real Bug:
1) Extract stack trace frames; identify origin frame (first frame in project code, not in node_modules/site-packages)
2) Map to source file:line
3) Generate fix recommendation: what to change, where, effort estimate (S/M/L)

**Prioritize using Sentry-inspired dimensions:**
- High-volume (occurrence count), Post-test regression (new errors), High-impact path (auth/payment/DB), Correlated traces (trace_id across services)

### Phase 5: Generate Report

**MANDATORY READ:** Load `references/log_analysis_output_format.md`

Output report to chat with header `## Test Log Analysis`. Include:
- Signals table (Real Bugs count, Test Artifacts filtered, Log Noise status, Log Format score, Log Quality score)
- Real Bugs table (priority, category, error, source, fix recommendation)
- Filtered table (category, count, examples)
- Log Quality Issues table (dimension, service, issue, recommendation)
- Noise Report table (count, ratio, service, level, template, action)
- Machine-readable block `<!-- LOG-ANALYSIS-DATA ... -->` for programmatic consumption

### Phase 6: Meta-Analysis

**MANDATORY READ:** Load `shared/references/meta_analysis_protocol.md`

Skill type: `execution-worker`. Run after all phases complete.

## Verdict Contribution

Quality coordinator normalization matrix component:

| Status | Maps To | Penalty |
|--------|---------|---------|
| CLEAN | -- | 0 |
| WARNINGS_ONLY | -- | 0 |
| REAL_BUGS_FOUND | FAIL | -20 |
| SKIPPED / NO_LOG_SOURCES | ignored | 0 |

Log quality/format issues are INFORMATIONAL — do not affect quality verdict. Only Real Bugs block.

## Critical Rules
- No status changes or task creation; report only.
- Test Artifacts and Expected Behavior are ALWAYS filtered — never count as bugs.
- Log quality issues are advisory — inform, don't block.
- Script must handle gracefully: no Docker, no log files, no Loki → `NO_LOG_SOURCES`.
- Language preservation in comments (EN/RU).

## Runtime Summary Artifact

**MANDATORY READ:** Load `shared/references/quality_summary_contract.md`

Accept optional `summaryArtifactPath`.

Summary kind:
- `quality-worker`

Required payload semantics:
- `worker = "ln-514"`
- `status`
- `verdict`
- `issues`
- `warnings`
- `artifact_path`

Write the summary to the provided artifact path or return the same envelope in structured output.

## Definition of Done

- [ ] Script deployed to target project `scripts/` (or already exists)
- [ ] Log source detected and script executed (or NO_LOG_SOURCES returned)
- [ ] Errors classified into 4 categories; Real Bugs identified
- [ ] Log quality assessed (6 dimensions + 10-criterion format checklist)
- [ ] Stack traces mapped to source files for Real Bugs
- [ ] Report output to chat with signals table + machine-readable block

## Reference Files
- **Error taxonomy:** `references/error_taxonomy.md`
- **Output format:** `references/log_analysis_output_format.md`
- **Analysis script:** `references/analyze_test_logs.mjs`

---
**Version:** 1.0.0
**Last Updated:** 2026-03-13

Related Skills

ln-782-test-runner

310
from levnikolaevich/claude-code-skills

Executes all test suites and reports results with coverage. Use when verifying that test infrastructure works after bootstrap.

ln-743-test-infrastructure

310
from levnikolaevich/claude-code-skills

Sets up test infrastructure with Vitest, xUnit, and pytest. Use when adding testing frameworks and sample tests to a project.

ln-641-pattern-analyzer

310
from levnikolaevich/claude-code-skills

Analyzes single pattern implementation, calculates compliance/completeness/quality scores, identifies gaps. Use when auditing a specific pattern.

ln-637-test-structure-auditor

310
from levnikolaevich/claude-code-skills

Checks test file organization, directory layout, test-to-source mapping, domain grouping, co-location. Use when auditing test structure.

ln-636-manual-test-auditor

310
from levnikolaevich/claude-code-skills

Checks manual test scripts for harness adoption, golden files, fail-fast, config sourcing, idempotency. Use when auditing manual test quality.

ln-635-test-isolation-auditor

310
from levnikolaevich/claude-code-skills

Checks test isolation (API/DB/FS/Time/Network), determinism, flaky tests, order-dependency, anti-patterns. Use when auditing test isolation.

ln-634-test-coverage-auditor

310
from levnikolaevich/claude-code-skills

Identifies missing tests for critical paths (money, security, data integrity, core flows). Use when auditing test coverage gaps.

ln-633-test-value-auditor

310
from levnikolaevich/claude-code-skills

Scores each test by Impact x Probability, returns KEEP/REVIEW/REMOVE decisions. Use when auditing test value and pruning low-value tests.

ln-632-test-e2e-priority-auditor

310
from levnikolaevich/claude-code-skills

Validates E2E coverage for critical paths (money, security, data integrity). Risk-based prioritization. Use when auditing E2E test coverage.

ln-631-test-business-logic-auditor

310
from levnikolaevich/claude-code-skills

Detects tests validating framework/library behavior instead of project code. Use when auditing test business logic focus.

ln-630-test-auditor

310
from levnikolaevich/claude-code-skills

Coordinates test suite audit across business logic, E2E coverage, value, isolation, manual quality, and structure. Use when auditing entire test suite.

ln-523-auto-test-planner

310
from levnikolaevich/claude-code-skills

Plans automated tests (E2E/Integration/Unit) using Risk-Based Testing after manual testing. Use when Story needs a test task with prioritized scenarios.