research-reporting

Create structured research notes from experiment runs and analysis artifacts. Use when creating a note at run launch, updating it as training/evaluation/loss stages finish, summarizing a finished run, comparing experiment outcomes, extracting hypotheses from eval/loss artifacts, or proposing next-run actions grounded in `.tracking/experiments/<id>/analysis/` outputs. This skill is about turning repo-native experiment evidence into stable, machine-readable markdown.

by ProfSynapse

Best use case

research-reporting is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using research-reporting should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/research-reporting/SKILL.md --create-dirs "https://raw.githubusercontent.com/ProfSynapse/Synaptic-Tuner/main/.agents/skills/research-reporting/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/research-reporting/SKILL.md inside your project.
3. Restart your AI agent. It will auto-discover the skill.

How research-reporting Compares

| Feature / Agent | research-reporting | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

SKILL.md Source

# Research Reporting

Generate compact research notes that are easy to read and easy to parse later.

## Use This Skill When

- The user wants a research note, experiment summary, post-run analysis, or structured markdown output.
- The source of truth is an experiment bundle under `.tracking/experiments/<id>/`.
- The output should include stable frontmatter and explicit evidence for claims.
- The note should be created early and updated through the lifecycle of one experiment.

## Default Workflow

1. Resolve the experiment id and open `.tracking/experiments/<id>/experiment.json`.
2. If `spec_path` is present, read the experiment spec so the note captures actual config numbers instead of only outcome artifacts.
3. Read primary analysis artifacts in this order:
   - `analysis/experiment_summary.json`
   - `analysis/next_run_candidates.json`
   - `analysis/hypothesis_context.json`
   - `analysis/run_matrix.csv`
4. Read failure slices only if you need representative examples:
   - `analysis/failure_slices/eval_failures.jsonl`
   - `analysis/failure_slices/high_loss_examples.jsonl`
5. Read stage lineage files when you need provenance, timing, commit, hardware, or cost details.
6. Write the note from `assets/research_note_template.md`.

Load [reference/artifact-map.md](reference/artifact-map.md) when you need to know which artifact supports which section.
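
The read order above can be sketched as a small loader. This is a minimal illustration, not part of the skill: `load_bundle` and the shape of the returned dict are hypothetical, and it assumes `experiment.json` is a JSON object whose optional `spec_path` key is a string.

```python
import json
from pathlib import Path

# Primary analysis artifacts, in the read order listed in the workflow.
PRIMARY_ARTIFACTS = [
    "analysis/experiment_summary.json",
    "analysis/next_run_candidates.json",
    "analysis/hypothesis_context.json",
    "analysis/run_matrix.csv",
]


def load_bundle(root: Path, experiment_id: str) -> dict:
    """Collect whichever artifacts exist; missing ones are recorded as None.

    Hypothetical helper: the returned structure is an assumption for
    illustration, not something the skill defines.
    """
    exp_dir = root / ".tracking" / "experiments" / experiment_id
    bundle = {"experiment": None, "spec_path": None, "artifacts": {}}

    manifest = exp_dir / "experiment.json"
    if manifest.exists():
        bundle["experiment"] = json.loads(manifest.read_text(encoding="utf-8"))
        # spec_path is optional; only record it when present.
        bundle["spec_path"] = bundle["experiment"].get("spec_path")

    for rel in PRIMARY_ARTIFACTS:
        path = exp_dir / rel
        if not path.exists():
            bundle["artifacts"][rel] = None
        elif path.suffix == ".json":
            bundle["artifacts"][rel] = json.loads(path.read_text(encoding="utf-8"))
        else:
            # Keep the CSV as raw text; parsing is left to the caller.
            bundle["artifacts"][rel] = path.read_text(encoding="utf-8")
    return bundle
```

Recording absent artifacts as `None` rather than raising keeps the loader usable for launch notes and in-flight runs, where most of the analysis bundle does not exist yet.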

## Lifecycle Modes

Use the same note template for all three modes:

1. Launch note:
   - Create the note as soon as the experiment is launched or selected.
   - Fill identity, config, and known runtime fields.
   - Leave future metrics and recommendation fields empty.
2. Stage update:
   - Re-open the same note after training, evaluation, loss, or analysis completes.
   - Update only the fields now supported by artifacts.
   - Preserve prior fields unless newer canonical artifacts supersede them.
3. Final note:
   - After analysis/recommendation, ensure the note contains the final status, observed outcomes, hypotheses, and next-run recommendation.

Default policy: one evolving note per experiment, not one note per stage.
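
One way to picture a stage update is as a merge over the note's frontmatter, where a `None` in the update means "no artifact supports this field yet" and so leaves the prior value alone. The sketch below is an assumption about how such a merge could work; `apply_stage_update`, the `experiment_id` and `pass_rate` keys, and the status strings are hypothetical, while `stage_statuses` and `metrics.summary` are named in the rules of this skill.

```python
def apply_stage_update(note: dict, update: dict) -> dict:
    """Return a new frontmatter dict with `update` merged over `note`.

    Hypothetical helper. None in `update` never overwrites a prior value;
    nested maps are merged key by key instead of being replaced wholesale.
    """
    merged = dict(note)
    for key, new_value in update.items():
        old_value = merged.get(key)
        if new_value is None:
            # Keep the key present (stable schema) but preserve any prior value.
            merged.setdefault(key, None)
            continue
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            merged[key] = apply_stage_update(old_value, new_value)
        else:
            merged[key] = new_value
    return merged


launch_note = {
    "experiment_id": "exp-001",
    "stage_statuses": {"training": "running", "evaluation": None},
    "metrics": {"summary": None},
}
after_training = apply_stage_update(
    launch_note,
    {
        "experiment_id": None,  # not restated by this stage; prior value kept
        "stage_statuses": {"training": "completed"},
        "metrics": {"summary": {"pass_rate": 0.5}},
    },
)
```

The merge returns a new dict and leaves `launch_note` untouched, which makes it easy to diff the note before and after a stage completes. It does not decide which source is more canonical; that judgment still has to be made before building the update.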

## Reporting Rules

- Keep frontmatter keys stable across notes. Prefer `null`, `[]`, or omitted sections over ad hoc placeholder prose.
- Keep the schema general: use grouped maps for metrics and config instead of adding one top-level key per experiment-specific number.
- Support partial completion. A note does not need all stages populated to be valid.
- Separate three things clearly:
  - observed: directly supported by artifacts
  - inferred: reasoned from observed evidence
  - proposed: next-run actions or hypotheses
- Prefer exact values from JSON for frontmatter. Round only in prose if readability improves.
- Preserve config numbers exactly as found in the spec or lineage artifacts. Do not normalize `1.0e-4` into prose-only text and do not drop unset knobs that matter to interpretation.
- Do not cite `experiment_summary.md` as the primary evidence source when the JSON exists.
- Do not invent comparisons, baselines, or causes. If a baseline run is missing, state that it is missing.
- When the analysis bundle includes a ranked recommendation, carry over:
  - `selected_candidate_rank`
  - `selected_candidate_confidence`
  - `recommended_next_action`
- If loss artifacts are absent or failed, keep loss fields `null` and note the missing stage in the body.
- If a run has custom metrics, place them under `metrics.summary` or `metrics.groups.<stage>` rather than forcing them into a fixed eval/loss schema.
- If a run has stage-specific knobs, place them under `config_snapshot.<stage>` rather than flattening them into root keys.
- When updating an existing note, overwrite fields only when the new source is more canonical or more complete than the prior one.
- For in-flight runs, prefer explicit `stage_statuses` and partial sections over vague prose like "still running."
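
To make the grouped-map rules concrete, here is a hedged sketch of a frontmatter skeleton. The real key set lives in `assets/research_note_template.md`, which is not reproduced here, so treat the overall layout, the `experiment_id` key, and the sample `learning_rate` and `schema_pass_rate` entries as assumptions. The keys `stage_statuses`, `config_snapshot`, `metrics.summary`, `metrics.groups`, `comparison_experiment_ids`, `sources`, and the three recommendation fields come from the rules in this document.

```python
import json


def empty_frontmatter(experiment_id: str) -> dict:
    """Build a skeleton with stable keys; unknown values stay None or empty.

    Hypothetical layout for illustration only.
    """
    return {
        "experiment_id": experiment_id,  # hypothetical key name
        "stage_statuses": {},
        "config_snapshot": {},  # one sub-map per stage
        "metrics": {"summary": {}, "groups": {}},
        "selected_candidate_rank": None,
        "selected_candidate_confidence": None,
        "recommended_next_action": None,
        "comparison_experiment_ids": [],
        "sources": [],
    }


note = empty_frontmatter("exp-001")
# Stage-specific knobs go under config_snapshot.<stage>, kept exactly as written
# in the spec (stored as a string here so "1.0e-4" is not reformatted).
note["config_snapshot"]["training"] = {"learning_rate": "1.0e-4"}
# Custom metrics go under metrics.groups.<stage> instead of new root keys.
note["metrics"]["groups"]["evaluation"] = {"schema_pass_rate": 0.92}

print(json.dumps(note, indent=2))
```

Because every note starts from the same skeleton, two notes always share the same top-level keys, and a missing loss stage shows up as an absent group or a `null` value rather than placeholder prose.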

## Note Shape

Use the template exactly once per note and keep these sections in this order:

1. `Summary`
2. `Run Context`
3. `Observed Results`
4. `Failure Analysis`
5. `Hypotheses`
6. `Next Run`
7. `Sources`

The frontmatter is for machine-readable indexing. The body is for human judgment and downstream review.
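
A quick structural check on the body can catch reordered or duplicated sections before a note is saved. The sketch below assumes each section is written as a level-2 markdown heading such as `## Summary`; the template may use a different heading level, and `section_order_ok` is a hypothetical helper.

```python
import re

EXPECTED_SECTIONS = [
    "Summary",
    "Run Context",
    "Observed Results",
    "Failure Analysis",
    "Hypotheses",
    "Next Run",
    "Sources",
]


def section_order_ok(body: str) -> bool:
    """Check that known sections appear at most once and in template order.

    Assumption: sections are level-2 headings ("## Name"). Omitted sections
    are allowed, since partial notes are valid.
    """
    found = re.findall(r"^## +(.+?)\s*$", body, flags=re.MULTILINE)
    known = [name for name in found if name in EXPECTED_SECTIONS]
    in_template_order = [name for name in EXPECTED_SECTIONS if name in known]
    return known == in_template_order
```

Headings outside the expected list are ignored rather than rejected, so the check stays a guard on ordering and does not police everything in the body.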

## Interpretation Heuristics

- Treat `experiment_summary.json` as the canonical top-level snapshot.
- Treat `experiment.json` plus the referenced experiment spec as the canonical source for config intent.
- Treat `next_run_candidates.json` as the canonical source for ranked recommendations and high-loss snapshot summaries.
- Use `hypothesis_context.json` when you need richer tag-level evidence or supporting context behind a recommendation.
- Use `run_matrix.csv` to confirm stage status rather than inferring completion from one artifact alone.
- If schema pass rate is materially higher than behavior pass rate, call out behavior reliability as the likely bottleneck instead of just "tool calling."
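
The last heuristic can be expressed as a simple gap check. The source only says "materially higher", so the `gap_threshold` default of 0.15 below is an arbitrary assumption, as are the function name and the returned label; tune or replace them to match how the project actually judges the gap.

```python
from typing import Optional


def likely_bottleneck(
    schema_pass_rate: float,
    behavior_pass_rate: float,
    gap_threshold: float = 0.15,  # assumed cutoff for "materially higher"
) -> Optional[str]:
    """Flag behavior reliability when schema passes far outpace behavior passes.

    Hypothetical helper: returns a label for the note body, or None when the
    gap is too small to support the claim.
    """
    if schema_pass_rate - behavior_pass_rate >= gap_threshold:
        return "behavior reliability"
    return None
```

For example, `likely_bottleneck(0.95, 0.60)` returns `"behavior reliability"`, while `likely_bottleneck(0.80, 0.78)` returns `None`. Any label produced this way is an inference and belongs under "inferred", not "observed".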

## Output Discipline

- Use short paragraphs and flat bullets.
- Name concrete failure families or tags when the artifacts support them.
- Include exact artifact paths in `sources`.
- If the user asks for a comparison note, keep the same template and populate `comparison_experiment_ids`.

## Bundled Resources

- Template: [assets/research_note_template.md](assets/research_note_template.md)
- Artifact guide: [reference/artifact-map.md](reference/artifact-map.md)

Related Skills

upload-deployment

from ProfSynapse/Synaptic-Tuner

Complete reference for model upload and deployment. Covers HuggingFace upload, save strategies (LoRA, merged 16-bit, merged 4-bit), GGUF conversion, model merging, model cards, and the full upload workflow. Use when uploading models, creating GGUF files, merging LoRA adapters, or deploying to HuggingFace. This skill is about USING the upload/deployment tools via CLI — never modifying source code.

synthetic-data-generation

from ProfSynapse/Synaptic-Tuner

Complete reference for the SynthChat synthetic dataset generation system. Covers CLI commands (generate, improve, validate), scenario YAML authoring, rubric YAML authoring, settings configuration, evaluation, and full workflow. Use when generating datasets, writing rubrics/scenarios, configuring models/workers, improving dataset quality, or running evaluations. This skill is about USING the system via CLI and YAML — never modifying source code.

fine-tuning

from ProfSynapse/Synaptic-Tuner

Complete reference for the fine-tuning pipeline (SFT, KTO, GRPO), cloud HF Jobs workflows, autonomous experiment search, checkpoint evaluation, and LoRA surgery. Covers training CLI flags, YAML configuration, model presets, dataset requirements, LoRA settings, training monitoring, hyperparameter search, and post-training optimization. Use when training models, configuring training runs, choosing hyperparameters, running cloud experiments, inspecting HF jobs, or troubleshooting training issues. This skill is about USING the training system via CLI and YAML — never modifying source code.

evaluation

from ProfSynapse/Synaptic-Tuner

Complete reference for the config-first model evaluation system. Covers the Evaluator CLI, assertion-driven YAML scenarios, response views, backend configuration, presets, scoring, LLM-as-judge, model comparison, and HuggingFace integration. Use when evaluating models, writing test prompts, comparing training runs, or interpreting eval results. This skill is about USING the evaluation system via CLI and YAML.

dataset-publishing

from ProfSynapse/Synaptic-Tuner

Publish local dataset artifacts to a Hugging Face dataset repo. Use when uploading a JSONL dataset, pushing a filtered dataset variant, syncing a matching .metadata.json sidecar, or renaming a dataset file in the target repo. This skill is about USING the checked-in dataset publish script via CLI — never ad hoc Python.

case-studies

from ProfSynapse/Synaptic-Tuner

End-to-end case studies showing how to implement the full training pipeline for different skill types. Covers three complete worked examples — tool-calling training, essay-style training, and agentic search (RAG agent) training — demonstrating dataset design, synthetic generation, validation, fine-tuning, evaluation, and iteration. Use when onboarding to the project, understanding how all components fit together, explaining the pipeline to others, or planning a new training capability. This skill is about UNDERSTANDING the system holistically — reference the other skills for specific CLI commands.
