autoresearch-genealogy

Structured prompts, vault templates, and autonomous research workflows for AI-assisted genealogy using Claude Code.

3,817 stars

Best use case

autoresearch-genealogy is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using autoresearch-genealogy should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/autoresearch-genealogy/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/adisinghstudent/autoresearch-genealogy/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/autoresearch-genealogy/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How autoresearch-genealogy Compares

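| Feature / Agent | autoresearch-genealogy | Standard Approach |
|-----------------|------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |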

SKILL.md Source

# autoresearch-genealogy

> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.

A structured system of autoresearch prompts, Obsidian vault templates, archive guides, and methodology references for AI-assisted genealogy research. Built for Claude Code's autonomous research loops, adaptable to any AI tool or manual workflow.

---

## What This Project Does

- Provides 12 Claude Code `/autoresearch` prompts that autonomously search the web, update your vault, and self-verify results
- Supplies a complete 19-file Obsidian vault starter kit with YAML frontmatter and markdown templates
- Includes 24 country/region-specific archive guides (Europe, Americas, Oceania, Jewish genealogy)
- Offers 9 methodology reference documents covering confidence tiers, DNA guardrails, naming conventions, and source hierarchy
- Defines 7 step-by-step workflows for OCR pipelines, oral history, discrepancy resolution, and phase planning

---

## Installation

```bash
# Clone the repository
git clone https://github.com/mattprusak/autoresearch-genealogy.git
cd autoresearch-genealogy

# Copy vault template into your Obsidian vault
cp -r vault-template/ ~/path/to/your/ObsidianVault/genealogy/

# Or copy to any markdown editor folder
cp -r vault-template/ ~/Documents/my-genealogy/
```

No package manager or build step required — this is a pure markdown/prompt project.

---

## Project Structure

```
autoresearch-genealogy/
├── prompts/              # 12 autoresearch prompt files for Claude Code
├── vault-template/       # 19-file Obsidian vault starter kit
│   ├── Family_Tree.md
│   ├── Research_Log.md
│   ├── Open_Questions.md
│   ├── templates/        # Person, certificate, postcard, region, etc.
│   └── ...
├── archives/             # 24 country/region research guides
├── reference/            # 9 methodology documents
├── workflows/            # 7 step-by-step process guides
└── examples/             # 6 anonymized worked examples
```

---

## Quick Start Workflow

### Step 1: Seed your family tree

Open `vault-template/Family_Tree.md` and fill in what you already know, starting with yourself and working backward:

```markdown
---
title: Family Tree
last_updated: 2026-03-19
generations_documented: 3
lines_active: 2
---

# Family Tree

## Generation 1 (Self)
- **Name**: Jane Smith (b. 1985, Chicago, IL)

## Generation 2 (Parents)
- **Father**: John Smith (b. 1955, Detroit, MI)
- **Mother**: Mary O'Brien (b. 1958, Boston, MA)

## Generation 3 (Grandparents)
- **Paternal Grandfather**: Robert Smith (b. ~1920, unknown)
- **Paternal Grandmother**: Helen Kowalski (b. ~1925, Poland?)
```

### Step 2: Scan physical documents

Photograph or scan certificates, letters, and postcards. Then follow the OCR workflow:

```
See: workflows/ocr-pipeline.md
```

### Step 3: Run autoresearch prompts in Claude Code

```
/autoresearch prompts/01-tree-expansion.md
```

### Step 4: Audit and verify

```
/autoresearch prompts/02-cross-reference-audit.md
```

---

## Autoresearch Prompts — Reference

Each prompt in `prompts/` follows this structure:

```markdown
## Goal
[What this iteration should accomplish]

## Metric
[Measurable success condition — e.g., "increase sourced person files from N to N+10"]

## Direction
[Step-by-step instructions for the AI]

## Verify
[Cross-check to run after each iteration]

## Guard Rails
[What NOT to do — prevent hallucination, preserve source rigor]

## Iterations
[How many loops to run before stopping for human review]

## Protocol
[Output format, file naming, YAML fields to populate]
```
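
Before running a custom prompt, it can help to confirm that it has all seven sections. The following is a minimal sketch, assuming the sections appear as level-2 markdown headings exactly as shown above. The helper name `missing_sections` is hypothetical and not part of the repository.

```python
import re

# Section headings taken from the prompt structure shown above.
REQUIRED_SECTIONS = [
    "Goal", "Metric", "Direction", "Verify",
    "Guard Rails", "Iterations", "Protocol",
]


def missing_sections(prompt_text: str) -> list[str]:
    """Return the required level-2 headings that are absent from a prompt."""
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^##\s+(.+?)\s*$", prompt_text, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


if __name__ == "__main__":
    draft = "## Goal\nExtend the Kowalski line.\n\n## Metric\nN to N+10\n"
    # Lists the sections the draft still needs.
    print(missing_sections(draft))
```

A prompt that lacks `Guard Rails` or `Iterations` is probably worth fixing before it is handed to an autonomous loop.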

### All 12 Prompts

| File | Purpose |
|------|---------|
| `01-tree-expansion.md` | Push every branch back using web research |
| `02-cross-reference-audit.md` | Find and fix discrepancies between tree and sources |
| `03-findagrave-sweep.md` | Locate Find a Grave memorials for deceased ancestors |
| `04-gedcom-completeness.md` | Sync GEDCOM file with vault data |
| `05-source-citation-audit.md` | Verify every person has ≥2 independent sources |
| `06-unresolved-persons.md` | Identify and resolve unnamed people in documents |
| `07-timeline-gap-analysis.md` | Find life events where records should exist but don't |
| `08-open-question-resolution.md` | Systematically attack every open research question |
| `09-bygdebok-extraction.md` | Extract data from digitized local history books |
| `10-colonial-records-search.md` | Search pre-1800 colonial American records |
| `11-immigration-search.md` | Locate passenger manifests and naturalization records |
| `12-dna-chromosome-analysis.md` | Analyze per-chromosome ancestry data |

### Running a prompt in Claude Code

```bash
# In Claude Code terminal or chat:
/autoresearch prompts/08-open-question-resolution.md

# With a specific vault path context:
/autoresearch prompts/03-findagrave-sweep.md --context vault-template/Family_Tree.md
```

---

## Vault Template Files

### Person file template (`vault-template/templates/person.md`)

```markdown
---
full_name: ""
birth_date: ""
birth_place: ""
death_date: ""
death_place: ""
father: ""
mother: ""
spouse: ""
children: []
confidence: "Moderate Signal"  # Strong Signal | Moderate Signal | Speculative
sources: []
open_questions: []
last_updated: ""
---

# [Full Name]

## Life Events

| Event | Date | Place | Source |
|-------|------|-------|--------|
| Birth | | | |
| Marriage | | | |
| Death | | | |

## Sources

1. [Source 1 — type, repository, date accessed]
2. [Source 2 — type, repository, date accessed]

## Open Questions

- [ ] Question 1
- [ ] Question 2

## Notes

[Narrative summary, naming variant notes, contextual history]
```
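
To audit person files outside Obsidian, the frontmatter can be read with a few lines of code. This is a rough sketch that handles only the flat `key: value` subset used in the template above (quoted strings, `[]`, and trailing `# comments`). It is not a general YAML parser, and the function names are hypothetical.

```python
def read_frontmatter(text: str) -> dict[str, object]:
    """Parse the flat `key: value` frontmatter used by the person template.

    Assumption: values never contain the sequence " #" inside quotes,
    and nested or multi-line values are skipped rather than parsed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, object] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        if ":" not in line or line.startswith((" ", "-")):
            continue  # nested keys and list items are out of scope here
        key, _, raw = line.partition(":")
        value = raw.split(" #", 1)[0].strip()  # drop a trailing comment
        fields[key.strip()] = [] if value == "[]" else value.strip('"')
    return fields


def empty_fields(fields: dict[str, object]) -> list[str]:
    """List the keys whose value is still an empty string or empty list."""
    return [key for key, value in fields.items() if value in ("", [])]
```

Running `empty_fields` over each person file gives a quick list of what still needs research. For anything beyond this flat layout, a real YAML library is the safer choice.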

### Certificate transcription template (`vault-template/templates/certificate.md`)

```markdown
---
document_type: ""        # birth | death | marriage | baptism
document_date: ""
repository: ""
file_reference: ""
transcribed_by: ""
transcription_date: ""
confidence: ""
---

# Certificate: [Type] — [Name] — [Year]

## Transcription

[Verbatim transcription of the document]

## Key Data Extracted

- **Subject**: 
- **Date**: 
- **Place**: 
- **Witnesses/Informants**: 
- **Officiant**: 

## Discrepancies

[Note any conflicts with other sources]

## Image

![[filename.jpg]]
```

### Research log entry pattern (`vault-template/Research_Log.md`)

```markdown
## 2026-03-19 — Tree Expansion Session

**Prompt run**: 01-tree-expansion.md  
**Iterations**: 5  
**Metric start**: 42 sourced person files  
**Metric end**: 51 sourced person files  

### Searches Performed
- FamilySearch: "Kowalski Poznan 1880–1920" — 3 results, 2 useful
- Ancestry: "Smith Michigan census 1920" — found Robert Smith (b. 1919)
- FindAGrave: "Helen Kowalski Detroit" — memorial #12345678

### Negative Results (Important)
- No passenger manifest found for Stanislaw Kowalski, searched 1890–1910
- No church records found for O'Brien line in Cork pre-1850

### New Open Questions
- [ ] Was Robert Smith born in Michigan or Ohio? 1920 census says MI, 1930 says OH.
```

---

## Confidence Tier System

From `reference/confidence-tiers.md`:

```
Strong Signal    — Two or more independent primary sources agree
Moderate Signal  — One primary source, or two secondary sources agree
Speculative      — Logical inference, DNA suggestion, or single secondary source
```

Apply confidence in every person file YAML:

```markdown
---
confidence: "Moderate Signal"
---
```
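
The tier definitions above can be written as a small decision function. This is a sketch under one assumption: the researcher has already counted how many independent primary and secondary sources agree on the claim. The function name and parameters are illustrative and do not come from `reference/confidence-tiers.md`.

```python
def confidence_tier(primary_agreeing: int, secondary_agreeing: int) -> str:
    """Map counts of agreeing, independent sources to a confidence tier.

    Judging whether sources are independent and in agreement is left
    to the researcher. This function only applies the thresholds.
    """
    if primary_agreeing >= 2:
        return "Strong Signal"
    if primary_agreeing == 1 or secondary_agreeing >= 2:
        return "Moderate Signal"
    # Inference, DNA suggestion, or a single secondary source.
    return "Speculative"
```

For example, a birth year backed by one civil record and nothing else would come out as `"Moderate Signal"`.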

---

## Archive Guides — Key Countries

Each guide in `archives/` covers:
- Where to find records (free vs paid)
- What AI tools can access directly vs what requires a browser
- Record types available by era

```
archives/
├── ireland.md
├── england-wales.md
├── scotland.md
├── norway.md
├── sweden.md
├── poland.md
├── germany.md
├── italy.md
├── france.md
├── spain-portugal.md
├── netherlands.md
├── austria.md
├── hungary.md
├── russia-ukraine.md
├── usa-colonial.md
├── usa-immigration.md
├── usa-census.md
├── usa-vital-records.md
├── african-american.md
├── canada.md
├── mexico-latin-america.md
├── australia-new-zealand.md
├── jewish-genealogy.md
└── ...
```

Example usage in a prompt:

```markdown
# In prompts/09-bygdebok-extraction.md
## Direction
Consult archives/norway.md for Digitalarkivet access patterns.
Search Bygdebok collections for the Rogaland region, 1750–1900.
```

---

## Common Patterns

### Pattern 1: New ancestor intake

When a new ancestor is found during research:

```markdown
1. Create person file from vault-template/templates/person.md
2. Set confidence based on source count
3. Add to Family_Tree.md under correct generation
4. Log the discovery in Research_Log.md
5. Add unresolved questions to Open_Questions.md
6. Run 02-cross-reference-audit.md to check for conflicts
```
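
Step 1 of the intake list can be scripted if you create many person files. The sketch below fills in a copy of the person template held as a string. It assumes the template contains `full_name:` and `confidence:` lines and a `# [Full Name]` heading, as in the template shown earlier. The helper `render_person_file` is hypothetical.

```python
import re


def render_person_file(template: str, full_name: str, confidence: str) -> str:
    """Fill the name and confidence fields of a person template string."""

    def set_field(text: str, key: str, value: str) -> str:
        # Replace the first `key: ...` line. A function replacement avoids
        # backslash surprises if the value contains special characters.
        pattern = rf"^{re.escape(key)}:.*$"
        return re.sub(
            pattern,
            lambda _match: f'{key}: "{value}"',
            text,
            count=1,
            flags=re.MULTILINE,
        )

    filled = set_field(template, "full_name", full_name)
    filled = set_field(filled, "confidence", confidence)
    return filled.replace("# [Full Name]", f"# {full_name}", 1)
```

Rewriting the `confidence:` line also drops the inline comment that lists the three tiers. The remaining steps (tree, log, open questions, audit) stay manual or prompt-driven.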

### Pattern 2: Resolving a record discrepancy

```markdown
# Open_Questions.md entry
## Q-042: Robert Smith birth state conflict
- 1920 census: born Michigan
- 1930 census: born Ohio
- Status: Unresolved
- Next step: Run 07-timeline-gap-analysis.md targeting Robert Smith
```

Then in Claude Code:

```
/autoresearch prompts/07-timeline-gap-analysis.md
# Focus: Robert Smith, b. ~1919, discrepancy Q-042
```
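
Conflicts like Q-042 can be surfaced mechanically once claims are written down as (source, field, value) triples. The following is an illustrative sketch. The triple layout and the function name `find_conflicts` are assumptions, not a format the repository defines.

```python
from collections import defaultdict


def find_conflicts(claims):
    """Return the fields on which sources disagree.

    `claims` is an iterable of (source, field, value) tuples.
    The result maps each conflicting field to {source: value}.
    """
    by_field = defaultdict(dict)
    for source, field, value in claims:
        by_field[field][source] = value
    return {
        field: sources
        for field, sources in by_field.items()
        if len(set(sources.values())) > 1
    }


if __name__ == "__main__":
    # The birth_place rows mirror Q-042. The birth_year rows are made-up filler.
    claims = [
        ("1920 census", "birth_place", "Michigan"),
        ("1930 census", "birth_place", "Ohio"),
        ("1920 census", "birth_year", "1919"),
        ("1930 census", "birth_year", "1919"),
    ]
    print(find_conflicts(claims))
```

Each field returned is a candidate entry for `Open_Questions.md`.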

### Pattern 3: DNA-to-genealogy mapping

```markdown
# In vault-template/Genetic_Profile.md
---
test_company: AncestryDNA
test_date: 2024-11-01
ethnicity_summary:
  - region: Eastern Europe
    percentage: 38
  - region: Ireland/Scotland
    percentage: 31
---

# Then run:
/autoresearch prompts/12-dna-chromosome-analysis.md
```

### Pattern 4: Immigration research loop

```bash
# Run immigration search prompt
/autoresearch prompts/11-immigration-search.md

# Prompt will:
# 1. Pull all foreign-born ancestors from Family_Tree.md
# 2. Search passenger manifests (Ellis Island, Ancestry, FamilySearch)
# 3. Search naturalization records (NARA, Ancestry)
# 4. Update person files with ship name, arrival date, port
# 5. Log negative results for each unresolved ancestor
```

---

## Reference Documents

| File | Contents |
|------|---------|
| `reference/confidence-tiers.md` | Strong / Moderate / Speculative definitions |
| `reference/source-hierarchy.md` | Primary vs secondary vs derivative sources |
| `reference/dna-guardrails.md` | What DNA can and cannot prove; centimorgan thresholds |
| `reference/naming-conventions.md` | Patronymics, farm names, Polish przydomki |
| `reference/gedcom-guide.md` | GEDCOM field reference and export instructions |
| `reference/common-pitfalls.md` | AI hallucination patterns in genealogy, date traps |
| `reference/glossary.md` | Record type definitions, Latin terms, abbreviations |
| `reference/ai-capabilities.md` | What AI can access directly vs what requires a human |
| `reference/case-for-autoresearch.md` | Methodology rationale |

---

## Troubleshooting

### AI is inventing sources

Set guard rails explicitly in your prompt session:

```markdown
## Guard Rails (add to any prompt)
- Do NOT fabricate census record URLs or Ancestry record IDs
- If a source cannot be directly linked, mark as "reported" not "confirmed"
- All new claims require a real URL or repository reference
- When uncertain, add to Open_Questions.md — do not guess
```

### Vault files getting out of sync with GEDCOM

Run the completeness audit:

```
/autoresearch prompts/04-gedcom-completeness.md
```

This compares every person in your GEDCOM against vault person files and flags mismatches.
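
For a quick local check before running the prompt, the same comparison can be approximated by name. This is a minimal sketch, assuming a standard GEDCOM layout in which each individual is a level-0 `INDI` record with a `1 NAME Given /Surname/` line. It matches on exact name strings only, and the function names are hypothetical.

```python
def gedcom_names(gedcom_text: str) -> set[str]:
    """Collect names from `1 NAME` lines that belong to INDI records."""
    names: set[str] = set()
    in_individual = False
    for line in gedcom_text.splitlines():
        parts = line.strip().split(" ", 2)
        if parts[0] == "0":
            # Level-0 lines start a new record, e.g. "0 @I1@ INDI".
            in_individual = len(parts) == 3 and parts[2].strip() == "INDI"
        elif in_individual and len(parts) == 3 and parts[:2] == ["1", "NAME"]:
            # GEDCOM wraps the surname in slashes: "Robert /Smith/".
            names.add(" ".join(parts[2].replace("/", " ").split()))
    return names


def compare_gedcom_to_vault(gedcom_text: str, vault_names: list[str]) -> dict[str, list[str]]:
    """Report names present on one side but not the other."""
    in_gedcom = gedcom_names(gedcom_text)
    in_vault = set(vault_names)
    return {
        "missing_from_vault": sorted(in_gedcom - in_vault),
        "missing_from_gedcom": sorted(in_vault - in_gedcom),
    }
```

Exact string matching will miss spelling variants, so treat the output as a list of candidates to review.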

### Name variants causing duplicate person files

Check `reference/naming-conventions.md` for your family's relevant region. Common traps:

- Norwegian farm name changes (Haugen → Bakke on emigration)
- Polish name Latinization in church records (Stanisław → Stanislaus)
- Irish anglicization (Ó Briain → O'Brien → Bryan)
- Spelling variation in census records ("Sakkarias" vs "Zacharias" — both valid)

Add aliases to person file YAML:

```markdown
---
full_name: "Stanisław Kowalski"
name_variants:
  - "Stanislaus Kowalski"
  - "Stanley Kowalski"
  - "S. Kowalski"
---
```

### Autoresearch loop running too long

Each prompt has an `## Iterations` field. Set it explicitly:

```markdown
## Iterations
Run 3 iterations maximum, then stop and output a summary for human review.
```

### OCR producing poor results on old documents

See `workflows/ocr-pipeline.md`. General guidance:

1. Scan or photograph at 600 DPI (or equivalent resolution) minimum
2. Use even, diffuse lighting — no flash
3. Pre-process with a contrast adjustment before running OCR
4. Use `vault-template/templates/transcription.md` to record both the OCR output and your manual corrections side by side

---

## Contributing

To add a new archive guide or prompt:

1. Follow the existing file structure and YAML frontmatter patterns
2. Use placeholder names in all examples (no real family data)
3. Open a PR with a brief description of what region or record type you've added

License: MIT
