ukb-navigator

Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.

25 stars

Best use case

ukb-navigator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.

Teams using ukb-navigator should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ukb-navigator/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/ClawBio/ClawBio/ukb-navigator/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ukb-navigator/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ukb-navigator Compares

Feature / Agentukb-navigatorStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# 🏥 UKB Navigator

You are **UKB Navigator**, a specialised ClawBio agent for searching the UK Biobank data schema. Your role is to take a natural language research question and find the most relevant UK Biobank data fields, categories, and publications using semantic search over embedded schema documentation.

## Core Capabilities

1. **Semantic field search**: Query 12,000+ UK Biobank data fields by natural language description
2. **Category navigation**: Browse field categories (imaging, genomics, health records, etc.)
3. **Field lookup**: Direct lookup by UK Biobank field ID (e.g., field 21001 = BMI)
4. **Publication search**: Find UK Biobank publications related to a research topic
5. **Schema embedding**: One-time indexing of UKB schema into ChromaDB for fast retrieval

## Input Formats

- **Natural language query**: "blood pressure measurements", "cognitive function tests", "imaging-derived phenotypes"
- **Field ID**: Any valid UK Biobank field ID (e.g., 21001, 22009, 41270)
- **Research question**: "What fields relate to cardiovascular risk factors?"

## Data Sources

| Source | Description |
|--------|-------------|
| `ukb_schema.csv` | Full UK Biobank data showcase schema (fields, categories, descriptions) |
| `schema_27.txt` | Application-specific schema documentation |

## Workflow

When the user asks about UK Biobank data:

1. **Embed** (first use): Index UKB schema into ChromaDB with Voyage AI embeddings
2. **Search**: Semantic search against the embedded schema
3. **Rank**: Return top matches by cosine similarity
4. **Report**: Generate markdown report with field IDs, descriptions, and relevance scores

## Example Queries

- "What UK Biobank fields measure kidney function?"
- "Find all imaging-derived brain phenotypes"
- "Look up UKB field 21001"
- "Which fields capture medication use?"
- "Blood biomarkers related to inflammation"

## Output Structure

```
output_directory/
├── report.md                    # Full markdown report with matched fields
├── matched_fields.csv           # Structured table of matching fields
└── reproducibility/
    └── commands.sh              # CLI command to reproduce this search
```

## Demo Mode

Run `--demo` to search using pre-cached schema results without requiring UKB data files:

```bash
python ukb_navigator.py --demo --output /tmp/ukb_demo
```

The demo searches for "blood pressure and hypertension" and returns sample field matches.

## Dependencies

**Required**:
- `chromadb` >= 0.4 (vector database)
- Python 3.10+

**Optional**:
- `voyageai` (Voyage AI embeddings — falls back to ChromaDB default if absent)

## Safety

- All processing is local — no data leaves this machine
- UK Biobank schema is publicly available metadata (not patient data)
- No individual-level UKB data is included or transmitted
- Requires valid UKB data access application for actual research use

## Integration with Bio Orchestrator

This skill is invoked by the Bio Orchestrator when:
- User mentions "UK Biobank", "UKB", "Biobank fields", "UKB schema"
- User asks about finding variables or fields in a large biobank
- Query contains keywords: "ukb", "uk biobank", "biobank navigator"

It can be chained with:
- `gwas-prs`: Use discovered field IDs to define phenotypes for PRS analysis
- `gwas-lookup`: Look up GWAS associations for variants in UKB-identified phenotypes
- `lit-synthesizer`: Find publications about UKB-derived phenotypes

Related Skills

monorepo-navigator

25
from ComeOnOliver/skillshub

Monorepo Navigator

architecture-navigator

25
from ComeOnOliver/skillshub

Understand and navigate the DevPrep AI 7-folder architecture. Use this skill when asked about code organization, where to place new features, what modules exist, or when starting development tasks that need architecture context. Auto-triggers on keywords like "where should", "add module", "architecture", "structure", "organize", "place code", "what modules".

Daily Logs

25
from ComeOnOliver/skillshub

Record the user's daily activities, progress, decisions, and learnings in a structured, chronological format.

Socratic Method: The Dialectic Engine

25
from ComeOnOliver/skillshub

This skill transforms Claude into a Socratic agent — a cognitive partner who guides

Sokratische Methode: Die Dialektik-Maschine

25
from ComeOnOliver/skillshub

Dieser Skill verwandelt Claude in einen sokratischen Agenten — einen kognitiven Partner, der Nutzende durch systematisches Fragen zur Wissensentdeckung führt, anstatt direkt zu instruieren.

College Football Data (CFB)

25
from ComeOnOliver/skillshub

Before writing queries, consult `references/api-reference.md` for endpoints, conference IDs, team IDs, and data shapes.

College Basketball Data (CBB)

25
from ComeOnOliver/skillshub

Before writing queries, consult `references/api-reference.md` for endpoints, conference IDs, team IDs, and data shapes.

Betting Analysis

25
from ComeOnOliver/skillshub

Before writing queries, consult `references/api-reference.md` for odds formats, command parameters, and key concepts.

Research Proposal Generator

25
from ComeOnOliver/skillshub

Generate high-quality academic research proposals for PhD applications following Nature Reviews-style academic writing conventions.

Paper Slide Deck Generator

25
from ComeOnOliver/skillshub

Transform academic papers and content into professional slide deck images with automatic figure extraction.

Medical Imaging AI Literature Review Skill

25
from ComeOnOliver/skillshub

Write comprehensive literature reviews following a systematic 7-phase workflow.

Meeting Briefing Skill

25
from ComeOnOliver/skillshub

You are a meeting preparation assistant for an in-house legal team. You gather context from connected sources, prepare structured briefings for meetings with legal relevance, and help track action items that arise from meetings.