aiml-validation-framework

AI/ML medical device validation skill implementing FDA's GMLP principles

509 stars

Best use case

aiml-validation-framework is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using aiml-validation-framework should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/aiml-validation-framework/SKILL.md --create-dirs "https://raw.githubusercontent.com/a5c-ai/babysitter/main/library/specializations/domains/science/biomedical-engineering/skills/aiml-validation-framework/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/aiml-validation-framework/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How aiml-validation-framework Compares

Feature / Agent	aiml-validation-framework	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

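It supports validation of AI/ML-enabled medical devices according to FDA Good Machine Learning Practice (GMLP) principles, covering data quality, model performance, and predetermined change control.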

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AI/ML Validation Framework Skill

## Purpose

The AI/ML Validation Framework Skill supports validation of AI/ML-enabled medical devices per FDA Good Machine Learning Practice (GMLP) principles, addressing data quality, model performance, and predetermined change control.

## Capabilities

- Training data quality assessment
- Ground truth labeling validation
- Model performance metrics calculation (AUC, sensitivity, specificity)
- Subgroup performance analysis
- Bias and fairness evaluation
- Predetermined change control plan (PCCP) templates
- Clinical validation study design
- Locked vs. adaptive algorithm documentation
- Model explainability documentation
- Performance monitoring planning
- Real-world performance tracking
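
As an illustration of the metrics named above, the following is a minimal sketch in plain Python, not the skill's own implementation. It assumes binary labels (1 = positive, 0 = negative); the function names `confusion_counts`, `binary_metrics`, and `auc` are hypothetical. AUC is computed here by pairwise comparison of positive and negative scores, which is simple but quadratic in the number of cases.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def safe_ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV from thresholded predictions."""
    c = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": safe_ratio(c["tp"], c["tp"] + c["fn"]),
        "specificity": safe_ratio(c["tn"], c["tn"] + c["fp"]),
        "PPV": safe_ratio(c["tp"], c["tp"] + c["fp"]),
        "NPV": safe_ratio(c["tn"], c["tn"] + c["fn"]),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs where the positive
    case scores higher; tied pairs count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A validation report would normally pair each point estimate with a confidence interval; that step is omitted from this sketch.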

## Usage Guidelines

### When to Use
- Validating AI/ML algorithms
- Assessing training data quality
- Planning clinical validation studies
- Preparing FDA AI/ML submissions

### Prerequisites
- Algorithm development complete
- Training/test datasets curated
- Ground truth established
- Intended use clearly defined

### Best Practices
- Document data management practices
- Validate on diverse populations
- Plan for performance monitoring
- Consider predetermined change control

## Process Integration

This skill integrates with the following processes:
- AI/ML Medical Device Development
- Software Verification and Validation
- Clinical Evaluation Report Development
- Post-Market Surveillance System Implementation

## Dependencies

- FDA AI/ML guidance
- GMLP principles
- Fairness toolkits (AIF360, Fairlearn)
- Statistical analysis tools
- Clinical study resources

## Configuration

```yaml
aiml-validation-framework:
  algorithm-types:
    - locked
    - adaptive
    - continuously-learning
  performance-metrics:
    - AUC
    - sensitivity
    - specificity
    - PPV
    - NPV
  subgroup-categories:
    - age
    - sex
    - race
    - disease-severity
```
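
One way the `subgroup-categories` above might drive a subgroup analysis is sketched below. This is an assumption about usage, not behavior documented by the skill: the record fields `label` and `prediction` and the function `subgroup_rates` are hypothetical, and each record is assumed to carry one key per configured category.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Optional

# Mirrors the subgroup-categories list in the configuration above.
SUBGROUP_CATEGORIES = ["age", "sex", "race", "disease-severity"]


def subgroup_rates(records: Iterable[Dict[str, Any]], category: str) -> Dict[str, Dict[str, Optional[float]]]:
    """Per-group case count, sensitivity, and specificity for one category.

    Each record is a dict with a binary 'label' and 'prediction' (hypothetical
    field names) plus one key per subgroup category. Records that lack the
    category are pooled under 'unknown'.
    """
    tallies = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for rec in records:
        t = tallies[rec.get(category, "unknown")]
        if rec["label"] == 1:
            t["tp" if rec["prediction"] == 1 else "fn"] += 1
        else:
            t["fp" if rec["prediction"] == 1 else "tn"] += 1

    result = {}
    for group, t in tallies.items():
        positives = t["tp"] + t["fn"]
        negatives = t["tn"] + t["fp"]
        result[group] = {
            "n": positives + negatives,
            # None marks a rate that cannot be computed for this group.
            "sensitivity": t["tp"] / positives if positives else None,
            "specificity": t["tn"] / negatives if negatives else None,
        }
    return result
```

Calling `subgroup_rates(records, category)` once per entry in `SUBGROUP_CATEGORIES` yields one table per category. Reporting `n` alongside each rate makes it visible when a subgroup is too small for its estimate to be meaningful.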

## Output Artifacts

- Data management documentation
- Algorithm description documents
- Performance reports
- Bias/fairness assessments
- PCCP documents
- Clinical validation protocols
- Monitoring plans
- FDA submission sections

## Quality Criteria

- Training data quality documented
- Ground truth methodology validated
- Performance meets clinical requirements
- Subgroup performance acceptable
- Bias assessments completed
- PCCP appropriate for algorithm type
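
The criterion "Performance meets clinical requirements" can be made checkable by comparing observed metrics against minimums fixed before the study. The sketch below assumes that approach; the function name `performance_shortfalls` and the threshold values in the usage example are hypothetical and are not taken from the skill or from FDA guidance.

```python
from typing import Dict, Optional, Tuple


def performance_shortfalls(
    observed: Dict[str, float],
    required: Dict[str, float],
) -> Dict[str, Tuple[Optional[float], float]]:
    """Return {metric: (observed value, required minimum)} for every metric
    that falls below its pre-specified minimum. A metric that was required
    but never reported counts as a shortfall, with None as its observed value.
    """
    shortfalls = {}
    for metric, minimum in required.items():
        value = observed.get(metric)
        if value is None or value < minimum:
            shortfalls[metric] = (value, minimum)
    return shortfalls


# Usage with illustrative (hypothetical) acceptance thresholds.
required = {"sensitivity": 0.90, "specificity": 0.85, "AUC": 0.92}
observed = {"sensitivity": 0.93, "specificity": 0.80}
failures = performance_shortfalls(observed, required)
```

The same check can be run per subgroup to address "Subgroup performance acceptable", using subgroup-specific minimums where the clinical requirements define them.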
