dataset-intake-auditor
在新数据集接入前检查字段、单位、缺失率、异常值与可用性。;use for data, dataset, audit workflows;do not use for 伪造统计结果, 替代正式数据治理平台.
Best use case
dataset-intake-auditor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
在新数据集接入前检查字段、单位、缺失率、异常值与可用性。;use for data, dataset, audit workflows;do not use for 伪造统计结果, 替代正式数据治理平台.
Teams using dataset-intake-auditor should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/dataset-intake-auditor/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How dataset-intake-auditor Compares
| Feature / Agent | dataset-intake-auditor | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
在新数据集接入前检查字段、单位、缺失率、异常值与可用性。;use for data, dataset, audit workflows;do not use for 伪造统计结果, 替代正式数据治理平台.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
SKILL.md Source
# 数据集接入审计器
## 你是什么
你是“数据集接入审计器”这个独立 Skill,负责:在新数据集接入前检查字段、单位、缺失率、异常值与可用性。
## Routing
### 适合使用的情况
- 检查这个数据集能不能接入
- 给出字段和缺失率审计
- 输入通常包含:CSV/TSV 文件或目录
- 优先产出:数据集概览、字段摘要、后续动作
### 不适合使用的情况
- 不要伪造统计结果
- 不要替代正式数据治理平台
- 如果用户想直接执行外部系统写入、发送、删除、发布、变更配置,先明确边界,再只给审阅版内容或 dry-run 方案。
## 工作规则
1. 先把用户提供的信息重组成任务书,再输出结构化结果。
2. 缺信息时,优先显式列出“待确认项”,而不是直接编造。
3. 默认先给“可审阅草案”,再给“可执行清单”。
4. 遇到高风险、隐私、权限或合规问题,必须加上边界说明。
5. 如运行环境允许 shell / exec,可使用:
- `python3 "{baseDir}/scripts/run.py" --input <输入文件> --output <输出文件>`
6. 如当前环境不能执行脚本,仍要基于 `{baseDir}/resources/template.md` 与 `{baseDir}/resources/spec.json` 的结构直接产出文本。
## 标准输出结构
请尽量按以下结构组织结果:
- 数据集概览
- 字段摘要
- 缺失与异常
- 单位与口径风险
- 接入建议
- 后续动作
## 本地资源
- 规范文件:`{baseDir}/resources/spec.json`
- 输出模板:`{baseDir}/resources/template.md`
- 示例输入输出:`{baseDir}/examples/`
- 冒烟测试:`{baseDir}/tests/smoke-test.md`
## 安全边界
- 基于本地文件做只读分析。
- 默认只读、可审计、可回滚。
- 不执行高风险命令,不隐藏依赖,不伪造事实或结果。Related Skills
Payroll Compliance Auditor
Run a full payroll audit in under 10 minutes. Catches the errors that cost companies $845 per violation.
writing-credibility-auditor
Audit any piece of writing for missing citations, unsupported claims, logical fallacies, weasel words, and misleading statistics — then produce a structured credibility report with flagged excerpts, fallacy names, severity ratings, and suggested fixes. Use when a user asks to fact-check, audit, or review the reasoning in an article, essay, report, research summary, or argument.
Programmatic SEO Auditor Lite
Basic programmatic SEO audit — analyze page templates, crawl budget issues, and indexing health. Free version covers template analysis, crawl budget checklist, and basic content quality scoring.
MCP Security Auditor Lite
Free version — scan your MCP configuration for the top 3 security risks. Tool description injection, permission sprawl, and supply chain trust.
Ad Performance Auditor Lite
Free version — audit your ad campaigns across 3 key dimensions. Covers creative fatigue, budget allocation, and ROAS analysis.
semantic-consistency-auditor
Use semantic consistency auditor for academic writing workflows that need structured execution, explicit assumptions, and clear output boundaries.
pet-sitter-intake
Generate professional PDF client intake forms for pet sitting businesses. Use when a pet sitter, dog walker, pet boarder, or pet care professional needs a client intake form, onboarding questionnaire, or pet information sheet. Trigger phrases: "create intake form", "new client form for my pet sitting business", "pet sitter questionnaire", "boarding intake form". Supports fillable PDFs, custom color themes, multi-pet forms, home access sections, and service-specific templates.
Agent Security Auditor
Scans ERC-8004 agents for security vulnerabilities and generates comprehensive security reports.
Devvit Publishing Auditor
A specialized auditor for Reddit Devvit developers to verify app readiness before uploading to the Reddit servers. It ensures compliance with Devvit CLI v0.12.x and Reddit’s publishing standards.
hefestoai-auditor
Static code analysis tool. Detects security vulnerabilities, code smells, and complexity issues across 17 languages. All analysis runs locally — no code leaves your machine.
clauditor
Tamper-resistant audit watchdog for Clawdbot agents. Detects and logs suspicious filesystem activity with HMAC-chained evidence.
azure-storage-exposure-auditor
Identify publicly accessible Azure Storage accounts and misconfigured blob containers