data-processor

Process and transform arrays of data with common operations like filtering, mapping, and aggregation

242 stars

Best use case

data-processor is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. It processes and transforms arrays of data with common operations like filtering, mapping, and aggregation.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "data-processor" skill to help with this workflow task. Context: Process and transform arrays of data with common operations like filtering, mapping, and aggregation

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

  • Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

  • Do not use this when you only need a one-off answer and do not need a reusable workflow.
  • Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/data-processor/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/artemisai/data-processor/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/data-processor/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How data-processor Compares

| Feature / Agent | data-processor | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Process and transform arrays of data with common operations like filtering, mapping, and aggregation

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Data Processor Skill

A general-purpose data processing skill for transforming arrays of objects. This skill demonstrates the token efficiency benefits of code execution - instead of describing transformations in natural language, write code once and reuse it.

## What This Skill Does

Processes arrays of data with common transformations:
- Filter records based on conditions
- Map fields to new values
- Aggregate data (sum, average, count, etc.)
- Sort and group data
- Remove duplicates
- Merge datasets
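
The implementation later in this document covers filter, map, sort, aggregate, and unique; grouping and merging are listed here but do not appear as operations in `processData`. A minimal sketch of how they could be written as standalone helpers follows. The names `groupBy` and `mergeBy`, their signatures, and the sample records are assumptions for illustration, not part of the skill.

```javascript
// Hypothetical helpers: not part of the skill's processData function.

/** Group records into a Map keyed by the value of `field`. */
function groupBy(records, field) {
  const groups = new Map();
  for (const record of records) {
    const key = record[field];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record);
  }
  return groups;
}

/**
 * Left-join `right` onto `left` by a shared key field.
 * Fields from `right` win on conflict; if `right` repeats a key, the last record wins.
 */
function mergeBy(left, right, key) {
  const index = new Map(right.map(r => [r[key], r]));
  return left.map(l => ({ ...l, ...(index.get(l[key]) || {}) }));
}

const orders = [
  { id: 1, customerId: 'a', total: 100 },
  { id: 2, customerId: 'b', total: 150 },
  { id: 3, customerId: 'a', total: 200 }
];
const customers = [
  { customerId: 'a', name: 'Jane' },
  { customerId: 'b', name: 'John' }
];

const byCustomer = groupBy(orders, 'customerId');
const merged = mergeBy(orders, customers, 'customerId');

console.log(byCustomer.get('a').length); // 2
console.log(merged[0]); // { id: 1, customerId: 'a', total: 100, name: 'Jane' }
```

A `Map` is used for the index and the groups so that keys keep their original type and lookups stay O(1) per record.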

## When to Use This Skill

Use this skill when you need to:
- Transform large datasets (hundreds or thousands of records)
- Apply consistent business logic to data
- Aggregate or summarize data
- Clean or normalize data
- Combine data from multiple sources

**Token Efficiency**: Processing 1000 records in code uses ~500 tokens. Describing the same operations in natural language would use ~50,000 tokens.

## Implementation

```javascript
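/**
 * Data Processor - General purpose data transformation
 *
 * Operations always run in a fixed order: filter, map, sort, aggregate, unique.
 * The key order of the `operations` object has no effect, and the aggregate is
 * computed before duplicates are removed.
 *
 * @param {Array} data - Array of objects to process
 * @param {Object} operations - Operations to apply (filter, map, sort, aggregate, unique)
 * @returns {Promise<{data: Array, stats: Object}>} Processed data and statistics
 */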
async function processData(data, operations = {}) {
  if (!Array.isArray(data)) {
    throw new Error('Data must be an array');
  }
  
  let result = [...data];
  const stats = {
    inputCount: data.length,
    operations: [],
  };
  
  // Filter operation
  if (operations.filter) {
    const beforeCount = result.length;
    result = result.filter(operations.filter);
    stats.operations.push({
      type: 'filter',
      recordsRemoved: beforeCount - result.length
    });
  }
  
  // Map operation (transform fields)
  if (operations.map) {
    result = result.map(operations.map);
    stats.operations.push({ type: 'map' });
  }
  
  // Sort operation
  if (operations.sort) {
    const { field, order = 'asc' } = operations.sort;
    result.sort((a, b) => {
      const aVal = a[field];
      const bVal = b[field];
      const comparison = aVal < bVal ? -1 : aVal > bVal ? 1 : 0;
      return order === 'asc' ? comparison : -comparison;
    });
    stats.operations.push({ type: 'sort', field, order });
  }
  
  // Aggregate operation
  if (operations.aggregate) {
    const { field, operation: aggOp } = operations.aggregate;
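    // Null and undefined values are skipped
    const values = result.map(r => r[field]).filter(v => v != null);
    const total = () => values.reduce((sum, v) => sum + v, 0);
    
    let aggregateResult;
    switch (aggOp) {
      case 'sum':
        aggregateResult = total();
        break;
      case 'average':
        // Return null instead of NaN (0 / 0) when there are no values
        aggregateResult = values.length > 0 ? total() / values.length : null;
        break;
      case 'count':
        aggregateResult = values.length;
        break;
      case 'min':
        // reduce avoids the argument limit that Math.min(...values) can hit on very
        // large arrays; null instead of Infinity when there are no values
        aggregateResult = values.length > 0 ? values.reduce((m, v) => (v < m ? v : m)) : null;
        break;
      case 'max':
        aggregateResult = values.length > 0 ? values.reduce((m, v) => (v > m ? v : m)) : null;
        break;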
      default:
        throw new Error(`Unknown aggregate operation: ${aggOp}`);
    }
    
    stats.aggregateResult = {
      field,
      operation: aggOp,
      value: aggregateResult
    };
  }
  
  // Remove duplicates
  if (operations.unique) {
    const { field } = operations.unique;
    const seen = new Set();
    const beforeCount = result.length;
    result = result.filter(item => {
      const key = item[field];
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
    stats.operations.push({
      type: 'unique',
      field,
      duplicatesRemoved: beforeCount - result.length
    });
  }
  
  stats.outputCount = result.length;
  
  return {
    data: result,
    stats
  };
}

module.exports = processData;
```

## Examples

### Example 1: Filter and Sort

```javascript
const processData = require('/skills/data-processor.js');

const salesData = [
  { id: 1, amount: 150, status: 'completed' },
  { id: 2, amount: 200, status: 'pending' },
  { id: 3, amount: 175, status: 'completed' },
  { id: 4, amount: 225, status: 'completed' }
];

const result = await processData(salesData, {
  filter: (record) => record.status === 'completed',
  sort: { field: 'amount', order: 'desc' }
});

console.log(result);
// Output:
// {
//   data: [
//     { id: 4, amount: 225, status: 'completed' },
//     { id: 3, amount: 175, status: 'completed' },
//     { id: 1, amount: 150, status: 'completed' }
//   ],
//   stats: {
//     inputCount: 4,
//     operations: [
//       { type: 'filter', recordsRemoved: 1 },
//       { type: 'sort', field: 'amount', order: 'desc' }
//     ],
//     outputCount: 3
//   }
// }
```

### Example 2: Aggregate Data

```javascript
const processData = require('/skills/data-processor.js');

const orders = [
  { orderId: 1, total: 100 },
  { orderId: 2, total: 150 },
  { orderId: 3, total: 200 }
];

const result = await processData(orders, {
  aggregate: { field: 'total', operation: 'sum' }
});

console.log(result.stats.aggregateResult);
// Output: { field: 'total', operation: 'sum', value: 450 }
```

### Example 3: Complex Transformation

```javascript
const processData = require('/skills/data-processor.js');

const customers = [
  { name: '  John Doe  ', email: 'JOHN@EXAMPLE.COM', age: 30 },
  { name: 'Jane Smith', email: 'jane@example.com', age: 25 },
  { name: '  John Doe  ', email: 'JOHN@EXAMPLE.COM', age: 30 } // duplicate
];

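// Operations run in the fixed order filter, map, sort, unique, regardless of
// key order here, so the filter sees the original (unmapped) records.
const result = await processData(customers, {
  filter: (customer) => customer.age >= 25,
  map: (customer) => ({
    name: customer.name.trim(),
    email: customer.email.toLowerCase(),
    age: customer.age
  }),
  sort: { field: 'age', order: 'asc' },
  unique: { field: 'email' }
});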

console.log(result.data);
// Output:
// [
//   { name: 'Jane Smith', email: 'jane@example.com', age: 25 },
//   { name: 'John Doe', email: 'john@example.com', age: 30 }
// ]
```

## Integration with MCP Tools

This skill works great in combination with MCP tools:

```javascript
// Fetch data from an MCP tool
const rawData = await callMCPTool('database__query', {
  // Single quotes delimit the date as a string literal in standard SQL
  query: "SELECT * FROM customers WHERE created_date > '2024-01-01'"
});

// Process with the skill
const processData = require('/skills/data-processor.js');
const result = await processData(rawData, {
  filter: (r) => r.status === 'active',
  sort: { field: 'revenue', order: 'desc' },
  aggregate: { field: 'revenue', operation: 'sum' }
});

// Save results
await callMCPTool('storage__save', {
  key: 'processed_customers',
  value: result.data
});

// Return summary to agent (not full data)
return {
  processedRecords: result.stats.outputCount,
  totalRevenue: result.stats.aggregateResult.value
};
```
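
`processData` throws if its input is not an array, and the shape of what `callMCPTool` returns is not specified above. A small guard before processing might look like the following sketch. The helper name `toRecordArray` and the candidate wrapper keys (`rows`, `data`) are assumptions, so adjust them to whatever your tool actually returns.

```javascript
// Hypothetical helper: coerce a tool response into an array of records.
function toRecordArray(response) {
  // Assumption: some tools return JSON text rather than parsed values.
  const value = typeof response === 'string' ? JSON.parse(response) : response;
  if (Array.isArray(value)) return value;

  // Assumed wrapper keys; replace with the keys your tool uses.
  for (const key of ['rows', 'data']) {
    if (value && Array.isArray(value[key])) return value[key];
  }
  throw new TypeError('Tool response does not contain an array of records');
}

console.log(toRecordArray('[{"id":1}]'));          // [ { id: 1 } ]
console.log(toRecordArray({ rows: [{ id: 2 }] })); // [ { id: 2 } ]
```

Failing loudly on an unrecognized shape keeps a malformed response from being silently processed as an empty dataset.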

## Tips and Best Practices

1. **Save Intermediate Results**: For large datasets, save to `/workspace` after each major operation
2. **Return Summaries**: Send statistics to the agent, not full datasets
3. **Chain Operations**: Combine multiple operations for complex transformations
4. **Validate Input**: Always check data types and handle edge cases
5. **Reuse This Skill**: Save to `/skills` and use across multiple tasks
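
Tips 1, 2, and 4 can be sketched together as follows. This is an illustration under stated assumptions rather than part of the skill: `validateRecords` and `saveCheckpoint` are hypothetical helper names, and a temporary directory stands in for `/workspace` so the example runs anywhere.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: split records into valid and invalid,
// requiring each listed field to hold a finite number.
function validateRecords(records, numericFields) {
  const valid = [];
  const invalid = [];
  for (const record of records) {
    const ok = record !== null && typeof record === 'object' &&
      numericFields.every(f => Number.isFinite(record[f]));
    if (ok) valid.push(record);
    else invalid.push(record);
  }
  return { valid, invalid };
}

// Hypothetical helper: write an intermediate result as JSON and return its path.
function saveCheckpoint(dir, name, records) {
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${name}.json`);
  fs.writeFileSync(file, JSON.stringify(records));
  return file;
}

const raw = [
  { id: 1, amount: 150 },
  { id: 2, amount: '200' }, // string, not a number
  { id: 3, amount: NaN },
  null
];
const { valid, invalid } = validateRecords(raw, ['amount']);

// Assumption: a temp directory stands in for /workspace.
const workspace = fs.mkdtempSync(path.join(os.tmpdir(), 'workspace-'));
const checkpoint = saveCheckpoint(workspace, 'validated', valid);

// Report a summary rather than the records themselves.
const summary = { validCount: valid.length, invalidCount: invalid.length, checkpoint };
console.log(summary.validCount, summary.invalidCount); // 1 3
```

Validating before aggregation matters here because the `sum` and `average` operations use `+`, which concatenates when a value is a string.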

## Related Skills

- `validator` - Validate data before processing
- `exporter` - Export processed data to various formats
- `aggregator` - Advanced statistical aggregations

## Performance Notes

This skill can process:
- 1,000 records: < 50ms
- 10,000 records: < 200ms
- 100,000 records: < 2s

All operations use efficient JavaScript array methods with O(n) or O(n log n) complexity.
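
The figures above depend on hardware, record shape, and which operations run, so it may be worth measuring on your own data. The sketch below times a filter-plus-sort pass over synthetic records with `process.hrtime.bigint()`. The `timeIt` and `makeRecords` helpers and the record shape are assumptions for illustration, and the inline filter and sort stand in for a call to `processData`.

```javascript
// Hypothetical helper: run fn once and report elapsed wall-clock milliseconds.
function timeIt(fn) {
  const start = process.hrtime.bigint();
  const value = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { value, elapsedMs };
}

// Synthetic records; a deterministic formula keeps runs comparable.
function makeRecords(n) {
  return Array.from({ length: n }, (_, i) => ({
    id: i,
    amount: (i * 7919) % 1000,
    status: i % 4 === 0 ? 'pending' : 'completed'
  }));
}

for (const n of [1000, 10000, 100000]) {
  const records = makeRecords(n);
  const { value, elapsedMs } = timeIt(() =>
    records
      .filter(r => r.status === 'completed')
      .sort((a, b) => b.amount - a.amount)
  );
  console.log(`${n} records -> ${value.length} kept in ${elapsedMs.toFixed(1)} ms`);
}
```

A single run is noisy, so repeating each size several times and taking the median would give a steadier number.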

---

**Inspired by**: The Anthropic skills pattern for token-efficient data processing. See [Code Execution with MCP](https://www.anthropic.com/engineering/code-execution-with-mcp) for the philosophy behind this approach.

Related Skills

article-list-processor

242
from aiskillstore/marketplace

Reads a Markdown file containing a list of articles, automatically fetches the original content, and generates viral marketing copy.

vector-database-engineer

242
from aiskillstore/marketplace

Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar

sqlmap-database-pentesting

242
from aiskillstore/marketplace

This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns...

sqlmap-database-penetration-testing

242
from aiskillstore/marketplace

This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns from a vulnerable database," or "perform automated database penetration testing." It provides comprehensive guidance for using SQLMap to detect and exploit SQL injection vulnerabilities.

gdpr-data-handling

242
from aiskillstore/marketplace

Implement GDPR-compliant data handling with consent management, data subject rights, and privacy by design. Use when building systems that process EU personal data, implementing privacy controls, or conducting GDPR compliance reviews.

datadog-automation

242
from aiskillstore/marketplace

Automate Datadog tasks via Rube MCP (Composio): query metrics, search logs, manage monitors/dashboards, create events and downtimes. Always search tools first for current schemas.

database-optimizer

242
from aiskillstore/marketplace

Expert database optimizer specializing in modern performance tuning, query optimization, and scalable architectures. Masters advanced indexing, N+1 resolution, multi-tier caching, partitioning strategies, and cloud database optimization. Handles complex query analysis, migration strategies, and performance monitoring. Use PROACTIVELY for database optimization, performance issues, or scalability challenges.

database-migrations-sql-migrations

242
from aiskillstore/marketplace

SQL database migrations with zero-downtime strategies for PostgreSQL, MySQL, SQL Server

database-migrations-migration-observability

242
from aiskillstore/marketplace

Migration monitoring, CDC, and observability infrastructure

database-design

242
from aiskillstore/marketplace

Database design principles and decision-making. Schema design, indexing strategy, ORM selection, serverless databases.

database-cloud-optimization-cost-optimize

242
from aiskillstore/marketplace

You are a cloud cost optimization expert specializing in reducing infrastructure expenses while maintaining performance and reliability. Analyze cloud spending, identify savings opportunities, and implement cost-effective architectures across AWS, Azure, and GCP.

database-architect

242
from aiskillstore/marketplace

Expert database architect specializing in data layer design from scratch, technology selection, schema modeling, and scalable database architectures. Masters SQL/NoSQL/TimeSeries database selection, normalization strategies, migration planning, and performance-first design. Handles both greenfield architectures and re-architecture of existing systems. Use PROACTIVELY for database architecture, technology selection, or data modeling decisions.