File Upload Processor

Best use case

File Upload Processor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using File Upload Processor should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/file-upload-processor/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/file-upload-processor/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/file-upload-processor/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How File Upload Processor Compares

| Feature / Agent | File Upload Processor | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

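It builds secure file upload endpoints for web applications: multipart form uploads, presigned URL generation for large files, file type validation via magic bytes, size limits, cloud storage integration (S3, GCS, R2), and upload status tracking.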

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# File Upload Processor

## Overview

Builds secure file upload endpoints for web applications. Handles multipart form uploads, presigned URL generation for large files, file type validation via magic bytes (not just extensions), size limits, cloud storage integration (S3, GCS, R2), and upload status tracking. Produces production-ready code with streaming (no temp files on disk for small files).

## Instructions

### 1. Choose Upload Strategy

Based on file size:

- **Small files (< 10MB)**: Stream through server to storage. Simple, one request.
- **Medium files (10-100MB)**: Server-side streaming with progress tracking.
- **Large files (> 100MB)**: Presigned multipart upload — client uploads directly to S3.
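
The size thresholds above can be captured in a small selector function. This is a minimal sketch: the names `UploadStrategy` and `chooseUploadStrategy` are hypothetical, and it assumes "MB" means 1024 × 1024 bytes, which the skill does not specify.

```typescript
// Hypothetical names; the thresholds mirror the three tiers listed above.
type UploadStrategy = 'direct-stream' | 'stream-with-progress' | 'presigned-multipart';

// Assumption: 1 MB = 1024 * 1024 bytes.
const MB = 1024 * 1024;

function chooseUploadStrategy(sizeBytes: number): UploadStrategy {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError(`invalid file size: ${sizeBytes}`);
  }
  if (sizeBytes < 10 * MB) return 'direct-stream';          // small: one request through the server
  if (sizeBytes <= 100 * MB) return 'stream-with-progress'; // medium: server-side streaming
  return 'presigned-multipart';                             // large: client uploads directly to storage
}
```

A caller would typically pass the size declared by the client and still enforce hard limits while streaming, since the declared size may be wrong.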

### 2. File Validation

Always validate by magic bytes, never trust file extensions:

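```typescript
// Signatures keyed by MIME type; `offset` is where the bytes must appear.
type Signature = { offset: number; bytes: number[] };

const MAGIC_BYTES: Record<string, Signature[]> = {
  'image/jpeg': [{ offset: 0, bytes: [0xFF, 0xD8, 0xFF] }],
  'image/png': [{ offset: 0, bytes: [0x89, 0x50, 0x4E, 0x47] }],
  // "RIFF" at offset 0 plus "WEBP" at offset 8
  'image/webp': [
    { offset: 0, bytes: [0x52, 0x49, 0x46, 0x46] },
    { offset: 8, bytes: [0x57, 0x45, 0x42, 0x50] },
  ],
  'application/pdf': [{ offset: 0, bytes: [0x25, 0x50, 0x44, 0x46] }],
  // "ftyp" at offset 4
  'video/mp4': [{ offset: 4, bytes: [0x66, 0x74, 0x79, 0x70] }],
  'video/webm': [{ offset: 0, bytes: [0x1A, 0x45, 0xDF, 0xA3] }], // EBML header, shared with Matroska
};

function detectFileType(buffer: Buffer): string | null {
  // Only the first 12 bytes are needed for the signatures above.
  const head = buffer.subarray(0, 12);
  for (const [mime, signatures] of Object.entries(MAGIC_BYTES)) {
    const matches = signatures.every(({ offset, bytes }) =>
      bytes.every((b, i) => head[offset + i] === b));
    if (matches) return mime;
  }
  return null; // Unknown type
}
```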

Additional validation:
- Check file size BEFORE reading the full body (Content-Length header)
- Set hard limits on multer/busboy to abort oversized uploads
- Scan for double extensions: `image.jpg.exe`
- Reject files with null bytes in filename
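
The filename and size checks in this list can be sketched as two pure functions. The names `checkFilename` and `rejectByContentLength` are hypothetical, and the set of dangerous extensions is an illustrative assumption, not a list taken from the skill.

```typescript
interface FilenameCheck {
  ok: boolean;
  reason?: string;
}

// Assumption: an illustrative, incomplete list of extensions treated as executable.
const EXECUTABLE_EXTENSIONS = new Set(['exe', 'bat', 'cmd', 'com', 'scr', 'sh', 'php', 'js']);

function checkFilename(name: string): FilenameCheck {
  if (name.includes('\u0000')) {
    return { ok: false, reason: 'null byte in filename' };
  }
  // Everything after the first dot-separated segment counts as an extension.
  const extensions = name.toLowerCase().split('.').slice(1);
  if (extensions.length > 1 && extensions.some((ext) => EXECUTABLE_EXTENSIONS.has(ext))) {
    return { ok: false, reason: 'double extension with an executable type' };
  }
  return { ok: true };
}

// Early rejection based on the Content-Length header value, before reading the body.
// A missing header returns false here; the streaming parser's hard limit must still apply.
function rejectByContentLength(contentLength: string | undefined, maxBytes: number): boolean {
  if (contentLength === undefined) return false;
  const declared = Number(contentLength);
  return !Number.isFinite(declared) || declared < 0 || declared > maxBytes;
}
```

The header check is only a fast path: a client can omit or misstate Content-Length, which is why the list also calls for hard limits in the parser itself.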

### 3. Storage Integration

```typescript
import { Readable } from 'node:stream';

// Part shapes are not defined in the source; these are assumed minimal forms.
type PresignedPart = { partNumber: number; url: string };
type CompletedPart = { partNumber: number; etag: string };

// S3-compatible storage client, declared as an interface so the signatures compile without bodies
interface StorageService {
  upload(key: string, stream: Readable, contentType: string): Promise<string>;
  getPresignedUploadUrl(key: string, contentType: string, expiresIn: number): Promise<string>;
  getPresignedDownloadUrl(key: string, expiresIn: number): Promise<string>;
  initiateMultipartUpload(key: string): Promise<{ uploadId: string; parts: PresignedPart[] }>;
  completeMultipartUpload(key: string, uploadId: string, parts: CompletedPart[]): Promise<void>;
  delete(key: string): Promise<void>;
}
```

Key naming convention: `{type}/{userId}/{fileId}/{filename}`
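
One way to build keys in this shape is sketched below. The helper names `sanitizeFilename` and `buildStorageKey` are hypothetical, and the character allowlist and 100-character cap are assumptions. Uniqueness comes from the file ID segment; the filename segment is only a sanitized, human-readable suffix.

```typescript
// Reduce a client-supplied name to a safe final path segment.
function sanitizeFilename(name: string): string {
  // Keep only the last path segment so "../" sequences cannot escape the prefix.
  const base = name.split(/[\\/]/).pop() ?? '';
  // Assumption: a conservative allowlist; everything else becomes "_".
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, '_').replace(/^\.+/, '');
  return cleaned.length > 0 ? cleaned.slice(0, 100) : 'file';
}

// Produces {type}/{userId}/{fileId}/{filename}.
function buildStorageKey(type: string, userId: string, fileId: string, filename: string): string {
  return [type, userId, fileId, sanitizeFilename(filename)].join('/');
}
```

For instance, `buildStorageKey('images', 'u1', 'f1', 'my photo (1).jpg')` yields `images/u1/f1/my_photo__1_.jpg`.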

### 4. Upload Status Tracking

Database model:

```
files:
  id: UUID
  user_id: UUID
  original_name: string
  storage_key: string
  mime_type: string
  size_bytes: bigint
  status: enum(uploading, uploaded, processing, processed, failed)
  variants: jsonb (null until processed)
  error: text (null unless failed)
  created_at: timestamp
  updated_at: timestamp
```
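
The `status` column implies a small state machine. The sketch below models only the columns involved in a status change. The transition table is an assumption (the skill lists the states but not which moves are legal), and `FileRecord` and `transition` are hypothetical names.

```typescript
type FileStatus = 'uploading' | 'uploaded' | 'processing' | 'processed' | 'failed';

// Assumed transitions: each step moves forward or to "failed"; terminal states have no exits.
const ALLOWED_TRANSITIONS: Record<FileStatus, readonly FileStatus[]> = {
  uploading: ['uploaded', 'failed'],
  uploaded: ['processing', 'failed'],
  processing: ['processed', 'failed'],
  processed: [],
  failed: [],
};

// Only the columns relevant to status changes are modelled here.
interface FileRecord {
  id: string;
  status: FileStatus;
  error: string | null;
  updatedAt: Date;
}

function transition(record: FileRecord, next: FileStatus, error: string | null = null): FileRecord {
  if (!ALLOWED_TRANSITIONS[record.status].includes(next)) {
    throw new Error(`illegal transition: ${record.status} -> ${next}`);
  }
  if (next === 'failed' && error === null) {
    throw new Error('a failed record needs an error message');
  }
  // Return a new record; "error" stays null unless the record failed.
  return {
    id: record.id,
    status: next,
    error: next === 'failed' ? error : null,
    updatedAt: new Date(),
  };
}
```

Keeping the legal moves in one table makes it harder for a retry or a late webhook to move a record backwards, for example from `processed` to `uploading`.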

### 5. API Endpoints

```
POST   /api/files/upload          — Multipart form upload (< 100MB)
POST   /api/files/presign         — Get presigned URL for large file upload
POST   /api/files/multipart/init  — Start multipart upload (> 100MB)
POST   /api/files/multipart/complete — Complete multipart upload
GET    /api/files/:id/status      — Get upload/processing status
GET    /api/files/:id/download    — Get presigned download URL
DELETE /api/files/:id             — Soft delete file
```

## Examples

### Example 1: Express Upload Endpoint

**Prompt**: "Create a file upload endpoint for my Express app. Accept images and PDFs, store in S3."

**Output**: Upload route with multer streaming, magic-byte validation, S3 upload, database record creation, and error handling. Returns file ID for status polling.

### Example 2: Presigned Upload for Large Videos

**Prompt**: "Users upload videos up to 2GB. I don't want them going through my server."

**Output**: Presigned URL generation endpoint, client-side upload code with progress tracking, multipart upload for files > 100MB, and a webhook endpoint to confirm upload completion and trigger processing.
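
For the multipart path, the client needs a plan of byte ranges before it requests presigned part URLs. The sketch below is one way to compute that plan. It relies on the S3 limits of a 5 MiB minimum part size (except for the last part) and at most 10,000 parts per upload; the 16 MiB default and the name `planParts` are assumptions.

```typescript
const MIB = 1024 * 1024;
const MIN_PART_SIZE = 5 * MIB; // S3 minimum for every part except the last
const MAX_PARTS = 10000;       // S3 maximum number of parts per upload

interface PartRange {
  partNumber: number; // 1-based, as S3 expects
  start: number;      // inclusive byte offset
  end: number;        // exclusive byte offset
}

function planParts(totalBytes: number, preferredPartSize: number = 16 * MIB): PartRange[] {
  if (!Number.isInteger(totalBytes) || totalBytes <= 0) {
    throw new RangeError(`invalid total size: ${totalBytes}`);
  }
  // Grow the part size when the preferred size would need more than MAX_PARTS parts.
  const partSize = Math.max(MIN_PART_SIZE, preferredPartSize, Math.ceil(totalBytes / MAX_PARTS));
  const parts: PartRange[] = [];
  for (let start = 0, n = 1; start < totalBytes; start += partSize, n++) {
    parts.push({ partNumber: n, start, end: Math.min(start + partSize, totalBytes) });
  }
  return parts;
}
```

With the default part size, a 2 GiB video splits into 128 parts of 16 MiB each, and each range can be sliced from the browser `File` object and sent to its presigned URL.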

## Guidelines

- **Stream, don't buffer** — never load entire files into memory
- **Validate magic bytes** — file extensions lie, magic bytes don't
- **Set upload limits at every layer** — nginx, reverse proxy, and application
- **Generate unique storage keys** — include user ID and file ID, never use original filename as key
- **Return immediately** — upload ack should be instant, processing happens async
- **Clean up on failure** — if DB write fails, delete the S3 object; if S3 fails, don't create DB record
- **Rate limit uploads** — per user, per time window (e.g., 20 uploads per hour)
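
The per-user rate limit can be sketched as a sliding-window counter. This version is in-memory and single-process, which is an assumption made for illustration; a deployment with several server instances would need shared storage. The class name `UploadRateLimiter` is hypothetical, and the defaults follow the "20 uploads per hour" example above.

```typescript
class UploadRateLimiter {
  // Timestamps (ms) of accepted uploads, per user.
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 20, windowMs: number = 60 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the upload if the user is under the limit.
  tryAcquire(userId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have aged out of the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}
```

The upload route would call `tryAcquire(userId)` before accepting the body and respond with HTTP 429 when it returns false.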
