admin-documents

Document management, LLM pipeline, anonymization, Q&A generation, versioning

181 stars

bymajiayu000

View on GitHub Installation ↓

Best use case

admin-documents is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Document management, LLM pipeline, anonymization, Q&A generation, versioning

Teams using admin-documents should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/admin-documents/SKILL.md --create-dirs "https://raw.githubusercontent.com/majiayu000/claude-skill-registry/main/skills/data/admin-documents/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/admin-documents/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How admin-documents Compares

Feature / Agent	admin-documents	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Document management, LLM pipeline, anonymization, Q&A generation, versioning

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Admin Documents Module — CEI-001

## Document Pipeline Architecture

```python
# app/services/document_pipeline.py
from typing import List, Dict, Any, AsyncGenerator
from openai import AsyncOpenAI
import tiktoken

class DocumentPipelineService:
    def __init__(self, openai_key: str):
        self.client = AsyncOpenAI(api_key=openai_key)
        self.tokenizer = tiktoken.encoding_for_model("gpt-4")
    
    async def process_document(
        self,
        content: str,
        config: PipelineConfig
    ) -> Dict[str, Any]:
        """Run full pipeline on document"""
        
        result = {
            "original": content,
            "augmented": content,
            "qa_pairs": [],
            "chunks": [],
            "stats": {}
        }
        
        # 1. Anonymization
        if "anonymize" in config.transformations:
            result["augmented"] = await self._anonymize(result["augmented"])
        
        # 2. Whitelabel (remove specific references)
        if "whitelabel" in config.transformations:
            result["augmented"] = await self._whitelabel(result["augmented"])
        
        # 3. Normalize (tone, terminology)
        if "normalize" in config.transformations:
            result["augmented"] = await self._normalize(result["augmented"])
        
        # 4. Enrich summary
        if "enrich_summary" in config.transformations:
            summary = await self._generate_summary(result["augmented"])
            result["augmented"] = f"SUMMARY:\n{summary}\n\n{result['augmented']}"
        
        # 5. Generate Q&A
        if "enrich_qa" in config.transformations:
            result["qa_pairs"] = await self._generate_qa(result["augmented"])
        
        # 6. Chunk for RAG
        if "segment" in config.transformations:
            result["chunks"] = self._chunk_text(
                result["augmented"],
                chunk_size=config.chunk_size,
                overlap=config.chunk_overlap
            )
        
        return result
    
    async def _anonymize(self, content: str) -> str:
        """Remove PII and client-specific data"""
        prompt = """Anonymize this document:
- Replace company names with "Company X", "Company Y"
- Replace person names with "Manager", "User", etc.
- Keep structure and meaning
- Return only anonymized text

Content:
{content}"""
        
        response = await self.client.messages.create(
            model="gpt-4-turbo-preview",
            max_tokens=2000,
            messages=[{"role": "user", "content": prompt.format(content=content)}]
        )
        return response.content[0].text
    
    async def _whitelabel(self, content: str) -> str:
        """Neutralize client/tool-specific references"""
        prompt = """Neutralize this document for white-label use:
- "Our client X" → "manufacturing companies"
- "Genius ERP" → "ERP systems"
- "Our methodology" → "industry best practices"
- Keep exact same information, just generalized

Content:
{content}"""
        
        response = await self.client.messages.create(
            model="gpt-4-turbo-preview",
            max_tokens=2000,
            messages=[{"role": "user", "content": prompt.format(content=content)}]
        )
        return response.content[0].text
    
    async def _normalize(self, content: str) -> str:
        """Normalize tone, terminology, structure"""
        prompt = """Normalize this document for consistent style:
- Standardize terminology (use "ERP" not "ERP systems", "system")
- Consistent tone (professional, accessible)
- Fix grammar and clarity
- Maintain all information

Content:
{content}"""
        
        response = await self.client.messages.create(
            model="gpt-4-turbo-preview",
            max_tokens=2000,
            messages=[{"role": "user", "content": prompt.format(content=content)}]
        )
        return response.content[0].text
    
    async def _generate_summary(self, content: str) -> str:
        """Generate executive summary"""
        prompt = f"""Generate a 2-3 sentence executive summary:

{content}"""
        
        response = await self.client.messages.create(
            model="gpt-4-turbo-preview",
            max_tokens=300,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.content[0].text
    
    async def _generate_qa(self, content: str, pairs_per_section: int = 3) -> List[Dict]:
        """Generate Q&A pairs for better RAG"""
        prompt = f"""Generate {pairs_per_section} Q&A pairs from this content:

{content}

Format as JSON:
[
  {{"question": "?", "answer": "?"}},
  ...
]"""
        
        response = await self.client.messages.create(
            model="gpt-4-turbo-preview",
            max_tokens=1000,
            messages=[{"role": "user", "content": prompt}]
        )
        
        try:
            import json
            return json.loads(response.content[0].text)
        except:
            return []
    
    def _chunk_text(self, content: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
        """Chunk text smartly"""
        chunks = []
        paragraphs = content.split('\n\n')
        
        current_chunk = ""
        for para in paragraphs:
            if len(current_chunk) + len(para) < chunk_size:
                current_chunk += para + "\n\n"
            else:
                if current_chunk:
                    chunks.append(current_chunk.strip())
                current_chunk = para
        
        if current_chunk:
            chunks.append(current_chunk.strip())
        
        return chunks
```

## Admin API Routes

```python
# app/api/routes/admin_documents.py
from fastapi import APIRouter, UploadFile, File, Depends, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession

from app.api.deps import get_db, get_admin_user
from app.schemas.admin_document import PipelineConfig, DocumentResponse
from app.services.document_pipeline import DocumentPipelineService

router = APIRouter(prefix="/api/admin/documents", tags=["admin"])

@router.post("/upload")
async def upload_document(
    file: UploadFile = File(...),
    db: AsyncSession = Depends(get_db),
    admin = Depends(get_admin_user)
) -> DocumentResponse:
    """Upload document (admin only)"""
    
    # Save file
    content = await file.read()
    
    # Create document record
    document = Document(
        title=file.filename,
        source_filename=file.filename,
        source_mimetype=file.content_type,
        status="draft",
        created_by=admin.id
    )
    db.add(document)
    await db.commit()
    
    return DocumentResponse.from_orm(document)

@router.post("/{doc_id}/pipeline")
async def start_pipeline(
    doc_id: str,
    config: PipelineConfig,
    db: AsyncSession = Depends(get_db),
    admin = Depends(get_admin_user)
):
    """Start LLM pipeline (admin only)"""
    
    # Get document
    document = await db.get(Document, doc_id)
    if not document:
        raise HTTPException(status_code=404, detail="Document not found")
    
    # Mark processing
    document.status = "processing"
    await db.commit()
    
    # Run pipeline
    service = DocumentPipelineService(settings.OPENAI_API_KEY)
    result = await service.process_document(content, config)
    
    # Save version
    version = DocumentVersion(
        document_id=doc_id,
        version_number=document.current_version + 1,
        original_content=content,
        augmented_content=result["augmented"],
        generated_qa=result["qa_pairs"],
        pipeline_config=config.dict()
    )
    db.add(version)
    
    # Update document
    document.current_version += 1
    document.status = "review"
    
    await db.commit()
    
    return {"status": "completed", "version": version.version_number}

@router.post("/{doc_id}/publish")
async def publish_document(
    doc_id: str,
    db: AsyncSession = Depends(get_db),
    admin = Depends(get_admin_user)
):
    """Publish to Weaviate (admin only)"""
    
    document = await db.get(Document, doc_id)
    if not document:
        raise HTTPException(status_code=404, detail="Document not found")
    
    # Get current version
    version = await db.get(DocumentVersion, {"document_id": doc_id, "version_number": document.current_version})
    
    # Index chunks
    rag_service = RAGService(settings.WEAVIATE_HOST)
    chunk_uuids = await rag_service.index_document(
        doc_id,
        version.augmented_content
    )
    
    # Update document
    document.status = "published"
    document.published_at = datetime.utcnow()
    
    await db.commit()
    
    return {"status": "published", "chunks_indexed": len(chunk_uuids)}
```

---

---
name: typescript-patterns
description: TypeScript type safety, enums, generics, custom hooks, form validation
---

# TypeScript Patterns — CEI-001

## Type Safety Strictness

```typescript
// tsconfig.json
{
  "compilerOptions": {
    "strict": true,
    "noImplicitAny": true,
    "noImplicitThis": true,
    "strictNullChecks": true,
    "strictFunctionTypes": true,
    "strictBindCallApply": true,
    "strictPropertyInitialization": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "noImplicitReturns": true,
    "noFallthroughCasesInSwitch": true
  }
}
```

## Enums for Constants

```typescript
// types/evaluation.ts
export enum ModuleType {
  VISION = 'vision',
  ORGANIZATION = 'organization',
  DATA = 'data',
  INFRASTRUCTURE = 'infrastructure',
  RESOURCES = 'resources',
  PITFALLS = 'pitfalls',
  IMPLEMENTATION = 'implementation',
  POST = 'post'
}

export enum QuestionType {
  YESNO = 'yesno',
  SCALE = 'scale',
  MULTIPLE = 'multiple'
}

export enum EvaluationStatus {
  IN_PROGRESS = 'in_progress',
  COMPLETED = 'completed',
  DRAFT = 'draft'
}
```

## Discriminated Unions

```typescript
// types/api.ts
type ApiResponse<T> =
  | { type: 'success'; data: T }
  | { type: 'error'; error: { code: string; message: string } }
  | { type: 'loading' };

// Type-safe usage
function handleResponse<T>(response: ApiResponse<T>) {
  if (response.type === 'success') {
    console.log(response.data); // T is available
  } else if (response.type === 'error') {
    console.log(response.error.code); // error is available
  }
}
```

## Generics

```typescript
// API client with generics
interface ApiClient {
  get<T>(url: string): Promise<T>;
  post<T, D>(url: string, data: D): Promise<T>;
  put<T, D>(url: string, id: string, data: D): Promise<T>;
}

// Usage
const users = await api.get<User[]>('/api/users');
const created = await api.post<User, CreateUserData>('/api/users', userData);
```

## Custom Hooks with Types

```typescript
// hooks/usePagination.ts
interface UsePaginationOptions {
  pageSize: number;
  initialPage?: number;
}

interface UsePaginationState {
  page: number;
  total: number;
  pageSize: number;
}

export function usePagination({
  pageSize,
  initialPage = 1
}: UsePaginationOptions) {
  const [state, setState] = useState<UsePaginationState>({
    page: initialPage,
    total: 0,
    pageSize
  });

  const nextPage = () => setState(prev => ({
    ...prev,
    page: Math.min(prev.page + 1, Math.ceil(prev.total / pageSize))
  }));

  const previousPage = () => setState(prev => ({
    ...prev,
    page: Math.max(prev.page - 1, 1)
  }));

  return { ...state, nextPage, previousPage };
}
```

## Form Validation with Zod

```typescript
// validation/evaluation.ts
import { z } from 'zod';

export const answerSchema = z.object({
  questionId: z.string().uuid(),
  answer: z.enum(['oui', 'non', 'partiellement']),
  comment: z.string().optional()
});

export type Answer = z.infer<typeof answerSchema>;

export const evaluationSchema = z.object({
  companyId: z.string().min(1),
  answers: z.array(answerSchema)
});

export type EvaluationData = z.infer<typeof evaluationSchema>;

// Usage with React Hook Form
import { useForm } from 'react-hook-form';
import { zodResolver } from '@hookform/resolvers/zod';

export function AnswerForm() {
  const { control, handleSubmit } = useForm<Answer>({
    resolver: zodResolver(answerSchema)
  });
  
  return (
    <form onSubmit={handleSubmit(onSubmit)}>
      {/* Form fields */}
    </form>
  );
}
```

## Utility Types

```typescript
// Type helpers
type Readonly<T> = {
  readonly [P in keyof T]: T[P];
};

type Partial<T> = {
  [P in keyof T]?: T[P];
};

type Record<K extends string | number | symbol, T> = {
  [P in K]: T;
};

// Usage
type UserResponse = Readonly<User>;
type UserUpdate = Partial<User>;
type UserMap = Record<string, User>;
```

## Async Types

```typescript
type ApiResult<T> = Promise<T | null>;

async function fetchUser(id: string): ApiResult<User> {
  try {
    const response = await api.get<User>(`/api/users/${id}`);
    return response;
  } catch (error) {
    console.error(error);
    return null;
  }
}
```

## Conventions

- Interfaces for public APIs, types for internal
- Enums for constants instead of `as const`
- Generics for reusable logic
- Discriminated unions for variants
- Zod for runtime validation
- Strict mode always enabled
- No `any` type allowed
- Export types from `types/` folder

Related Skills

administration

181

from majiayu000/claude-skill-registry

How to monitor usage, track costs, configure analytics, and measure ROI for Claude Code. Use when user asks about monitoring, telemetry, metrics, costs, analytics, or OpenTelemetry.

administering-linux

181

from majiayu000/claude-skill-registry

Manage Linux systems covering systemd services, process management, filesystems, networking, performance tuning, and troubleshooting. Use when deploying applications, optimizing server performance, diagnosing production issues, or managing users and security on Linux servers.

admin

181

from majiayu000/claude-skill-registry

Admin panel - RBAC, config, admin tools. Use when building admin UI.

admin-wsl

181

from majiayu000/claude-skill-registry

WSL2 Ubuntu administration from Linux side. Profile-aware - reads preferences from Windows-side profile at /mnt/c/Users/{WIN_USER}/.admin/profiles/{hostname}.json Use when: Inside WSL for apt packages, Docker, Python/uv, shell configs, systemd. Coordinates with admin-windows via shared profile ON THE WINDOWS SIDE.

admin-windows

181

from majiayu000/claude-skill-registry

Windows system administration with PowerShell 7.x. Profile-aware - reads your preferences for package managers (scoop vs winget), paths, and installed tools. Use when: Windows-specific admin tasks, PowerShell automation, PATH configuration, package installation, bash-to-PowerShell translation.

admin-unix

181

from majiayu000/claude-skill-registry

Native macOS and Linux administration (non-WSL). Profile-aware - reads preferences from ~/.admin/profiles/{hostname}.json. Use when: macOS/Linux system admin, Homebrew (macOS), apt (Linux), services. NOT for WSL - use admin-wsl instead.

admin-panel-builder

181

from majiayu000/claude-skill-registry

Expert assistant for creating and maintaining admin panel pages in the KR92 Bible Voice project. Use when creating admin pages, building admin components, integrating with admin navigation, or adding admin features.

admin-mcp

181

from majiayu000/claude-skill-registry

MCP server management for Claude Desktop. Profile-aware - reads MCP server inventory from profile.mcp.servers{} and config path from profile.paths.claudeConfig. Use when: installing MCP servers, configuring Claude Desktop, troubleshooting MCP issues.

admin-interface-rules

181

from majiayu000/claude-skill-registry

Rules for the Admin interface functionalities

admin-infra-vultr

181

from majiayu000/claude-skill-registry

Deploys infrastructure on Vultr with Cloud Compute instances, High-Frequency servers, and VPCs. Excellent value with Kubernetes autoscaling support and global data centers. Use when: setting up Vultr infrastructure, deploying cloud compute or high-frequency instances, configuring firewalls, needing good price/performance with global reach. Keywords: vultr, vultr-cli, VPS, cloud compute, high-frequency, firewall, VPC, kubernetes autoscale, infrastructure

admin-infra-oci

181

from majiayu000/claude-skill-registry

Deploys infrastructure on Oracle Cloud Infrastructure (OCI) with ARM64 instances (Always Free tier eligible). Handles compartments, VCNs, subnets, security lists, and compute instances. Use when: setting up Oracle Cloud infrastructure, deploying ARM64 instances, troubleshooting OUT_OF_HOST_CAPACITY errors, optimizing for Always Free tier. Keywords: oracle cloud, OCI, ARM64, VM.Standard.A1.Flex, Always Free tier, OUT_OF_HOST_CAPACITY, oci compartment, oci vcn

admin-infra-linode

181

from majiayu000/claude-skill-registry

Deploys infrastructure on Linode (Akamai Cloud) with Linodes, Firewalls, and VLANs. Strong Kubernetes support with Cluster Autoscaler and Akamai edge network integration. Use when: setting up Linode/Akamai infrastructure, deploying Linodes, configuring firewalls, needing Kubernetes autoscaling, wanting Akamai CDN integration. Keywords: linode, akamai, linode-cli, VPS, dedicated CPU, firewall, VLAN, kubernetes autoscale, infrastructure