estimation-patterns

Practical estimation techniques for software tasks — methods comparison, decomposition, complexity multipliers, buffer calculation, bias awareness, and communication strategies. Use when estimating features, sprint planning, or presenting timelines to stakeholders.

7 stars

Best use case

estimation-patterns is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Practical estimation techniques for software tasks — methods comparison, decomposition, complexity multipliers, buffer calculation, bias awareness, and communication strategies. Use when estimating features, sprint planning, or presenting timelines to stakeholders.

Teams using estimation-patterns should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/estimation-patterns/SKILL.md --create-dirs "https://raw.githubusercontent.com/wpank/ai/main/skills/meta/estimation-patterns/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/estimation-patterns/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How estimation-patterns Compares

Feature / Agentestimation-patternsStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Practical estimation techniques for software tasks — methods comparison, decomposition, complexity multipliers, buffer calculation, bias awareness, and communication strategies. Use when estimating features, sprint planning, or presenting timelines to stakeholders.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Estimation Patterns (Meta-Skill)

Systematic approaches for producing accurate, defensible software estimates.


## Installation

### OpenClaw / Moltbot / Clawbot

```bash
npx clawhub@latest install estimation-patterns
```


---

## When to Use

- Estimating a feature, bug fix, or project timeline
- Breaking down work for sprint planning or roadmap forecasting
- Presenting estimates to stakeholders or product managers
- Reviewing historical accuracy to calibrate future estimates
- Noticing a pattern of missed deadlines or blown budgets

---

## Estimation Methods

Choose the method that matches your context and audience.

| Method | Best For | Granularity | Pros | Cons |
|----------------------|-------------------------------|-----------------|-----------------------------------------------|-----------------------------------------------|
| T-Shirt Sizing | Roadmap planning, backlog grooming | XS, S, M, L, XL | Fast, low-friction, good for relative ranking | Not actionable for scheduling |
| Story Points | Sprint planning, team velocity | Fibonacci (1-21) | Abstracts away individual speed, tracks velocity | Meaningless outside the team, gaming risk |
| Time-Based | Client quotes, contractor work | Hours / days | Universally understood, maps to budgets | Anchoring bias, implies false precision |
| Three-Point | High-uncertainty tasks | Min / likely / max | Captures uncertainty range, enables PERT | Requires discipline to set honest bounds |
| Reference Comparison | Recurring task types | Relative to past | Grounded in real data, hard to argue with | Requires historical records, breaks on novelty |

**Three-point formula (PERT):**

```
Expected = (Optimistic + 4 x Likely + Pessimistic) / 6
Standard Deviation = (Pessimistic - Optimistic) / 6
```

Use the standard deviation to express confidence ranges (e.g., "3-5 days at 68% confidence, 2-6 days at 95%").

---

## Task Decomposition

Break work down until every sub-task is **< 4 hours** of effort. Anything larger hides unknowns.

| Level | Example | Target Size |
|----------------|-------------------------------------------|---------------|
| Epic | User authentication system | 2-6 weeks |
| Feature | OAuth2 login with Google | 3-10 days |
| Task | Implement callback handler | 1-3 days |
| Sub-task | Parse and validate OAuth token | 1-4 hours |
| Atomic step | Write token expiry check function | 30-90 minutes |

**Decomposition checklist:**

1. Can I describe what "done" looks like in one sentence?
2. Is there exactly one unknown, or zero?
3. Could a teammate pick this up without a walkthrough?
4. Is it under 4 hours? If no — split again.

**If you cannot decompose a task**, it signals a spike is needed. Timebox the spike (2-4 hours), then re-estimate.

---

## Complexity Multipliers

Apply these multipliers to your base estimate when complexity factors are present. Multipliers stack multiplicatively.

| Factor | Multiplier | Rationale |
|--------------------------|------------|----------------------------------------------------|
| New technology / stack | 1.5x | Learning curve, unexpected gotchas, doc-hunting |
| Unclear requirements | 2.0x | Discovery work, rework cycles, stakeholder alignment |
| Legacy code | 1.5x | Undocumented behavior, fragile tests, hidden coupling |
| Cross-team dependency | 1.5x | Coordination overhead, blocking, API negotiation |
| First-time task | 2.0x | No reference point, unknown unknowns dominate |
| Regulatory / compliance | 1.5x | Audit trails, review gates, documentation overhead |

**Example:** A 2-day base estimate on legacy code (1.5x) with unclear requirements (2.0x) becomes `2 x 1.5 x 2.0 = 6 days`.

**Rule:** Never apply more than 3 multipliers — if that many factors converge, the task needs a spike or a scope reduction, not a bigger number.

---

## Buffer Calculation

Raw estimates are point predictions. Reality is a distribution.

| Buffer Type | Rule of Thumb | When to Apply |
|------------------------|-------------------------|-------------------------------------------------|
| Known unknowns | +20% of total estimate | Integration points, third-party APIs, minor gaps |
| Unknown unknowns | +50% of total estimate | New domain, first release, greenfield system |
| Team velocity factor | / focus ratio (e.g., 0.7) | Account for meetings, reviews, context switching |
| Sequential dependency | +10% per handoff | Each team/person boundary adds coordination drag |

**Effective estimate formula:**

```
Effective = (Base Estimate x Multipliers) / Focus Ratio + Buffer
```

**Focus ratio guidelines:**

| Scenario | Typical Focus Ratio |
|-----------------------------------|---------------------|
| Dedicated to one project | 0.75-0.85 |
| Split across 2 projects | 0.50-0.60 |
| On-call rotation active | 0.60-0.70 |
| Heavy meeting load (> 3h/day) | 0.45-0.55 |

---

## Historical Calibration

Track actual vs estimated to improve over time. This is the single most effective way to get better at estimation.

**Tracking table:**

| Task | Estimated | Actual | Ratio (A/E) | Notes |
|---------------------|-----------|--------|-------------|--------------------------|
| Auth flow | 3 days | 5 days | 1.67 | OAuth docs were outdated |
| Dashboard charts | 5 days | 4 days | 0.80 | Reused existing component |
| DB migration | 2 days | 6 days | 3.00 | Discovered data quality issues |

**Accuracy ratio:** Calculate your rolling average of `Actual / Estimated` over the last 10-20 tasks.

- Ratio **< 0.8** — you're overestimating (sandbagging or excessive buffers)
- Ratio **0.8-1.2** — well calibrated
- Ratio **> 1.2** — you're underestimating (apply the ratio as a correction factor)

**Calibration action:** Multiply future estimates by your rolling accuracy ratio until it converges toward 1.0.

---

## Common Estimation Biases

Recognize these cognitive traps — awareness alone reduces their effect.

| Bias | Description | Mitigation |
|---------------------|----------------------------------------------------------|---------------------------------------------------|
| Planning Fallacy | Assuming best-case scenario despite past evidence | Use historical data, not intuition |
| Anchoring | First number heard dominates all subsequent estimates | Estimate independently before discussing |
| Optimism Bias | "It'll be simpler than last time" | Apply the three-point method, honor the pessimistic |
| Scope Creep | Estimate stays fixed while scope grows | Re-estimate when scope changes, always |
| Hofstadter's Law | "It always takes longer, even when you account for it" | Add buffer, then add more buffer for novel work |
| Dunning-Kruger | Novices underestimate; experts sometimes overestimate | Cross-check with a second estimator |
| Sunk Cost Pressure | Refusing to re-estimate because the original was "approved" | Treat estimates as living artifacts, update often |

---

## Estimation by Task Type

Use these ranges as starting heuristics, then adjust with multipliers and historical data.

| Task Type | Typical Range | Key Variables |
|---------------------|------------------|------------------------------------------------|
| Bug fix (isolated) | 2-8 hours | Reproducibility, code familiarity, test coverage |
| Bug fix (systemic) | 1-3 days | Root cause depth, blast radius, regression risk |
| Small feature | 1-3 days | Spec clarity, UI complexity, number of endpoints |
| Medium feature | 3-10 days | Cross-cutting concerns, data model changes |
| Large feature | 2-4 weeks | Architecture decisions, team coordination |
| Refactor (local) | 1-3 days | Test coverage, coupling, blast radius |
| Refactor (systemic) | 1-4 weeks | Number of callers, migration strategy needed |
| Spike / research | 2-8 hours (timeboxed) | Always timebox — output is knowledge, not code |
| DevOps / infra | 1-5 days | Provider docs quality, IAM complexity, testing |

---

## Communication

How you present an estimate matters as much as the number itself.

**Always present as a range, never a single number:**

- Bad: "It'll take 5 days."
- Good: "3-7 days, most likely 5. The range depends on the payment API response format — I'll know more after the spike."

**Confidence levels:**

| Confidence | What It Means | When to Use |
|------------|--------------------------------------------|------------------------------------|
| High (+-15%) | Well-understood scope, done similar before | Familiar task, clear spec |
| Medium (+-30%) | Some unknowns, reasonable decomposition | Most sprint-level estimates |
| Low (+-50%+) | Significant unknowns, rough order of magnitude | Roadmap forecasts, presale quotes |

**Stakeholder communication rules:**

1. State the range and the confidence level together
2. Name the top 1-3 risks that could push toward the upper bound
3. Offer to de-risk with a timeboxed spike before committing
4. Explicitly state what is **not** included (e.g., "does not include QA, deployment, or docs")
5. Update estimates proactively when new information surfaces — don't wait until the deadline

---

## Anti-Patterns

| Anti-Pattern | Why It's Harmful | Better Approach |
|------------------------|-----------------------------------------------------|----------------------------------------------|
| Padding silently | Erodes trust when discovered; hides real uncertainty | Use explicit buffers with stated rationale |
| Sandbagging | Destroys velocity data; breeds complacency | Track accuracy ratio, aim for calibration |
| Not decomposing | Large estimates hide unknowns and compound errors | Break to < 4-hour sub-tasks, estimate bottom-up |
| Single-point estimates | Implies false certainty, no room for variance | Always give a range with confidence level |
| Estimating under pressure | Anchoring to what the stakeholder wants to hear | Ask for time to decompose; never estimate on the spot |
| Copy-paste estimates | Every task has different context and risk profile | Estimate fresh, use references as starting points only |
| Ignoring rework cycles | First pass is rarely final — reviews, feedback, QA | Factor in at least one review-and-revise loop |

---

## NEVER Do

1. **NEVER give a single-number estimate without a range** — it communicates false precision and sets you up for failure
2. **NEVER estimate a task you haven't decomposed** — large estimates are guesses wearing a suit
3. **NEVER let an old estimate stand after scope changes** — estimates are invalidated the moment requirements shift
4. **NEVER estimate in someone else's units** — your days are not their days; clarify assumptions about focus time and interrupts
5. **NEVER skip recording actuals** — estimation without feedback is astrology, not engineering
6. **NEVER commit to an estimate made under pressure** — say "let me break this down and get back to you in an hour"
7. **NEVER treat an estimate as a promise or a deadline** — estimates are probabilistic forecasts, not contracts

Related Skills

testing-patterns

7
from wpank/ai

Unit, integration, and E2E testing patterns with framework-specific guidance. Use when asked to "write tests", "add test coverage", "testing strategy", "test this function", "create test suite", "fix flaky tests", or "improve test quality".

e2e-testing-patterns

7
from wpank/ai

Build reliable, fast E2E test suites with Playwright and Cypress. Critical user journey coverage, flaky test elimination, CI/CD integration.

websocket-hub-patterns

7
from wpank/ai

Horizontally-scalable WebSocket hub pattern with lazy Redis subscriptions, connection registry, and graceful shutdown. Use when building real-time WebSocket servers that scale across multiple instances. Triggers on WebSocket hub, WebSocket scaling, connection registry, Redis WebSocket, real-time gateway, horizontal scaling.

workflow-patterns

7
from wpank/ai

Systematic task implementation using TDD, phase checkpoints, and structured commits. Ensures quality through red-green-refactor cycles, 80% coverage gates, and verification protocols before proceeding.

10x-patterns

7
from wpank/ai

Patterns and practices that dramatically accelerate development velocity. Covers parallel execution, automation, feedback loops, workflow optimization, and anti-pattern avoidance. Use when starting projects, planning sprints, optimizing workflows, or onboarding developers.

react-composition-patterns

7
from wpank/ai

No description provided.

loading-state-patterns

7
from wpank/ai

Patterns for skeleton loaders, shimmer effects, and loading states that match design system aesthetics. Covers skeleton components, shimmer animations, and progressive loading. Use when building polished loading experiences. Triggers on skeleton, loading state, shimmer, placeholder, loading animation.

design-system-patterns

7
from wpank/ai

Foundational design system architecture — token hierarchies, theming infrastructure, token pipelines, and governance. Use when creating design tokens, implementing theme switching, setting up Style Dictionary, or establishing multi-brand theming. Triggers on design tokens, theme provider, Style Dictionary, token pipeline, multi-brand theming, CSS custom properties architecture.

nodejs-patterns

7
from wpank/ai

WHAT: Production-ready Node.js backend patterns - Express/Fastify setup, layered architecture, middleware, error handling, validation, database integration, authentication, and caching. WHEN: User is building REST APIs, setting up Node.js servers, implementing authentication, integrating databases, adding validation/caching, or structuring backend applications. KEYWORDS: nodejs, node, express, fastify, typescript, api, rest, middleware, authentication, jwt, validation, zod, postgres, mongodb, redis, caching, rate limiting, error handling

microservices-patterns

7
from wpank/ai

No description provided.

architecture-patterns

7
from wpank/ai

No description provided.

auth-patterns

7
from wpank/ai

Authentication and authorization patterns — JWT, OAuth 2.0, sessions, RBAC/ABAC, password security, MFA, and vulnerability prevention. Use when implementing login flows, protecting routes, managing tokens, or auditing auth security.