Interview Architect
Complete hiring interview system — from job scorecard design through structured question banks, live evaluation rubrics, panel coordination, and offer decisions. Eliminates gut-feel hiring with evidence-based frameworks that predict on-the-job performance.
About this skill
Interview Architect is an AI agent skill that guides users through a complete, evidence-based hiring system. It starts by helping define what "great" looks like for a role with a structured job scorecard covering the role's mission, measurable outcomes, and essential competencies. This foundational step establishes objective evaluation criteria before any candidate review.

From there, the skill can generate targeted interview questions, build technical take-home assignments with rubrics, and facilitate structured candidate debriefs with scoring. The goal is to replace subjective "gut-feel" hiring with frameworks that predict on-the-job performance and ensure fairness and consistency across the recruitment lifecycle.

Key use cases include designing full interview plans for new roles, creating scorecards, generating questions for particular skills, building assignments, evaluating candidates post-interview, and auditing existing processes for effectiveness and bias. The result is a streamlined recruitment process, a better candidate experience, and ultimately better hires.
Best use case
This skill is primarily for hiring managers, recruiters, and HR professionals who want to implement a structured, evidence-based approach to talent acquisition. It helps define clear job expectations, standardize evaluations, and make data-driven hiring decisions that predict candidate success and reduce bias.
A comprehensive, structured hiring framework including scorecards, tailored questions, evaluation rubrics, and candidate debriefs, designed to improve hiring accuracy and fairness.
Practical example
Example input
"Design interviews for a Senior Backend Engineer"
Example output
A YAML-formatted job scorecard, a bank of structured interview questions, and an evaluation rubric tailored for a Senior Backend Engineer role.
When to use this skill
- When designing a complete hiring process for a new role from scratch.
- When you need structured interview questions for specific skills or competencies.
- When evaluating candidates consistently and objectively post-interview.
- When auditing an existing hiring process for potential biases or inefficiencies.
When not to use this skill
- For informal hiring without any structured process requirements.
- When a quick, ad-hoc hiring decision is needed without detailed planning.
- If the primary need is for direct candidate sourcing or applicant tracking, rather than process design.
- When legal or highly specialized compliance expertise is required without human oversight.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/afrexai-interview-architect/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
How Interview Architect Compares
| Feature / Agent | Interview Architect | Standard Approach |
|---|---|---|
| Platform Support | Claude Code, Cursor, Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Easy | N/A |
Frequently Asked Questions
What does this skill do?
Complete hiring interview system — from job scorecard design through structured question banks, live evaluation rubrics, panel coordination, and offer decisions. Eliminates gut-feel hiring with evidence-based frameworks that predict on-the-job performance.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
SKILL.md Source
# Interview Architect
Complete hiring interview system — from job scorecard design through structured question banks, live evaluation rubrics, panel coordination, and offer decisions. Eliminates gut-feel hiring with evidence-based frameworks that predict on-the-job performance.
## Quick Start
Tell me what you need:
- "Design interviews for [role]" → Full interview plan (scorecard + questions + rubrics)
- "Create a scorecard for [role]" → A-Player definition with measurable outcomes
- "Generate questions for [skill/competency]" → Targeted question bank
- "Build a take-home assignment for [role]" → Technical assessment with rubric
- "Evaluate this candidate" → Structured debrief with scoring
- "Audit our interview process" → Bias check + effectiveness review
---
## Phase 1: Job Scorecard (Define Before You Evaluate)
**Rule: Never look at a resume before defining what "great" looks like.**
### Scorecard Template
```yaml
scorecard:
  role: "[Title]"
  level: "[Junior/Mid/Senior/Staff/Principal/Director/VP]"
  team: "[Team name]"
  hiring_manager: "[Name]"
  created: "YYYY-MM-DD"

  mission:
    statement: "[One sentence: why does this role exist?]"
    success_metric: "[How we'll know this hire was successful in 12 months]"

  outcomes:
    # 3-5 specific, measurable results expected in first 12 months
    - outcome: "[e.g., Reduce deployment time from 45min to <10min]"
      measure: "[Metric: deployment duration, measured via CI/CD logs]"
      timeline: "Q1-Q2"
      priority: "critical"
    - outcome: "[e.g., Ship v2 API with 99.9% uptime]"
      measure: "[Uptime %, error rate, customer adoption]"
      timeline: "Q2-Q3"
      priority: "critical"
    - outcome: "[e.g., Mentor 2 junior engineers to mid-level competency]"
      measure: "[Promotion readiness assessment, PR quality metrics]"
      timeline: "Q3-Q4"
      priority: "important"

  competencies:
    technical:
      must_have:
        - name: "[e.g., System design]"
          level: "[Novice/Competent/Proficient/Expert]"
          evidence: "[What demonstrates this: e.g., designed systems handling 10K+ RPS]"
        - name: "[e.g., TypeScript/React]"
          level: "Proficient"
          evidence: "[Shipped production TS/React apps, not just tutorials]"
      nice_to_have:
        - name: "[e.g., Kubernetes]"
          level: "Competent"
    behavioral:
      must_have:
        - name: "Ownership"
          definition: "Takes responsibility for outcomes, not just tasks. Doesn't wait to be told."
          anti_pattern: "Says 'that's not my job' or 'I was told to do X'"
        - name: "Communication"
          definition: "Explains complex ideas simply. Writes clear docs. Raises issues early."
          anti_pattern: "Surprises stakeholders. Can't explain their own work."
        - name: "Growth mindset"
          definition: "Seeks feedback. Admits mistakes. Improves from failure."
          anti_pattern: "Defensive about criticism. Repeats same mistakes."
      nice_to_have:
        - name: "[e.g., Cross-functional leadership]"
    cultural:
      values_alignment:
        - "[Company value 1: what this looks like in practice]"
        - "[Company value 2: what this looks like in practice]"
      anti_signals:
        - "[Red flag behavior 1]"
        - "[Red flag behavior 2]"

  compensation:
    band: "[min - max]"
    equity: "[range if applicable]"
    flexibility: "[What's negotiable]"

  deal_breakers:
    # Hard no's — instant disqualification
    - "[e.g., Cannot start within 4 weeks]"
    - "[e.g., No experience with production systems at scale]"
    - "[e.g., Requires >30% above band]"
```
### Scorecard Quality Check
Before proceeding, verify:
- [ ] Mission statement is one sentence (not a paragraph)
- [ ] Each outcome has a specific number or metric (not "improve" or "help with")
- [ ] Competencies distinguish must-have from nice-to-have
- [ ] Anti-patterns defined for each behavioral competency
- [ ] Deal breakers are objective (not subjective feelings)
- [ ] Band is realistic for the market (check levels.fyi, Glassdoor)
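Several of these checks can be automated before a scorecard is accepted. A minimal sketch in Python, assuming the scorecard has been loaded into a dict with the field names from the template above (the `lint_scorecard` helper and the example data are illustrative, not part of the skill):

```python
import re

def lint_scorecard(sc: dict) -> list[str]:
    """Return a list of quality-check violations (empty list = pass)."""
    problems = []
    # Mission statement should be one sentence, not a paragraph
    statement = sc.get("mission", {}).get("statement", "")
    if statement.count(".") > 1:
        problems.append("mission statement is more than one sentence")
    # Each outcome needs a concrete number or metric, not "improve" or "help with"
    for o in sc.get("outcomes", []):
        if not re.search(r"\d", o.get("outcome", "") + o.get("measure", "")):
            problems.append(f"outcome lacks a number: {o.get('outcome')}")
    # Must-have technical competencies must be defined
    tech = sc.get("competencies", {}).get("technical", {})
    if not tech.get("must_have"):
        problems.append("no must-have technical competencies defined")
    return problems

example = {
    "mission": {"statement": "Own API reliability for the payments team."},
    "outcomes": [{"outcome": "Reduce deployment time to <10min",
                  "measure": "CI/CD logs"}],
    "competencies": {"technical": {"must_have": [{"name": "System design"}]}},
}
print(lint_scorecard(example))  # → [] (passes all checks)
```

The sentence and number heuristics are deliberately crude; they catch the common failure modes ("improve velocity", paragraph-length missions) without trying to understand the text.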
---
## Phase 2: Interview Structure Design
### Interview Loop Template
```yaml
interview_loop:
  role: "[from scorecard]"
  total_duration: "[X hours across Y sessions]"

  stages:
    - stage: "Resume Screen"
      duration: "5-10 min"
      who: "Recruiter or hiring manager"
      evaluates: ["deal_breakers", "basic_qualification"]
      pass_rate_target: "30-40%"

    - stage: "Phone Screen"
      duration: "30 min"
      who: "Hiring manager"
      evaluates: ["communication", "motivation", "outcome_1_capability"]
      format: "Structured conversation"
      pass_rate_target: "50%"

    - stage: "Technical Assessment"
      duration: "60-90 min"
      who: "Senior engineer"
      evaluates: ["technical_competencies"]
      format: "Live coding OR take-home (see Phase 4)"
      pass_rate_target: "40-50%"

    - stage: "System Design"
      duration: "45-60 min"
      who: "Staff+ engineer"
      evaluates: ["system_design", "trade_off_thinking", "communication"]
      format: "Whiteboard/collaborative design"
      pass_rate_target: "50%"
      applies_to: "Senior+ only"

    - stage: "Behavioral Deep-Dive"
      duration: "45-60 min"
      who: "Hiring manager + cross-functional partner"
      evaluates: ["behavioral_competencies", "cultural_values"]
      format: "STAR-based structured interview"
      pass_rate_target: "60%"

    - stage: "Team Fit / Reverse Interview"
      duration: "30 min"
      who: "2-3 potential teammates"
      evaluates: ["collaboration_style", "candidate_questions"]
      format: "Informal but structured"
      pass_rate_target: "80%"

    - stage: "Hiring Manager Final"
      duration: "30 min"
      who: "Hiring manager"
      evaluates: ["remaining_concerns", "motivation", "offer_readiness"]
      format: "Conversation"

  timeline:
    screen_to_onsite: "< 5 business days"
    onsite_to_decision: "< 2 business days"
    decision_to_offer: "< 1 business day"
    total_process: "< 3 weeks"
```
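The pass-rate targets also tell you how many applicants the funnel consumes per offer, which is useful when sizing sourcing effort. A quick sketch, using the midpoints of the target ranges above (the specific rates are illustrative; substitute your own stage data):

```python
def applicants_per_offer(pass_rates: list[float]) -> float:
    """How many candidates must enter the funnel to produce one offer."""
    survivors = 1.0
    for rate in pass_rates:
        survivors *= rate  # fraction surviving after each stage
    return 1.0 / survivors

# Midpoints: resume 35%, phone 50%, technical 45%, system design 50%,
# behavioral 60%, team fit 80%
rates = [0.35, 0.50, 0.45, 0.50, 0.60, 0.80]
print(round(applicants_per_offer(rates)))  # → 53
```

So with these targets, roughly fifty resumes in yields one offer out; if your top-of-funnel volume is far below that, either the sourcing or the pass-rate targets need revisiting.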
### Level-Appropriate Loop Adjustments
| Level | Skip | Add | Emphasis |
|-------|------|-----|----------|
| Junior (0-2 yr) | System design | Pair programming, learning ability | Potential > experience |
| Mid (2-5 yr) | — | — | Balanced: execution + growth |
| Senior (5-8 yr) | — | Architecture discussion | Impact, ownership, mentoring |
| Staff (8+ yr) | Basic coding | Design doc review, strategy | Influence, technical vision |
| Principal | Basic coding | Vision presentation, exec interview | Org-wide impact |
| Manager | Live coding | Skip-level, cross-functional | People outcomes, strategy |
| Director+ | All IC technical | Board/exec presentation | Business impact, org building |
---
## Phase 3: Question Banks
### Behavioral Questions (STAR Format)
**For each question below:**
- Ask the main question
- Then probe with: "Walk me through specifically what YOU did" (not the team)
- Then probe with: "What was the measurable result?"
- Watch for: vague answers, "we" without "I", unable to recall specifics
#### Ownership & Initiative
```
Q: "Tell me about a time you identified a problem no one asked you to fix, and you fixed it anyway."
Probe: "How did you discover it? What did you do first? What was the outcome?"
Green signal: Specific problem, proactive action, measurable impact
Red flag: Can't recall an example, or problem was trivial
Q: "Describe a project that failed or didn't meet expectations. What was your role?"
Probe: "What would you do differently? What did you learn?"
Green signal: Owns their part, specific lessons, changed behavior afterward
Red flag: Blames others, no learning, defensive
Q: "Tell me about the last time you disagreed with your manager's technical decision."
Probe: "How did you raise it? What happened? Would you do it differently?"
Green signal: Respectful pushback with data, compromise or acceptance
Red flag: Never disagrees, or went around manager, or still bitter
```
#### Communication & Collaboration
```
Q: "Describe the most complex technical concept you had to explain to a non-technical audience."
Probe: "How did you know they understood? What would you change?"
Green signal: Adapts language, checks understanding, uses analogies
Red flag: Talks down, uses jargon anyway, frustrated by the need
Q: "Tell me about a cross-team project that had conflicting priorities."
Probe: "How did you align the teams? What trade-offs were made?"
Green signal: Proactive communication, documented agreements, escalated appropriately
Red flag: Waited for someone else to resolve, or steamrolled
Q: "Give me an example of written communication that had significant impact."
Probe: "What was the context? Who was the audience? What resulted?"
Green signal: Design doc, RFC, post-mortem that changed decisions
Red flag: Can't think of one, or only Slack messages
```
#### Technical Excellence
```
Q: "What's the best piece of code or system you've built? Walk me through it."
Probe: "What trade-offs did you make? What would you change now?"
Green signal: Deep understanding, clear trade-off reasoning, honest about flaws
Red flag: Can't go deep, no awareness of trade-offs
Q: "Tell me about a production incident you were involved in resolving."
Probe: "How did you diagnose it? What was root cause? What prevented recurrence?"
Green signal: Systematic debugging, root cause fix (not band-aid), prevention measures
Red flag: Only applied quick fix, blamed infrastructure, no follow-up
Q: "Describe a time you had to make a technical decision with incomplete information."
Probe: "What did you know? What didn't you know? How did you decide?"
Green signal: Explicit about unknowns, gathered what they could, made reversible decision
Red flag: Paralyzed, or overconfident without data
```
#### Leadership & Mentoring (Senior+)
```
Q: "Tell me about someone you helped grow significantly in their career."
Probe: "What did you specifically do? How did you know it was working?"
Green signal: Specific actions (pair programming, stretch assignments, feedback), measurable growth
Red flag: "I told them what to do" or can't name anyone
Q: "Describe a technical strategy or vision you set for your team."
Probe: "How did you get buy-in? How did you measure progress?"
Green signal: Clear rationale, stakeholder alignment, adapted based on feedback
Red flag: Top-down mandate, or never set direction
Q: "Tell me about a time you had to say no to a stakeholder or product request."
Probe: "How did you explain it? What was the outcome?"
Green signal: Data-driven reasoning, offered alternatives, maintained relationship
Red flag: Just said no, or always says yes
```
### Forensic Resume Questions (Pressure Tests)
For each resume highlight, design verification questions:
```
Pattern: "[Impressive claim on resume]"
→ "Walk me through [specific project]. What was the state when you joined?"
→ "What was YOUR specific contribution vs the team's?"
→ "What was the hardest technical problem YOU solved?"
→ "If I called your manager from that time, what would they say was your biggest weakness?"
Pattern: "Led team of X"
→ "How many people reported to you directly?"
→ "Name someone you had to give tough feedback to. What happened?"
→ "Who was the weakest performer? What did you do about it?"
Pattern: "Improved X by Y%"
→ "What was the baseline measurement? How did you measure it?"
→ "What was it before you started? After? How long did it take?"
→ "What else changed during that period that could explain the improvement?"
Pattern: "Short tenure (< 1 year)"
→ "Walk me through your decision to leave [company]."
→ "What would your manager there say about your departure?"
→ "What did you learn from that experience?"
Pattern: "Gap in employment"
→ Ask once, move on. Don't dwell. Valid reasons: health, family, travel, learning, job market.
→ Red flag only if: story keeps changing, or they're evasive about a very long gap
```
### Future Simulation Questions (Performance Prediction)
Design scenario questions based on the actual role's outcomes:
```
Template:
"In this role, one of your first challenges will be [outcome from scorecard].
The current situation is [honest context].
Walk me through how you'd approach this in your first [timeframe]."
Example (Senior Backend):
"Our API currently handles 2K RPS but we need to scale to 50K by Q3.
The codebase is a 3-year-old Node.js monolith with PostgreSQL.
Budget for infrastructure is $10K/mo. Team is 4 engineers including you.
How would you approach this?"
Probe sequence:
1. "What would you do in week 1?" (Information gathering)
2. "What data would you need?" (Analytical thinking)
3. "What are the biggest risks?" (Risk awareness)
4. "If [constraint changes], how does your approach change?" (Adaptability)
5. "How would you communicate progress to stakeholders?" (Communication)
Scoring:
5 — Structured approach, asks clarifying questions, identifies trade-offs, realistic timeline
4 — Good approach with minor gaps
3 — Reasonable but generic, doesn't probe assumptions
2 — Jumps to solution without understanding problem
1 — No coherent approach, or unrealistic
```
---
## Phase 4: Technical Assessments
### Live Coding Assessment Design
```yaml
coding_assessment:
  duration: "60 min"

  structure:
    warm_up: "5 min — environment setup, introduce the problem"
    problem_1: "20 min — core implementation"
    problem_2: "25 min — extension or new problem"
    debrief: "10 min — trade-offs discussion"

  problem_design_rules:
    - Solvable in the time limit (test it yourself first — halve your time)
    - Multiple valid approaches (no single "right answer")
    - Extension points for stronger candidates
    - Relevant to actual work (not algorithm puzzles unless role requires it)
    - Candidate chooses their language
    - Provide starter code / boilerplate to reduce setup time

  evaluation_rubric:
    problem_solving:
      5: "Breaks down problem, considers edge cases upfront, efficient approach"
      3: "Gets to solution but misses edge cases or takes indirect path"
      1: "Struggles to break down problem, no clear approach"
    code_quality:
      5: "Clean, readable, well-named, handles errors, testable"
      3: "Works but messy, some error handling, reasonable naming"
      1: "Barely works, no error handling, unclear naming"
    communication:
      5: "Thinks aloud, explains trade-offs, asks clarifying questions"
      3: "Some explanation, responds to prompts"
      1: "Silent, defensive about suggestions, doesn't explain reasoning"
    testing_awareness:
      5: "Writes tests unprompted, considers edge cases, talks about test strategy"
      3: "Writes tests when prompted, covers happy path"
      1: "No testing consideration"
    speed_and_fluency:
      5: "Fast, clearly experienced, language/tooling fluent"
      3: "Reasonable pace, occasional lookups"
      1: "Very slow, struggles with syntax/tooling"

  do_not:
    - Ask trick questions or gotchas
    - Time pressure beyond reasonable
    - Penalize for looking things up
    - Judge IDE/editor choice
    - Ask questions that require proprietary knowledge
```
### Take-Home Assessment Design
```yaml
take_home:
  time_limit: "3-4 hours (honor system, state clearly)"
  deadline: "5-7 days from send"

  problem_design:
    - Real-world scenario (not academic)
    - Clear requirements with defined scope
    - Extension section for candidates who want to show more
    - Starter repo with CI, linting, test framework pre-configured

  deliverables:
    required:
      - Working solution
      - "Tests (at minimum: happy path + 2 edge cases)"
      - README explaining approach, trade-offs, what you'd improve
    optional:
      - Architecture diagram
      - Performance analysis
      - Additional features from extension section

  evaluation_rubric:
    functionality: "30% — Does it work? Edge cases handled?"
    code_quality: "25% — Clean, readable, maintainable, well-structured"
    testing: "20% — Coverage, meaningful tests, edge cases"
    documentation: "15% — README quality, trade-off explanations"
    extras: "10% — Extension features, thoughtful additions"

  anti_gaming:
    - Check git history (single mega-commit = suspicious)
    - Ask about implementation details in follow-up interview
    - Vary the problem slightly across candidates
    - "Time the follow-up discussion: over-engineered solutions + can't explain = red flag"
```
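The rubric weights combine into a single 1-5 score. A small sketch showing the arithmetic (the dimension names come from the rubric above; the example scores are made up):

```python
# Rubric weights from the take-home evaluation rubric; they sum to 1.0
WEIGHTS = {
    "functionality": 0.30,
    "code_quality": 0.25,
    "testing": 0.20,
    "documentation": 0.15,
    "extras": 0.10,
}

def take_home_score(scores: dict) -> float:
    """Weighted 1-5 score; raises KeyError if a rubric dimension is missing."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return round(sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS), 2)

print(take_home_score({"functionality": 4, "code_quality": 4,
                       "testing": 3, "documentation": 5, "extras": 2}))  # → 3.75
```

Keeping the weighted score on the same 1-5 scale as the live interviews makes the take-home directly comparable in the debrief.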
### System Design Assessment (Senior+)
```yaml
system_design:
  duration: "45-60 min"

  structure:
    requirements: "10 min — clarify scope, constraints, scale"
    high_level: "15 min — components, data flow, API design"
    deep_dive: "15 min — pick 1-2 areas to go deep"
    trade_offs: "10 min — discuss alternatives, failure modes"
    extensions: "5 min — how would this evolve?"

  evaluation:
    requirements_gathering:
      5: "Asks about scale, users, latency requirements, budget before designing"
      3: "Some clarifying questions but misses key constraints"
      1: "Jumps straight to drawing boxes"
    high_level_design:
      5: "Clear components with well-defined boundaries, data flows make sense"
      3: "Reasonable architecture but some unclear responsibilities"
      1: "Vague boxes with arrows, can't explain data flow"
    depth:
      5: "Deep knowledge in chosen area, considers failure modes, cites real experience"
      3: "Good knowledge but stays surface level"
      1: "Can't go deep on any component"
    trade_off_awareness:
      5: "Explicitly names trade-offs, compares alternatives, knows when each fits"
      3: "Acknowledges trade-offs when prompted"
      1: "Presents one approach as the only option"
    scalability:
      5: "Considers growth path, bottleneck identification, realistic scaling strategy"
      3: "Basic scaling awareness"
      1: "No consideration of scale or unrealistic assumptions"
```
---
## Phase 5: Evaluation & Decision
### Per-Interviewer Scorecard
```yaml
interviewer_scorecard:
  candidate: "[name]"
  interviewer: "[name]"
  stage: "[which interview]"
  date: "YYYY-MM-DD"

  # Score BEFORE reading other interviewers' feedback
  overall: 1-5  # 1=Strong No, 2=Lean No, 3=Neutral, 4=Lean Yes, 5=Strong Yes

  competency_scores:
    - competency: "[from scorecard]"
      score: 1-5
      evidence: "[Specific quote or behavior observed]"
    - competency: "[from scorecard]"
      score: 1-5
      evidence: "[Specific quote or behavior observed]"

  green_signals:
    - "[Specific positive indicator with evidence]"

  red_flags:
    - "[Specific concern with evidence]"

  questions_for_next_interviewer:
    - "[What to probe further]"

  # IMPORTANT: Submit before debrief. Do not change after discussion.
```
### Debrief Protocol
```
1. BEFORE debrief:
   - All interviewers submit scorecards independently
   - Hiring manager collects but does NOT share scores

2. DEBRIEF structure (30-45 min):
   a. Each interviewer states their overall vote FIRST (no explanation yet)
      → This prevents anchoring bias from persuasive speakers
   b. Lowest scorer goes first (explain concerns)
      → Prevents positive bias from drowning out concerns
   c. Highest scorer responds
   d. Open discussion — focus on EVIDENCE not feelings
      → "They seemed smart" is not evidence
      → "They designed a cache invalidation strategy that handled..." IS evidence
   e. Address conflicting signals:
      → If strong yes + strong no on same competency, that's the discussion
      → Resolve with: "What specific behavior did you observe?"
   f. Final vote (all interviewers):
      → Strong Hire / Hire / No Hire / Strong No Hire
      → Any "Strong No Hire" triggers discussion but NOT automatic rejection
      → Hiring manager makes final call but must document reasoning

3. AFTER debrief:
   - Decision recorded with reasoning
   - Feedback compiled for candidate (regardless of outcome)
   - Action items assigned (offer prep or rejection with feedback)
```
### Scoring Decision Matrix
```
Strong Hire (all 4-5):
  → Make offer within 24 hours
  → Expedite process — strong candidates have multiple offers

Hire (mix of 3-5, no 1s):
  → Make offer within 48 hours
  → Address any 3-scores with targeted onboarding plan

Borderline (mix of 2-4):
  → Additional data needed — one more focused interview on weak areas
  → Set a deadline: if still borderline after additional data → No Hire
  → "When in doubt, don't hire" — the cost of a bad hire > cost of continuing search

No Hire (any 1, or multiple 2s):
  → Decline with specific, constructive feedback
  → Document clearly for future reference (candidate may reapply)

Strong No Hire (multiple 1s or deal breaker):
  → Immediate decline
  → Review: did we miss this in screening? Fix the funnel.
```
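The matrix is mechanical enough to express in code, which is useful for sanity-checking debrief outcomes against the rules. One reading of it in Python (it treats any remaining mix of 2-4 as Borderline; the function is a sketch, not part of the skill):

```python
def hiring_decision(scores: list[int], deal_breaker: bool = False) -> str:
    """Map per-competency scores (1-5) to the decision matrix above."""
    if deal_breaker or scores.count(1) > 1:   # multiple 1s or deal breaker
        return "Strong No Hire"
    if 1 in scores or scores.count(2) > 1:    # any 1, or multiple 2s
        return "No Hire"
    if min(scores) >= 4:                      # all 4-5
        return "Strong Hire"
    if min(scores) >= 3:                      # mix of 3-5, no 1s
        return "Hire"
    return "Borderline"                       # mix of 2-4: gather more data

print(hiring_decision([4, 5, 4, 4]))  # → Strong Hire
print(hiring_decision([3, 4, 5, 3]))  # → Hire
print(hiring_decision([2, 4, 3, 4]))  # → Borderline
print(hiring_decision([1, 4, 4, 4]))  # → No Hire
```

Note the ordering matters: deal breakers and No Hire conditions are checked before any positive outcome, matching the "when in doubt, don't hire" stance.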
---
## Phase 6: Bias Mitigation
### Pre-Interview Bias Checks
```
Before each interview, remind yourself:
□ I will evaluate against the SCORECARD, not my "gut feeling"
□ I will give the same weight to disconfirming evidence as confirming
□ I will not let one great/terrible answer color the entire evaluation
□ I will not compare this candidate to the last one — compare to the scorecard
□ I will note specific behaviors, not general impressions
□ I will not evaluate "culture fit" as "would I have a beer with them"
```
### Common Biases in Hiring
| Bias | What It Looks Like | Mitigation |
|------|--------------------|------------|
| **Halo effect** | Great at coding → assume great at everything | Score each competency independently |
| **Horn effect** | Weak communication → assume weak technically | Same: score independently |
| **Similarity bias** | "Reminds me of me" → favorable rating | Evaluate against scorecard, not self |
| **Anchoring** | First impression sets the tone | Score after all questions, not during |
| **Confirmation bias** | Early positive → only notice positives | Actively look for counter-evidence |
| **Contrast effect** | Looks great after a weak candidate | Compare to scorecard, not other candidates |
| **Recency bias** | Remember last answer, forget first | Take notes during interview |
| **Attribution error** | Success = skill, failure = circumstances | Probe both: "What went wrong? What helped?" |
| **Leniency bias** | Avoid conflict, rate everyone 3-4 | Force yourself to use the full 1-5 scale |
| **Urgency bias** | "We need someone NOW" → lower bar | Never lower scorecard standards — extend timeline instead |
### Structured Interview Rules
1. **Same questions for same role** — every candidate gets the same core questions
2. **Score immediately after** — before discussing with anyone
3. **Evidence-based only** — every score needs a specific observation
4. **Diverse panel** — at least one interviewer from a different team/background
5. **Blind resume screen** — remove name, school, company names for initial screen (if possible)
6. **No leading questions** — "You're probably great at X, right?" → "Tell me about your experience with X"
7. **Time-boxed** — same duration for every candidate (don't cut short or extend based on vibes)
---
## Phase 7: Candidate Experience
### Communication Templates
**After each stage — within 24 hours:**
```
ADVANCING:
"Hi [name], thank you for your time today. We enjoyed our conversation about [specific topic].
We'd like to move forward with [next stage]. [Interviewer name] will be speaking with you
about [topic]. Available times: [options].
Any questions before then? — [recruiter name]"
REJECTION (after phone screen):
"Hi [name], thank you for taking the time to speak with us about [role].
After careful consideration, we've decided not to move forward at this stage.
[One specific, constructive piece of feedback if appropriate].
We'll keep your information on file and may reach out for future opportunities that
align more closely. Wishing you the best in your search. — [name]"
REJECTION (after onsite):
"Hi [name], thank you for investing [X hours] in our interview process.
We were impressed by [specific positive], but ultimately decided to move forward
with a candidate whose [specific competency] more closely matches our current needs.
Feedback: [1-2 specific, actionable items].
We genuinely appreciated your time and would welcome a future conversation
if circumstances change. — [hiring manager name]"
OFFER (verbal, then written within 24h):
"Hi [name], I'm excited to share that we'd like to offer you the [role] position.
We were particularly impressed by [specific evidence from interviews].
Here's what we're proposing: [comp summary]. I'll send the formal offer letter
within 24 hours. Do you have any initial questions? — [hiring manager]"
```
### Candidate Experience Scorecard
After every hire (and quarterly for all candidates):
| Dimension | Target | How to Measure |
|-----------|--------|----------------|
| Time to schedule | < 48h between stages | Track in ATS |
| Interviewer preparedness | 100% read scorecard before | Post-interview survey |
| Communication timeliness | < 24h response | Track in ATS |
| Feedback quality | Specific + actionable | Candidate survey |
| Overall experience | 4+/5 | Candidate survey (all, not just hires) |
| Offer acceptance rate | > 80% | Track in ATS |
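The time-based targets above can be checked automatically against ATS event data. A hedged sketch, assuming stage-transition events exported as `(stage, ISO timestamp)` pairs (the event shape and field names are assumptions about your ATS export, not a real API):

```python
from datetime import datetime

def slow_transitions(events, max_hours=48):
    """events: ordered [(stage_name, 'YYYY-MM-DDTHH:MM')]. Returns stage pairs
    whose gap exceeded the target, with the gap in hours."""
    out = []
    for (s1, t1), (s2, t2) in zip(events, events[1:]):
        gap = datetime.fromisoformat(t2) - datetime.fromisoformat(t1)
        hours = gap.total_seconds() / 3600
        if hours > max_hours:
            out.append((s1, s2, round(hours, 1)))
    return out

events = [("screen", "2025-03-03T10:00"),
          ("technical", "2025-03-04T15:00"),
          ("onsite", "2025-03-10T09:00")]
print(slow_transitions(events))  # → [('technical', 'onsite', 138.0)]
```

Running this across all active candidates each morning surfaces scheduling stalls before they cost you a candidate.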
---
## Phase 8: Process Audit & Improvement
### Quarterly Hiring Review
```yaml
quarterly_review:
  period: "Q[N] YYYY"

  funnel_metrics:
    applications: N
    screens_passed: N   # → Screen pass rate
    onsites: N          # → Onsite conversion rate
    offers: N           # → Offer rate
    accepts: N          # → Acceptance rate

  quality_metrics:
    ninety_day_retention: "X%"
    manager_satisfaction_90d: "X/5"
    time_to_productivity: "X weeks"
    regretted_attrition_1yr: "X%"

  process_metrics:
    time_to_fill: "X days (target: <30)"
    time_in_stage:
      screen: "X days"
      onsite: "X days"
      decision: "X days"
      offer: "X days"
    interviewer_calibration: "score variance across interviewers"

  actions:
    - "[Improvement 1 based on metrics]"
    - "[Improvement 2]"
```
### Interview Question Effectiveness Tracking
For each question in your bank, track:
```yaml
question_effectiveness:
  question: "[question text]"
  times_asked: N
  signal_quality:
    strong_differentiator: N  # Times this question clearly separated strong/weak
    no_signal: N              # Times everyone answered similarly
    confusing: N              # Times candidates misunderstood
  # If no_signal > 50% → Replace the question
  # If confusing > 20% → Reword the question
  # If strong_differentiator > 70% → Keep and promote
```
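The replace/reword/keep thresholds above translate directly into a small triage function. A sketch (the function name and the ordering of checks are my own; the thresholds are the ones stated in the template):

```python
def triage_question(times_asked: int, strong: int, no_signal: int,
                    confusing: int) -> str:
    """Apply the question-bank thresholds: replace > reword > promote > keep."""
    if times_asked == 0:
        return "insufficient data"
    if no_signal / times_asked > 0.50:      # everyone answers similarly
        return "replace"
    if confusing / times_asked > 0.20:      # candidates misunderstand it
        return "reword"
    if strong / times_asked > 0.70:         # clearly separates strong/weak
        return "keep and promote"
    return "keep"

print(triage_question(20, 16, 2, 1))  # → keep and promote
print(triage_question(10, 2, 6, 0))   # → replace
```

Checking "replace" before "promote" matters: a question that confuses half the candidates should go even if the other half answer it brilliantly.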
### Interviewer Calibration
```
Monthly: Compare interviewer scores across candidates
- Interviewer A averages 4.2, Interviewer B averages 2.8 → calibration needed
- Run calibration session: review same candidate, discuss scoring differences
- Goal: interviewers should be within 0.5 points on average for same candidates
Training for new interviewers:
1. Shadow 3 interviews (observe, don't participate)
2. Reverse shadow 2 interviews (conduct, observed by experienced interviewer)
3. Solo with debrief for 3 interviews
4. Full autonomy after calibration check
```
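The monthly comparison above can be sketched in a few lines. This simplified version compares each interviewer's overall average rather than scores on shared candidates, which is a weaker but easier first pass (the data shape is an assumption):

```python
from statistics import mean
from itertools import combinations

def calibration_gaps(scores_by_interviewer: dict, threshold: float = 0.5):
    """scores_by_interviewer: {name: [scores given]}. Returns pairs of
    interviewers whose averages differ by more than the threshold."""
    averages = {name: mean(s) for name, s in scores_by_interviewer.items()}
    return [(a, b, round(abs(averages[a] - averages[b]), 2))
            for a, b in combinations(averages, 2)
            if abs(averages[a] - averages[b]) > threshold]

scores = {"A": [4, 5, 4, 4], "B": [3, 3, 2, 3], "C": [4, 4, 3, 4]}
print(calibration_gaps(scores))  # → [('A', 'B', 1.5), ('B', 'C', 1.0)]
```

Any flagged pair is a candidate for a calibration session; a proper check would restrict the comparison to candidates both interviewers actually scored.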
---
## Edge Cases
### Internal Candidates
- Use SAME scorecard as external (fairness)
- Different question strategy: focus on future role, not past (you already know their past)
- If not selected: manager delivers feedback personally, development plan, timeline for re-candidacy
- Never promise the internal candidate gets special treatment
### Executive Hiring
- Add: reference checks (5+ structured, including back-channel)
- Add: board/exec team dinner (culture, not evaluation)
- Add: 90-day plan presentation as final stage
- Extended scorecard: strategic thinking, board management, talent magnetism
- Use executive search firm for sourcing, but own evaluation internally
### High-Volume Hiring (10+ same role)
- Standardize EVERYTHING: same questions, same rubric, same order
- Use structured scoring sheets, not free-form notes
- Batch calibration sessions weekly
- Consider: group assessment centers for initial stages
- Track: quality variance across hiring managers (should be low)
### Remote/Async Interviews
- Test tech setup before the interview (not during)
- Camera on (both sides) — non-verbal cues matter
- Record (with consent) for calibration purposes
- Take-home > live coding for timezone-challenged candidates
- Bias alert: don't penalize for background noise, accent, or non-native English fluency
### Boomerang Employees
- Treat as new candidate (things change)
- Skip: basic company knowledge questions
- Focus: why they left, what changed, what they learned outside
- Check: has the team/role changed since they left? Do current team members want them back?
### Counteroffers
If the candidate receives a counteroffer:
- Don't panic-increase. Your offer should already be fair.
- Script: "We made our best offer based on the value of the role. We'd love to have you, but understand if you decide to stay."
- Widely cited (if rough) figure: around 80% of people who accept counteroffers leave within 18 months anyway
- If they stay: respect the decision and keep the door open
---
## Natural Language Commands
| Say | I Do |
|-----|------|
| "Design interviews for [role]" | Full loop: scorecard + structure + questions + rubrics |
| "Create a scorecard for [role]" | A-Player definition with outcomes and competencies |
| "Generate behavioral questions for [competency]" | STAR questions with probes and scoring |
| "Build a take-home for [role]" | Assessment with rubric and anti-gaming measures |
| "Design a system design interview for [level]" | Structure + evaluation rubric |
| "Evaluate candidate [name]" | Structured debrief template with scoring |
| "Create a phone screen for [role]" | 30-min structured screen with pass/fail criteria |
| "Write rejection feedback for [candidate]" | Specific, constructive rejection message |
| "Audit our interview process" | Full process review with metrics and recommendations |
| "Calibrate interviewers" | Calibration session plan with scoring alignment |
| "Design interview for [role] at [company stage]" | Adjusted for startup/growth/enterprise context |
| "Generate reference check questions for [role]" | Structured reference interview guide |