rlhf-feedback

Autonomous RLHF feedback capture - Claude self-captures mistakes and successes

16 stars

Best use case

rlhf-feedback is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using rlhf-feedback should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/rlhf-feedback/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/rlhf-feedback/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/rlhf-feedback/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How rlhf-feedback Compares

| Feature / Agent | rlhf-feedback | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It lets Claude capture RLHF feedback autonomously: Claude logs its own mistakes and successes without the user running any commands.

Where can I find the source code?

The SKILL.md source lives on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/development/rlhf-feedback/.

SKILL.md Source

# RLHF Feedback Capture Skill

**AUTONOMOUS** - Claude captures feedback without user running commands.

## When to Capture Feedback

### Capture (Thumbs Down) When:
- User says "that's wrong", "no", "incorrect", "that broke something"
- User corrects my answer
- User has to repeat themselves
- I made an assumption that was wrong
- Code I wrote caused errors
- I gave instructions instead of acting (violated ACT DON'T INSTRUCT)

### Capture (Thumbs Up) When:
- User says "good", "thanks", "that worked", "perfect"
- Task completed successfully on first try
- User doesn't need to correct me
- Code works without errors
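
The explicit verbal signals above can be sketched as a small classifier. This is a minimal illustration, not the skill's actual detection logic: the function name `classifySignal` and the regex matching strategy are assumptions, and only the phrases themselves come from the lists above. Implicit signals (repeated requests, failing code) are not covered.

```javascript
// Hypothetical helper: maps a user message to "down", "up", or null.
// Phrases are taken from the skill text; "that broke" is a loosened form
// of "that broke something" so that "that broke the slider" also matches.
const NEGATIVE = [/\bthat's wrong\b/, /\bno\b/, /\bincorrect\b/, /\bthat broke\b/];
const POSITIVE = [/\bgood\b/, /\bthanks\b/, /\bthat worked\b/, /\bperfect\b/];

function classifySignal(message) {
  const text = message.toLowerCase();
  // Negatives are checked first so "no, that's not good" is not read as praise.
  if (NEGATIVE.some((re) => re.test(text))) return "down";
  if (POSITIVE.some((re) => re.test(text))) return "up";
  return null; // no explicit signal, so nothing to capture
}
```

Keyword matching of this kind is deliberately crude: a message such as "I have no complaints" would be classified as negative, so in practice the surrounding conversation still has to be taken into account.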

## How to Capture (Claude Executes This)

```bash
# After detecting negative feedback signal:
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=down \
  --context="[What went wrong]" \
  --tags="[relevant-tags]"

# After detecting positive feedback signal:
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=up \
  --context="[What went right]" \
  --tags="[relevant-tags]"
```

## Domain Tags for Random Timer

Use these tags to categorize feedback:
- `timer-logic` - Timer countdown, random time generation
- `redux-state` - State management, slices, persistence
- `ui-components` - Buttons, sliders, screens
- `navigation` - React Navigation, screen transitions
- `sound-haptics` - Audio playback, vibration
- `storage` - MMKV, persistence
- `testing` - Jest, Maestro tests
- `styling` - Theme, colors, glassmorphism
- `performance` - Speed, memory, optimization

## Action Tags

- `fix` - Bug fix
- `implementation` - New feature
- `refactor` - Code restructure
- `regression` - Broke something that worked
- `assumption` - Made incorrect assumption
- `shallow-answer` - Didn't read code, gave surface answer
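
One way to keep tags consistent is to check them against the two lists above before building the `--tags` argument. The sketch below assumes that only listed tags are allowed, which the skill does not state explicitly. `buildTagsArg` is a hypothetical helper, not part of `capture-feedback.js`.

```javascript
// Tag vocabularies copied from the "Domain Tags" and "Action Tags" lists.
const DOMAIN_TAGS = [
  "timer-logic", "redux-state", "ui-components", "navigation",
  "sound-haptics", "storage", "testing", "styling", "performance",
];
const ACTION_TAGS = [
  "fix", "implementation", "refactor", "regression", "assumption", "shallow-answer",
];
const KNOWN_TAGS = new Set([...DOMAIN_TAGS, ...ACTION_TAGS]);

// Hypothetical helper: rejects unknown tags, then formats the CLI argument.
function buildTagsArg(tags) {
  const unknown = tags.filter((tag) => !KNOWN_TAGS.has(tag));
  if (unknown.length > 0) {
    throw new Error(`Unknown tags: ${unknown.join(", ")}`);
  }
  return `--tags=${tags.join(",")}`;
}
```

For example, `buildTagsArg(["ui-components", "regression"])` yields the argument string used in the slider example, while a misspelled tag fails loudly instead of polluting the log.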

## Examples

### User says "that broke the slider"
```bash
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=down \
  --context="Broke range slider while implementing timer fix" \
  --tags="ui-components,regression"
```

### User says "perfect, timer works now"
```bash
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=up \
  --context="Fixed timer countdown logic correctly" \
  --tags="timer-logic,fix"
```

### User has to repeat themselves
```bash
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=down \
  --context="User had to repeat request - didn't understand first time" \
  --tags="assumption,shallow-answer"
```

## Data Storage

All feedback is LOCAL ONLY (excluded from git):
- `.claude/memory/feedback/feedback-log.jsonl`
- `.claude/memory/feedback/feedback-summary.json`
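
The skill does not show how the two files are written. As a rough sketch, a capture step could append one JSON object per line to the log and keep running counts in the summary. The record fields (`timestamp`, `feedback`, `context`, `tags`), the summary shape, and the function name `recordFeedback` are all assumptions made for illustration.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical writer: appends to feedback-log.jsonl and updates
// feedback-summary.json inside `dir`. The schema is assumed, not documented.
function recordFeedback(dir, { feedback, context, tags }) {
  fs.mkdirSync(dir, { recursive: true });

  // JSONL: one self-contained JSON object per line.
  const entry = { timestamp: new Date().toISOString(), feedback, context, tags };
  fs.appendFileSync(path.join(dir, "feedback-log.jsonl"), JSON.stringify(entry) + "\n");

  // Summary: simple up/down counters, created on first use.
  const summaryPath = path.join(dir, "feedback-summary.json");
  const summary = fs.existsSync(summaryPath)
    ? JSON.parse(fs.readFileSync(summaryPath, "utf8"))
    : { up: 0, down: 0 };
  summary[feedback] = (summary[feedback] || 0) + 1;
  fs.writeFileSync(summaryPath, JSON.stringify(summary, null, 2));

  return entry;
}
```

Called with `".claude/memory/feedback"` as `dir`, this would produce the two files listed above. Append-only JSONL keeps each capture cheap and makes the log easy to scan line by line later.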

## Session Start Integration

At session start, the hook queries past failures to remind Claude what to avoid.
This creates a learning loop: Mistake → Capture → Warning → Avoid repeat.
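
The hook itself is not included in the skill. A minimal sketch of the "query past failures" step might read the JSONL log, keep the most recent thumbs-down entries, and turn them into reminder lines. It assumes each log line is a JSON object with `feedback`, `context`, and `tags` fields. The function name `loadPastFailures` and the reminder wording are hypothetical.

```javascript
const fs = require("node:fs");

// Hypothetical session-start query: returns reminders for the most recent
// negative entries in a JSONL feedback log (assumed record schema).
function loadPastFailures(logPath, limit = 5) {
  if (!fs.existsSync(logPath)) return []; // first session: nothing to warn about
  return fs
    .readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line))
    .filter((entry) => entry.feedback === "down")
    .slice(-limit) // keep only the latest failures
    .map((entry) => `Avoid: ${entry.context} [${(entry.tags || []).join(", ")}]`);
}
```

Printing the returned lines at session start would close the loop described above: each captured mistake resurfaces as a warning before new work begins.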

## IMPORTANT: Act Don't Instruct

**Claude EXECUTES the capture command directly.**
Never tell the user to run it - just run it.
