rlhf-feedback

Autonomous RLHF feedback capture - Claude self-captures mistakes and successes

16 stars

by diegosouzapw

Best use case

rlhf-feedback is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using rlhf-feedback should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/rlhf-feedback/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/rlhf-feedback/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it at .claude/skills/rlhf-feedback/SKILL.md inside your project.
3. Restart your AI agent. It will auto-discover the skill.

How rlhf-feedback Compares

Feature / Agent	rlhf-feedback	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It lets Claude capture RLHF feedback autonomously: Claude records its own mistakes and successes without the user running any commands.

Where can I find the source code?

The source code is on GitHub in the diegosouzapw/awesome-omni-skill repository, at skills/development/rlhf-feedback/SKILL.md.

SKILL.md Source

# RLHF Feedback Capture Skill

**AUTONOMOUS** - Claude captures feedback without the user running any commands.

## When to Capture Feedback

### Capture (Thumbs Down) When:
- User says "that's wrong", "no", "incorrect", "that broke something"
- User corrects my answer
- User has to repeat themselves
- I made an assumption that was wrong
- Code I wrote caused errors
- I gave instructions instead of acting (violated ACT DON'T INSTRUCT)

### Capture (Thumbs Up) When:
- User says "good", "thanks", "that worked", "perfect"
- Task completed successfully on first try
- User doesn't need to correct me
- Code works without errors
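
The explicit phrases in the two lists above could be approximated with a keyword check. The sketch below is illustrative only: `detectFeedbackSignal` and its patterns are hypothetical and not part of the skill, which relies on Claude's own reading of the conversation rather than string matching.

```javascript
// Hypothetical helper: maps a user message to "down", "up", or null (no clear signal).
// The phrases come from the lists above; the regex matching is an assumption.
const NEGATIVE = /^no\b|\b(?:that's wrong|incorrect|that broke)\b/;
const POSITIVE = /\b(?:good|thanks|that worked|perfect)\b/;

function detectFeedbackSignal(message) {
  const text = message.trim().toLowerCase();
  // Negative phrases win so that "that's wrong, thanks anyway" counts as thumbs down.
  if (NEGATIVE.test(text)) return "down";
  if (POSITIVE.test(text)) return "up";
  return null;
}
```

Keyword matching is crude: a message such as "not good" would be classified as positive here. Implicit signals like the user repeating themselves, or code that runs without errors, cannot be detected this way at all.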

## How to Capture (Claude Executes This)

```bash
# After detecting negative feedback signal:
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=down \
  --context="[What went wrong]" \
  --tags="[relevant-tags]"

# After detecting positive feedback signal:
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=up \
  --context="[What went right]" \
  --tags="[relevant-tags]"
```
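
The source of `capture-feedback.js` is not included in this SKILL.md. As a rough idea of what such a script might do with the three flags, here is a minimal sketch. The function names and the record fields (`timestamp`, `feedback`, `context`, `tags`) are assumptions, not the script's actual schema.

```javascript
// Hypothetical sketch of a capture script; the real capture-feedback.js is not shown here.
const fs = require("node:fs");
const path = require("node:path");

// Parse "--key=value" arguments into an object, e.g. { feedback: "down", tags: "a,b" }.
function parseFlags(argv) {
  const flags = {};
  for (const arg of argv) {
    const match = /^--([^=]+)=(.*)$/s.exec(arg);
    if (match) flags[match[1]] = match[2];
  }
  return flags;
}

// Build one log record. The field names are assumptions.
function buildRecord(flags, now = new Date()) {
  if (flags.feedback !== "up" && flags.feedback !== "down") {
    throw new Error('--feedback must be "up" or "down"');
  }
  return {
    timestamp: now.toISOString(),
    feedback: flags.feedback,
    context: flags.context ?? "",
    tags: (flags.tags ?? "").split(",").map((t) => t.trim()).filter(Boolean),
  };
}

// Append the record as a single JSON line, creating the directory if needed.
function appendRecord(logFile, record) {
  fs.mkdirSync(path.dirname(logFile), { recursive: true });
  fs.appendFileSync(logFile, JSON.stringify(record) + "\n");
}
```

A script built this way would call `appendRecord(logFile, buildRecord(parseFlags(process.argv.slice(2))))`, with `logFile` pointing at the project's feedback log.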

## Domain Tags for Random Timer

Use these tags to categorize feedback:
- `timer-logic` - Timer countdown, random time generation
- `redux-state` - State management, slices, persistence
- `ui-components` - Buttons, sliders, screens
- `navigation` - React Navigation, screen transitions
- `sound-haptics` - Audio playback, vibration
- `storage` - MMKV, persistence
- `testing` - Jest, Maestro tests
- `styling` - Theme, colors, glassmorphism
- `performance` - Speed, memory, optimization

## Action Tags

- `fix` - Bug fix
- `implementation` - New feature
- `refactor` - Code restructure
- `regression` - Broke something that worked
- `assumption` - Made incorrect assumption
- `shallow-answer` - Didn't read code, gave surface answer
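
Because tags are free-form strings on the command line, a typo creates a new category without warning. One way to guard against that is to check a `--tags` value against the two lists above. This is a sketch: `validateTags` is a hypothetical helper, and nothing in the skill says the capture script performs this check.

```javascript
// Tag vocabularies copied from the Domain Tags and Action Tags lists above.
const DOMAIN_TAGS = new Set([
  "timer-logic", "redux-state", "ui-components", "navigation", "sound-haptics",
  "storage", "testing", "styling", "performance",
]);
const ACTION_TAGS = new Set([
  "fix", "implementation", "refactor", "regression", "assumption", "shallow-answer",
]);

// Hypothetical helper: split a comma-separated tag string and report unknown tags.
function validateTags(tagString) {
  const tags = tagString.split(",").map((t) => t.trim()).filter(Boolean);
  const unknown = tags.filter((t) => !DOMAIN_TAGS.has(t) && !ACTION_TAGS.has(t));
  return { tags, unknown };
}
```

For example, `validateTags("ui-components,regression")` returns an empty `unknown` array, while a misspelled tag would be listed there.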

## Examples

### User says "that broke the slider"
```bash
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=down \
  --context="Broke range slider while implementing timer fix" \
  --tags="ui-components,regression"
```

### User says "perfect, timer works now"
```bash
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=up \
  --context="Fixed timer countdown logic correctly" \
  --tags="timer-logic,fix"
```

### User has to repeat themselves
```bash
node "$CLAUDE_PROJECT_DIR/.claude/scripts/feedback/capture-feedback.js" \
  --feedback=down \
  --context="User had to repeat request - didn't understand first time" \
  --tags="assumption,shallow-answer"
```

## Data Storage

All feedback is LOCAL ONLY (excluded from git):
- `.claude/memory/feedback/feedback-log.jsonl`
- `.claude/memory/feedback/feedback-summary.json`
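
The schema of these two files is not documented here. Assuming the log holds one JSON object per line with a `feedback` field (`"up"` or `"down"`) and a `tags` array, a summary could be derived from the log roughly as follows. `summarizeFeedback` and the summary shape are hypothetical.

```javascript
// Hypothetical sketch: count thumbs up/down overall and per tag from JSONL text.
// Assumes each non-blank line is a JSON object with "feedback" and optional "tags".
function summarizeFeedback(jsonlText) {
  const summary = { up: 0, down: 0, byTag: {} };
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    const record = JSON.parse(line);
    if (record.feedback !== "up" && record.feedback !== "down") continue;
    summary[record.feedback] += 1;
    for (const tag of record.tags ?? []) {
      if (!summary.byTag[tag]) summary.byTag[tag] = { up: 0, down: 0 };
      summary.byTag[tag][record.feedback] += 1;
    }
  }
  return summary;
}
```

Per-tag counts make it easy to see which areas, such as `ui-components`, accumulate the most negative feedback.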

## Session Start Integration

At session start, the hook queries past failures to remind Claude what to avoid.
This creates a learning loop: Mistake → Capture → Warning → Avoid repeat.
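
The hook itself is not shown in this SKILL.md. As a sketch of the "query past failures" step, assuming an append-only JSONL log whose records carry `feedback` and `context` fields, the most recent thumbs-down entries could be turned into reminders like this. `recentFailures` is a hypothetical name.

```javascript
// Hypothetical sketch: return reminder strings for the newest thumbs-down records.
// Assumes one JSON object per line, with newer records appended at the end.
function recentFailures(jsonlText, limit = 5) {
  if (limit <= 0) return [];
  const failures = jsonlText
    .split("\n")
    .filter((line) => line.trim())
    .map((line) => JSON.parse(line))
    .filter((record) => record.feedback === "down");
  // Take the last `limit` failures and list the newest first.
  return failures.slice(-limit).reverse().map((r) => `Avoid: ${r.context}`);
}
```

A session-start hook could print these strings so that they land in Claude's context before the first user request.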

## IMPORTANT: Act Don't Instruct

**Claude EXECUTES the capture command directly.**
Never tell the user to run it - just run it.

Related Skills

ai-feedback-loop-optimizer

from diegosouzapw/awesome-omni-skill

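AI feedback loop optimization skill. Automates the iterative cycle of prompt → output → evaluation → improvement. Produces the highest-quality results through incremental improvement, A/B testing, convergence detection, and best-output selection.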

analyzing-user-feedback

from diegosouzapw/awesome-omni-skill

Help users synthesize and act on customer feedback. Use when someone is analyzing NPS responses, processing support tickets, reviewing user research, synthesizing feedback from multiple channels, or trying to identify patterns in customer input.

feedback-observing

from diegosouzapw/awesome-omni-skill

Read agent interaction logs across packs and detect operational patterns. Emits state-transition FeedbackEvents for degradation/recovery.

addressing-pr-feedback

from diegosouzapw/awesome-omni-skill

Fetches, organizes, and addresses PR review comments from GitHub. Use when user asks to review PR comments, fix PR feedback, check what reviewers said, address review comments, or handle bot suggestions on a pull request. Triggers on "review PR", "fix comments", "PR feedback", "what did reviewers say", "address PR feedback", "check PR comments".

Async Feedback Loop

from diegosouzapw/awesome-omni-skill

Enables mid-stream course correction by monitoring a FEEDBACK.md file for user interventions. Allows the agent to incorporate new instructions without restarting the task.

ai-orchestration-feedback-loop

from diegosouzapw/awesome-omni-skill

Multi-AI engineering loop orchestrating Claude, Codex, and Gemini for comprehensive validation. USE WHEN (1) mission-critical features requiring multi-perspective validation, (2) complex architectural decisions needing diverse AI viewpoints, (3) security-sensitive code requiring deep analysis, (4) user explicitly requests multi-AI review or triple-AI loop. DO NOT USE for simple features or single-file changes. MODES - Triple-AI (full coverage), Dual-AI Codex-Claude (security/logic), Dual-AI Gemini-Claude (UX/creativity).

bgo

from diegosouzapw/awesome-omni-skill

Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.

Coding & Development

pcf-tooling

from diegosouzapw/awesome-omni-skill

Get Microsoft Power Platform CLI tooling for Power Apps Component Framework. Triggers on: **/*.{ts,tsx,js,json,xml,pcfproj,csproj}

pcf-code-components

from diegosouzapw/awesome-omni-skill

Understanding code components structure and implementation. Triggers on: **/*.{ts,tsx,js,json,xml,pcfproj,csproj}

pcf-canvas-apps

from diegosouzapw/awesome-omni-skill

Code components for canvas apps: implementation, security, and configuration. Triggers on: **/*.{ts,tsx,js,json,xml,pcfproj,csproj}

nextjs15-react19-vercelai-tailwind-cursorrules-prompt-file-cursorrules

from diegosouzapw/awesome-omni-skill

Best practices for using Tailwind CSS in Next.js 15 and React 19 applications, including responsive design, custom configurations, and performance optimization. Globs: app/**/*

nextjs

from diegosouzapw/awesome-omni-skill

Next.js framework best practices including App Router, data fetching, and performance optimization.