research-agent-optimization

Optimize the research agent for rate limit handling, API call efficiency, web search integration fixes, and improved streaming UX with granular progress updates and source attribution.

242 stars

by aiskillstore

Best use case

research-agent-optimization is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It targets a research agent that needs better rate limit handling, more efficient API calls, a working web search integration, and a streaming UX with granular progress updates and source attribution.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "research-agent-optimization" skill to help with this workflow task. Context: Optimize the research agent for rate limit handling, API call efficiency, web search integration fixes, and improved streaming UX with granular progress updates and source attribution.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/research-agent-optimization/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/benderfendor/research-agent-optimization/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/research-agent-optimization/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How research-agent-optimization Compares

Feature / Agent	research-agent-optimization	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Optimize the research agent for rate limit handling, API call efficiency, web search integration fixes, and improved streaming UX with granular progress updates and source attribution.

Where can I find the source code?

The source is on GitHub in the aiskillstore/marketplace repository, under skills/benderfendor/research-agent-optimization/.

SKILL.md Source

# Research Agent Optimization

## Scope
- Project root: `/home/bender/classwork/Thesis`
- Backend: `backend/news_research_agent.py`, `backend/app/api/routes/research.py`, `backend/app/services/news_research.py`
- Frontend: `frontend/app/search/page.tsx`, `frontend/lib/api.ts`
- Configuration: `backend/app/core/config.py`

## Problem Statement
1. **Rate Limiting**: Gemini API hits 429 quota exceeded errors during research and article analysis
2. **Web Search**: DuckDuckGo tool integration has naming issues (not properly initialized)
3. **Unclear Progress**: Research streaming shows generic "Still working..." instead of specific tool calls
4. **JSON in Response**: Results show raw JSON blocks instead of formatted source cards
5. **Redundant API Calls**: Multiple internal search calls without caching/deduplication

## Required Outcomes
- Graceful rate limit handling with exponential backoff and quota monitoring
- Working web search tool with proper DuckDuckGo initialization
- Verbose streaming events showing real tool execution (web_search, news_search, internal_news_search)
- Research results rendered with inline source cards (not JSON blocks)
- Optimized API calls: batch searches, cache semantic results, reuse internal knowledge base
- Clear error messages when quota is exceeded

## Workflow

### 1. API Call Optimization
- Implement request batching in `search_internal_news` tool
- Add caching layer for semantic search results (avoid duplicate queries within 5min window)
- Combine web_search + news_search into single result set
- Track API call counts per session and warn before quota exhaustion
- Add exponential backoff retry logic (1s, 2s, 4s, 8s max)

**Files:**
- `backend/news_research_agent.py` - tools and caching
- `backend/app/services/news_research.py` - request batching helpers
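
The caching bullet above can be sketched as a small time-to-live cache keyed on a normalized query string. This is a minimal sketch, not the project's actual code: the class name `TTLSearchCache`, the `get_or_fetch` method, and the injected `fetch` callable are all hypothetical, and the 300-second default simply mirrors the 5-minute window mentioned above.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLSearchCache:
    """Cache search results per normalized query for a fixed time window.

    Hypothetical helper: the real `search_internal_news` tool may look different.
    """

    def __init__(
        self,
        ttl_seconds: float = 300.0,  # 5-minute window from the workflow notes
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[dict]]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query: str) -> str:
        # Collapse case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def get_or_fetch(self, query: str, fetch: Callable[[str], List[dict]]) -> List[dict]:
        """Return cached results if still fresh, otherwise call `fetch` and store them."""
        key = self._key(query)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            self.hits += 1
            return cached[1]
        self.misses += 1
        results = fetch(query)
        self._entries[key] = (now, results)
        return results
```

The `hits` and `misses` counters are there so that tool execution logs can report cache hits for repeated queries. Passing the clock in as a parameter makes expiry testable without waiting five minutes.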

### 2. Rate Limit & Quota Handling
- Add a `try`/`except` wrapper around Gemini calls
- Detect 429 errors and return user-friendly message ("API Rate Limit: ...please wait a moment...")
- Add optional `--skip-gemini-analysis` mode for article analysis when quota is low
- Log quota usage and remaining tokens
- Set model to `gemini-2.0-flash` (faster, lower token cost) instead of `gemini-2.0-flash-exp`

**Files:**
- `backend/app/core/config.py` - error handling wrapper, model selection
- `backend/app/api/routes/research.py` - HTTP error responses
- `backend/news_research_agent.py` - LLM call error handling
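
One way to combine the retry schedule (1s, 2s, 4s, 8s) with the user-friendly 429 message is a small wrapper around the LLM call. The sketch below is illustrative only: `RateLimitError` is a hypothetical stand-in for whatever exception the Gemini client actually raises on a 429, and `call_with_backoff` and `QuotaExhausted` are names invented for this example.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

RATE_LIMIT_MESSAGE = (
    "API Rate Limit: The AI service has reached its rate limit. "
    "Please wait a moment and try again."
)


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's HTTP 429 exception."""


class QuotaExhausted(Exception):
    """Raised after all retries fail; its message is safe to show to users."""


def call_with_backoff(
    fn: Callable[[], T],
    delays: Sequence[float] = (1.0, 2.0, 4.0, 8.0),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, sleeping for each delay in turn after a rate-limit failure."""
    for delay in delays:
        try:
            return fn()
        except RateLimitError:
            sleep(delay)  # wait before the next attempt
    try:
        return fn()  # final attempt after the longest delay
    except RateLimitError as exc:
        # Convert to an error whose text can be shown without a stack trace.
        raise QuotaExhausted(RATE_LIMIT_MESSAGE) from exc
```

With the default schedule this makes at most five attempts. Only the rate-limit exception is retried; any other error propagates immediately so that real bugs are not hidden behind retries.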

### 3. Web Search Tool Fix
- Verify DuckDuckGo import: `from duckduckgo_search import DDGS` (not `ddgs` or `DuckDuckGo`)
- Ensure `web_search` and `news_search` tools are properly bound to LLM
- Add fallback to internal search if web search fails
- Log tool execution with query and result count

**Files:**
- `backend/news_research_agent.py` - tool definitions and error handling
- Use `exa-code` to verify current DuckDuckGo API patterns
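
The fallback and logging bullets above could look roughly like the following. To keep the sketch independent of any particular DuckDuckGo client version, the search functions are passed in as plain callables; `search_with_fallback`, the logger name, and the choice to also fall back on an empty result list are assumptions made for this example.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("research_agent.tools")  # hypothetical logger name

SearchFn = Callable[[str], List[dict]]


def search_with_fallback(query: str, primary: SearchFn, fallback: SearchFn) -> List[dict]:
    """Run `primary`; if it raises or returns nothing, use `fallback` instead."""
    try:
        results = primary(query)
        logger.info("web_search query=%r results=%d", query, len(results))
        if results:
            return results
    except Exception as exc:  # broad on purpose: any tool failure triggers the fallback
        logger.warning("web_search failed for query=%r: %s", query, exc)
    results = fallback(query)
    logger.info("search_internal_news query=%r results=%d", query, len(results))
    return results
```

In the agent, `primary` would be a thin function around the `DDGS` client and `fallback` would call the internal search tool. Injecting both keeps the fallback path testable without network access.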

### 4. Streaming Progress Clarity
- Expand SSE event types: `tool_start` includes tool name + query parameters
- Map tool events to user-friendly messages:
  - `web_search("climate change")` → "Searching web for: climate change..."
  - `news_search(keywords="COP30")` → "Searching news for: COP30..."
  - `search_internal_news(query)` → "Searching internal knowledge base..."
  - `fetch_article_content(url)` → "Reading article: [title/domain]..."
- Add timestamps and tool execution duration
- Emit status updates every 3-5 seconds if no tool activity

**Files:**
- `backend/news_research_agent.py` - streaming generator
- `backend/app/api/routes/research.py` - SSE formatting
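
The event mapping above can be sketched as two small functions: one that turns a tool call into a status line, and one that serializes a Server-Sent Events frame. The function names and the argument keys (`query`, `keywords`, `url`) are assumptions for illustration. For `fetch_article_content` the sketch shows only the domain, since no article title is available before the page is read.

```python
import json
import time
from typing import Any, Dict, Optional
from urllib.parse import urlparse


def describe_tool_call(tool: str, args: Dict[str, Any]) -> str:
    """Turn a tool name and its arguments into a short status line."""
    if tool == "web_search":
        return f"Searching web for: {args.get('query', '')}..."
    if tool == "news_search":
        return f"Searching news for: {args.get('keywords', '')}..."
    if tool == "search_internal_news":
        return "Searching internal knowledge base..."
    if tool == "fetch_article_content":
        domain = urlparse(args.get("url", "")).netloc or "unknown source"
        return f"Reading article: {domain}..."
    return f"Running {tool}..."  # generic text for tools not listed above


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one SSE frame: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def tool_start_event(tool: str, args: Dict[str, Any], now: Optional[float] = None) -> str:
    """Build a `tool_start` frame carrying the tool name, arguments, and a timestamp."""
    payload = {
        "tool": tool,
        "args": args,
        "message": describe_tool_call(tool, args),
        "timestamp": time.time() if now is None else now,
    }
    return format_sse("tool_start", payload)
```

A matching end-of-tool event could carry a second timestamp so the frontend can compute execution duration from the pair.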

### 5. Frontend Result Rendering
- Remove JSON blocks from response text
- Render referenced articles in a "Sources" section below the answer
- Use article cards: title, source, date, image thumbnail
- Make cards clickable to open article detail modal
- Group sources by retrieval method (semantic, web search, internal)

**Files:**
- `frontend/app/search/page.tsx` - message rendering and sources grid
- `frontend/lib/api.ts` - response parsing
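
The parsing step behind this section (strip JSON blocks, de-duplicate sources, group them by retrieval method) is sketched below in Python for consistency with the backend. The same steps would apply in `frontend/lib/api.ts`. The sketch assumes the raw JSON arrives as fenced `json` blocks inside the answer text and that each source object has `url` and `retrieval_method` keys. Both are assumptions, not confirmed details of the agent's output.

```python
import json
import re
from typing import Dict, List, Tuple

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fenced block
JSON_BLOCK = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def split_answer_and_sources(text: str) -> Tuple[str, List[dict]]:
    """Remove fenced JSON blocks from `text`; return (clean_text, unique_sources)."""
    sources: List[dict] = []
    seen_urls = set()
    for match in JSON_BLOCK.finditer(text):
        try:
            parsed = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # malformed blocks yield no sources but are still stripped below
        items = parsed if isinstance(parsed, list) else [parsed]
        for item in items:
            url = item.get("url") if isinstance(item, dict) else None
            if url and url not in seen_urls:  # de-duplicate by URL
                seen_urls.add(url)
                sources.append(item)
    clean = JSON_BLOCK.sub("", text).strip()
    return clean, sources


def group_by_method(sources: List[dict]) -> Dict[str, List[dict]]:
    """Group sources by their `retrieval_method` field (hypothetical key)."""
    groups: Dict[str, List[dict]] = {}
    for source in sources:
        groups.setdefault(source.get("retrieval_method", "unknown"), []).append(source)
    return groups
```

The clean text feeds the answer body, and each group maps to one block of cards in the "Sources" section.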

### 6. Error Handling & User Feedback
- Detect and handle:
  - 429 quota exceeded → "API Rate Limit: The AI service has reached its rate limit. Please wait a moment and try again."
  - Connection timeout → "Request Timeout: The research took too long. Try a simpler query."
  - Tool execution failure → "Tool [name] failed: [reason]. Continuing with alternative search..."
- Add retry prompt on error (not automatic, user chooses)
- Log all errors with request ID for debugging

**Files:**
- `backend/app/api/routes/research.py` - error formatting
- `frontend/app/search/page.tsx` - error UI and retry logic
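
The three error cases above can be mapped to user-facing messages in one place, with a request ID attached for log correlation. This is a minimal sketch: the exception classes, the `kind` labels, and the generic message for unrecognized errors are hypothetical, while the three specific messages are taken from the list above.

```python
import logging
import uuid
from typing import Dict

logger = logging.getLogger("research_agent.errors")  # hypothetical logger name


class QuotaExceededError(Exception):
    """Hypothetical: raised when the LLM provider answers with HTTP 429."""


class ToolExecutionError(Exception):
    """Hypothetical: raised when a named tool fails."""

    def __init__(self, tool: str, reason: str) -> None:
        super().__init__(f"{tool}: {reason}")
        self.tool = tool
        self.reason = reason


def to_user_error(exc: Exception) -> Dict[str, str]:
    """Map an exception to a user-facing payload and log it under a request ID."""
    request_id = uuid.uuid4().hex
    if isinstance(exc, QuotaExceededError):
        kind = "rate_limit"
        message = (
            "API Rate Limit: The AI service has reached its rate limit. "
            "Please wait a moment and try again."
        )
    elif isinstance(exc, TimeoutError):
        kind = "timeout"
        message = "Request Timeout: The research took too long. Try a simpler query."
    elif isinstance(exc, ToolExecutionError):
        kind = "tool_failure"
        message = f"Tool {exc.tool} failed: {exc.reason}. Continuing with alternative search..."
    else:
        kind = "unknown"
        message = "Something went wrong. Please try again."  # assumed wording
    # The full exception goes to the log only; the user never sees a stack trace.
    logger.error("request_id=%s kind=%s error=%r", request_id, kind, exc)
    return {"kind": kind, "message": message, "request_id": request_id}
```

The frontend can key its retry prompt off `kind`, and the `request_id` gives support a handle for finding the matching log line.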

## Checks

### API Optimization
- Verify semantic search results are cached (no duplicate calls)
- Check web_search and news_search return results (not empty)
- Confirm tool execution logs show cache hits for repeated queries

### Rate Limit Handling
- Trigger 429 error and verify graceful fallback message displays
- Confirm no stack traces shown to user
- Check logs show quota status and retry timing

### Web Search
- Query "climate change" and verify web_search returns 5+ results
- Confirm DuckDuckGo DDGS class is properly instantiated
- Check news_search returns recent news articles

### Streaming Clarity
- Monitor SSE events for tool_start with query details
- Verify timestamps increment correctly
- Confirm "Still working..." message only shows after 30s inactivity

### Frontend Rendering
- Verify research answer is plain text (no JSON)
- Check "Sources" section appears with article cards
- Confirm card click opens article detail modal
- Verify no duplicate sources (de-duplication working)

### Error Scenarios
- Submit an invalid query and verify the agent does not crash
- Test with network disconnect and check timeout message
- Simulate quota exceeded (429) and verify user sees rate limit message

## Implementation Checklist

- [ ] Add retry decorator with exponential backoff to Gemini client
- [ ] Implement request cache in `search_internal_news` with 5min TTL
- [ ] Fix DuckDuckGo tool initialization (verify DDGS import)
- [ ] Update `research_stream()` to emit granular tool start/result events
- [ ] Map tool events to human-readable status messages in API endpoint
- [ ] Remove JSON block from final answer text
- [ ] Add "Sources" section with article cards to frontend
- [ ] Update error handling for 429 quota exceeded
- [ ] Add streaming status animation to UI
- [ ] Write tests for quota handling and web search integration
