telegram-scraper-run

Automatic Telegram scraping

33 stars

Best use case

telegram-scraper-run is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using telegram-scraper-run should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/telegram-scraper-run/SKILL.md --create-dirs "https://raw.githubusercontent.com/aAAaqwq/AGI-Super-Team/main/skills/telegram-scraper-run/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/telegram-scraper-run/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How telegram-scraper-run Compares

| Feature / Agent | telegram-scraper-run | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Automatic Telegram scraping

Where can I find the source code?

The source code is on GitHub in the aAAaqwq/AGI-Super-Team repository, under skills/telegram-scraper-run.

SKILL.md Source

# Telegram Scraper Run

> Runs the Telegram Scraper Agent manually for testing or unscheduled scanning.

## When to use

- "run telegram scraper"
- "scan telegram channels"
- "find new channels with AI"
- "check competitors on Telegram"

## Input

Optional:
- `--dry-run` - test run without notifications
- `--category <name>` - scan only one category (competitors/industry/advertising)
- `--no-messages` - skip reading messages (faster)
- `--notify-test` - notification test only
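
The flags above could be parsed along the following lines. This is a minimal sketch, not the agent's actual parser: the function name `build_parser` is hypothetical, and only the flag names and the three category values come from the list above.

```python
import argparse

# Category names as listed in the flag description above.
CATEGORIES = ("competitors", "industry", "advertising")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="telegram_scraper_agent.py")
    parser.add_argument("--dry-run", action="store_true",
                        help="test run without notifications")
    parser.add_argument("--category", choices=CATEGORIES, default=None,
                        help="scan only one category")
    parser.add_argument("--no-messages", action="store_true",
                        help="skip reading messages (faster)")
    parser.add_argument("--notify-test", action="store_true",
                        help="notification test only")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--category", "competitors", "--dry-run"])
    print(args.dry_run, args.category, args.no_messages, args.notify_test)
```

Restricting `--category` with `choices` makes a mistyped category fail at startup rather than producing an empty scan.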

## How to execute

### Full run (production)

```bash
cd $AGENTS_PATH/telegram-scraper
python3 telegram_scraper_agent.py
```

### Dry-run tests

```bash
# Test without notifications
python3 telegram_scraper_agent.py --dry-run

# Single category
python3 telegram_scraper_agent.py --category competitors --dry-run

# Without reading messages (faster)
python3 telegram_scraper_agent.py --no-messages --dry-run
```

### Notification test

```bash
python3 telegram_scraper_agent.py --notify-test
```

### Unit tests

```bash
python3 test_telegram_scraper.py
```

## Output

Agent outputs:
1. Progress to stderr (channel scanning)
2. Summary to stdout (results from Claude)
3. Telegram notification (if high-value channels found)
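
Because progress goes to stderr and the summary to stdout, a caller can capture the two streams separately. The sketch below assumes nothing about the agent beyond that split; `run_agent` is a hypothetical helper name, and the command to run is passed in by the caller.

```python
import subprocess
import sys


def run_agent(cmd: list[str]) -> tuple[int, str, str]:
    """Run a command and return (exit code, stdout text, stderr text)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Assumes the current directory is the agent's directory.
    code, summary, progress = run_agent(
        [sys.executable, "telegram_scraper_agent.py", "--dry-run"]
    )
    print(f"exit code: {code}")
    print(summary)
```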

Data is saved to:
```
$PROJECT_ROOT/data/telegram_scraper/
├── YYYY-MM-DD/           # Dated results
│   ├── competitors_channels.json
│   ├── competitors_ad_contacts.csv
│   ├── industry_channels.json
│   ├── advertising_channels.json
│   └── messages/
└── latest/               # Symlinks to most recent
```

## Checking results

```bash
# Latest results
ls -l "$PROJECT_ROOT/data/telegram_scraper/latest/"

# Top 5 channels (competitors)
jq '.[0:5]' "$PROJECT_ROOT/data/telegram_scraper/latest/competitors_channels.json"

# Ad contacts
cat "$PROJECT_ROOT/data/telegram_scraper/latest/competitors_ad_contacts.csv"

# Agent log (last 5 entries)
jq '.[-5:]' "$PROJECT_ROOT/data/telegram_scraper/agent_log.json"
```
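
The same check can be scripted in Python. This sketch assumes only what the `jq '.[0:5]'` command implies, namely that each `*_channels.json` file holds a JSON array. The helper names `latest_dir` and `load_channels` are hypothetical, and the fields inside each channel entry are not specified by the source.

```python
import json
from pathlib import Path


def latest_dir(project_root: str) -> Path:
    """Path to the latest/ directory described in the data layout."""
    return Path(project_root) / "data" / "telegram_scraper" / "latest"


def load_channels(project_root: str, category: str, limit: int = 5) -> list:
    """Return the first `limit` entries of <category>_channels.json."""
    path = latest_dir(project_root) / f"{category}_channels.json"
    with path.open(encoding="utf-8") as fh:
        channels = json.load(fh)
    return channels[:limit]
```

For example, `load_channels(os.environ["PROJECT_ROOT"], "competitors")` would return the same five entries as the `jq` command, provided `os` is imported and `PROJECT_ROOT` is set.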

## Configuration

Edit config:
```bash
code $PROJECT_ROOT/data/telegram_scraper_config.json
```

Config structure:
```json
{
  "categories": {
    "competitors": {
      "keywords": ["annotation", "data labeling", "cvat"],
      "exclude": ["spam", "crypto"],
      "scan_posts": 10
    }
  },
  "min_subscribers": 100,
  "min_score": 10,
  "notification_threshold": 30
}
```
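
One way to read the three top-level thresholds is sketched below. It rests on assumptions the source does not confirm: that `min_subscribers` and `min_score` are cutoffs for keeping a channel, that `notification_threshold` is an inclusive score cutoff for sending a notification, and that each channel record has `subscribers` and `score` fields. The function names are hypothetical.

```python
import json

# Same structure as the config shown above.
EXAMPLE_CONFIG = """
{
  "categories": {
    "competitors": {
      "keywords": ["annotation", "data labeling", "cvat"],
      "exclude": ["spam", "crypto"],
      "scan_posts": 10
    }
  },
  "min_subscribers": 100,
  "min_score": 10,
  "notification_threshold": 30
}
"""

REQUIRED_KEYS = ("categories", "min_subscribers", "min_score", "notification_threshold")


def load_config(text: str) -> dict:
    """Parse the config and fail early if a top-level key is missing."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")
    return config


def classify(channel: dict, config: dict) -> str:
    """Return 'skip', 'keep', or 'notify' (assumed semantics, see lead-in)."""
    if channel["subscribers"] < config["min_subscribers"]:
        return "skip"
    if channel["score"] < config["min_score"]:
        return "skip"
    if channel["score"] >= config["notification_threshold"]:
        return "notify"
    return "keep"
```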

## Launchd Schedule

Agent runs automatically twice daily (9:00, 18:00).

```bash
# Check status
launchctl list | grep telegram-scraper

# Load schedule
launchctl load ~/Library/LaunchAgents/com.yourcompany.telegram-scraper.plist

# Unload schedule
launchctl unload ~/Library/LaunchAgents/com.yourcompany.telegram-scraper.plist

# View logs
tail -f $GOOGLE_TOOLS_PATH/logs/telegram_scraper.log
tail -f $GOOGLE_TOOLS_PATH/logs/telegram_scraper.err
```
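
For reference, a launchd job that fires at 9:00 and 18:00 can be described with a `StartCalendarInterval` array. The sketch below builds such a plist with Python's `plistlib`. It is not the repository's actual plist: the interpreter path, the `ProgramArguments` and the helper name `build_plist` are assumptions, and only the label, the schedule and the log file names come from this document. launchd does not expand shell variables such as `$GOOGLE_TOOLS_PATH`, so directories are passed in as literal paths.

```python
import plistlib


def build_plist(agent_dir: str, log_dir: str) -> bytes:
    """Return XML plist bytes for a twice-daily launchd job (assumed layout)."""
    job = {
        "Label": "com.yourcompany.telegram-scraper",
        # Assumed invocation; adjust to the real interpreter and script path.
        "ProgramArguments": ["/usr/bin/python3", "telegram_scraper_agent.py"],
        "WorkingDirectory": agent_dir,
        # One dict per daily run: 9:00 and 18:00.
        "StartCalendarInterval": [
            {"Hour": 9, "Minute": 0},
            {"Hour": 18, "Minute": 0},
        ],
        "StandardOutPath": f"{log_dir}/telegram_scraper.log",
        "StandardErrorPath": f"{log_dir}/telegram_scraper.err",
    }
    return plistlib.dumps(job)
```

The returned bytes can be written to `~/Library/LaunchAgents/com.yourcompany.telegram-scraper.plist` before running `launchctl load`.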

## Troubleshooting

### Session Expired Error

If Telegram session is invalid:
```bash
# Refresh session
cd $TG_TOOLS_PATH
python3 -m tg_utils.auth
```

### No Results

- Check keywords in config (too specific?)
- Verify session: `cd $TG_TOOLS_PATH && python3 -m tg_utils.auth`
- Run with `--dry-run` for debug

### Rate Limited

- Normal: agent waits and retries
- FloodWaitError > 5 min: channel skipped
- Solution: decrease `scan_posts` in config
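
The wait-and-retry behaviour described above can be sketched as follows. This is an illustration, not the agent's code: `FloodWait` is a stand-in for Telethon's `FloodWaitError` (which exposes the required wait as `seconds`), and `scan_with_retry` and `max_retries` are hypothetical. Only the five-minute skip rule comes from this document.

```python
import time

# Waits longer than this cause the channel to be skipped.
MAX_WAIT_SECONDS = 5 * 60


class FloodWait(Exception):
    """Stand-in for Telethon's FloodWaitError; carries the wait in seconds."""

    def __init__(self, seconds: int):
        super().__init__(f"flood wait: {seconds}s")
        self.seconds = seconds


def scan_with_retry(scan, channel, sleep=time.sleep, max_retries=3):
    """Call scan(channel), waiting out short flood waits.

    Returns the scan result, or None if the channel was skipped because
    the wait exceeded MAX_WAIT_SECONDS or the retries ran out.
    """
    for _ in range(max_retries):
        try:
            return scan(channel)
        except FloodWait as exc:
            if exc.seconds > MAX_WAIT_SECONDS:
                return None  # wait too long: skip this channel
            sleep(exc.seconds)  # short wait: pause, then retry
    return None  # retries exhausted
```

Passing `sleep` as a parameter lets the retry logic be tested without actually waiting.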

## Manual Scraping (without the agent)

If the agent is not working:
```bash
cd $TG_TOOLS_PATH/tools

# Find channels with ad contacts
python3 tg_scrape.py ads --keywords "annotation,labeling" --posts 10

# List channels
python3 tg_scrape.py channels --keywords "ai,ml" --output channels.csv

# Read messages
python3 tg_scrape.py messages "Channel Name" --days 7 --limit 50
```

## Next steps

After scraping:

1. **Add contacts to CRM**: use `add-lead` skill
2. **Write outreach**: use `telegram-send` skill
3. **Adjust config**: edit config file and re-run

## Related skills

- `telegram-session` - update Telegram session
- `add-lead` - add found contacts to CRM
- `telegram-send` - message ad contacts
- `daily-briefing` - include findings in morning briefing

Related Skills

telegram-session

33
from aAAaqwq/AGI-Super-Team

Create/update Telethon session

telegram-send

33
from aAAaqwq/AGI-Super-Team

Telegram DM sending from CSV, rate limiting, idempotency

telegram-scrape

33
from aAAaqwq/AGI-Super-Team

Search Telegram channels, read posts, ad contacts

telegram-push

33
from aAAaqwq/AGI-Super-Team

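Push messages to group chats or private chats through a standalone Telegram Bot; suited to notification scenarios that do not depend on OpenClaw channel configuration.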

telegram-inbound-run

33
from aAAaqwq/AGI-Super-Team

Automatic inbound Telegram message processing

telegram-groups

33
from aAAaqwq/AGI-Super-Team

Posting, members, Telegram group management

telegram-contacts

33
from aAAaqwq/AGI-Super-Team

Export/import/lookup Telegram contacts

telegram-check

33
from aAAaqwq/AGI-Super-Team

Check inbound Telegram messages

telegram-automation

33
from aAAaqwq/AGI-Super-Team

Automate Telegram tasks via Rube MCP (Composio): send messages, manage chats, share photos/documents, and handle bot commands. Always search tools first for current schemas.

image-scraper

33
from aAAaqwq/AGI-Super-Team

Scrape and download all images from a given URL. Takes a URL, extracts image URLs from the page, and downloads them. Uses python3/curl as primary method, falls back to browser automation if needed. Use when user provides a URL and wants to download images from that page.

apify-ultimate-scraper

33
from aAAaqwq/AGI-Super-Team

Universal AI-powered web scraper for any platform. Scrape data from Instagram, Facebook, TikTok, YouTube, LinkedIn, X/Twitter, Google Maps, Google Search, Google Trends, Reddit, Airbnb, Yelp, and 15+ more platforms. Use for lead generation, brand monitoring, competitor analysis, influencer discovery, trend research, content analytics, audience analysis, review analysis, SEO intelligence, recruitment, or any data extraction task.

wemp-operator

33
from aAAaqwq/AGI-Super-Team

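> Full-featured WeChat Official Account operations: API wrappers for drafts, publishing, comments, users, media assets, mass messaging, statistics, menus, and QR codes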
