bigquery

Google BigQuery for analytics, ML, and data warehousing. Use for large-scale analytics.

by G1Joshi

Best use case

bigquery is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bigquery should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bigquery/SKILL.md --create-dirs "https://raw.githubusercontent.com/G1Joshi/Agent-Skills/main/skills/databases/bigquery/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/bigquery/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How bigquery Compares

Feature / Agent	bigquery	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Google BigQuery for analytics, ML, and data warehousing. Use for large-scale analytics.

Where can I find the source code?

The source code is in the G1Joshi/Agent-Skills repository on GitHub, under skills/databases/bigquery/SKILL.md.

SKILL.md Source

# Google BigQuery

BigQuery is Google's serverless, highly scalable, and cost-effective multi-cloud data warehouse. Queries over terabytes of data can complete in seconds.

## When to Use

- **Serverless Analytics**: No infrastructure to manage. Just run SQL.
- **Real-time Analytics**: High-speed streaming ingestion.
- **ML Integration**: `CREATE MODEL` lets you train ML models using standard SQL (BigQuery ML).

## Quick Start

```sql
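-- Standard SQL (GoogleSQL): the ten most common names by total occurrences.
-- Each row holds a per-state, per-year, per-gender tally in `number`,
-- so sum that column rather than counting rows.
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name
ORDER BY total DESC
LIMIT 10;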
```

## Core Concepts

### Slots and Reservations

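A "slot" is a unit of computational capacity (roughly a virtual CPU) that BigQuery uses to execute queries. With on-demand pricing, BigQuery allocates slots for you. With capacity-based pricing (BigQuery editions, which replaced the older flat-rate plans), you reserve slots, optionally with autoscaling.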
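
To make slot consumption concrete, the following sketch estimates the average number of slots recent queries held, using the `INFORMATION_SCHEMA.JOBS` view. It assumes the jobs ran in the US multi-region (the `region-us` qualifier) and that you have permission to list jobs in the project; adjust both for your setup.

```sql
-- Sketch: approximate average slots per query over the last day.
-- Assumption: jobs ran in the US multi-region; change `region-us` if not.
SELECT
  job_id,
  total_bytes_billed,
  total_slot_ms,
  -- Slot-milliseconds divided by elapsed milliseconds approximates
  -- the average number of slots the job held while it ran.
  SAFE_DIVIDE(
    total_slot_ms,
    TIMESTAMP_DIFF(end_time, start_time, MILLISECOND)
  ) AS avg_slots
FROM `region-us`.INFORMATION_SCHEMA.JOBS
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
ORDER BY total_slot_ms DESC
LIMIT 10;
```

Jobs with a high `avg_slots` value are the ones most likely to benefit from a reservation or from query tuning.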

### Columnar Storage (Capacitor)

BigQuery stores each column separately, which suits aggregation queries. A query reads only the columns it references, so reading one column is much cheaper and faster than reading all of them (`SELECT *` is expensive).
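
As a rough illustration of column pruning, compare the two queries below against the public table from the Quick Start. The column names (`name`, `number`) are taken from that table's schema as I understand it; verify them in the console before relying on this.

```sql
-- Reads every column of the table. On a non-clustered table,
-- LIMIT generally does not reduce the bytes scanned.
SELECT *
FROM `bigquery-public-data.usa_names.usa_1910_2013`
LIMIT 10;

-- Reads only the `name` and `number` columns, so it scans
-- a fraction of the bytes the first query does.
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name;
```

When you need most but not all columns, `SELECT * EXCEPT (column_name)` is a compact way to drop the ones you do not need.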

### Partitioning & Clustering

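- **Partitioning**: Splits a table by a date, timestamp, or integer-range column (e.g., daily partitions). Queries that filter on the partitioning column scan only the matching partitions, which can cut costs substantially.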
- **Clustering**: Sorts data within partitions for faster filtering.
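
A minimal sketch of both features together is shown below. The dataset, table, and column names (`mydataset.events`, `event_ts`, `user_id`, `event_name`) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical event table: one partition per day, clustered by user and event.
CREATE TABLE mydataset.events
(
  event_ts   TIMESTAMP,
  user_id    STRING,
  event_name STRING
)
PARTITION BY DATE(event_ts)
CLUSTER BY user_id, event_name
-- Reject queries that do not filter on the partitioning column.
OPTIONS (require_partition_filter = TRUE);

-- The filter on event_ts lets BigQuery prune to a single daily partition;
-- the filter on user_id benefits from clustering within that partition.
SELECT event_name, COUNT(*) AS events
FROM mydataset.events
WHERE event_ts >= TIMESTAMP('2025-01-01')
  AND event_ts <  TIMESTAMP('2025-01-02')
  AND user_id = 'u123'
GROUP BY event_name;
```

Setting `require_partition_filter` is optional, but it guards against accidental full-table scans.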

## Best Practices (2025)

**Do**:

- **Partition by Date**: Almost mandatory for time-series logs.
- **Use BigQuery ML**: Train models (Regression, K-Means) directly where data lives.
- **Estimate Cost**: Dry-run your query (for example, with `bq query --dry_run`) to see how many bytes it will scan before running it.
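
To illustrate the BigQuery ML item above, here is a small sketch that trains and evaluates a linear regression in SQL. The target dataset and model name (`mydataset.name_count_model`) are hypothetical, and the choice of features is arbitrary; treat it as a template rather than a meaningful model.

```sql
-- Train a linear regression that predicts `number` from the other columns.
-- Assumption: `mydataset` already exists in your project.
CREATE OR REPLACE MODEL mydataset.name_count_model
OPTIONS (
  model_type = 'LINEAR_REG',
  input_label_cols = ['number']
) AS
SELECT year, gender, state, number
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE name = 'Mary';

-- Inspect basic regression metrics for the trained model.
SELECT mean_absolute_error, r2_score
FROM ML.EVALUATE(MODEL mydataset.name_count_model);
```

For clustering, the same statement shape applies with `model_type = 'KMEANS'` and a `num_clusters` option in place of the label column.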

**Don't**:

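- **Don't run `SELECT *`**: On-demand queries are billed by the bytes read from each column you reference. Select only what you need.
- **Don't treat it like an OLTP database**: Single-row `INSERT` statements are slow. For row-level ingestion, use the streaming API; otherwise load data in bulk. BigQuery is built for bulk analytics.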
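
As a sketch of the bulk-loading alternative to row-by-row inserts, the `LOAD DATA` statement below ingests a set of CSV files in one job. The bucket path and table name are hypothetical, and the example assumes the files have a header row.

```sql
-- Load many CSV files from Cloud Storage in a single job
-- instead of issuing one INSERT per row.
LOAD DATA INTO mydataset.events
FROM FILES (
  format = 'CSV',
  uris = ['gs://my-bucket/events/*.csv'],  -- hypothetical bucket and prefix
  skip_leading_rows = 1                    -- skip the header row
);
```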

## References

- [BigQuery Documentation](https://cloud.google.com/bigquery/docs)