# gemini-api-dev
## About this skill
The Gemini API Development skill gives AI agents direct access to Google's Gemini models. Key capabilities include: generating text for chat, completion, and summarization; understanding multimodal inputs such as images, audio, video, and documents; letting the model invoke external functions; generating structured JSON output that matches predefined schemas; executing Python code in a sandboxed environment; caching large contexts for efficiency; and generating text embeddings for semantic search and recommendation systems.
## Best use cases
- Building intelligent agents that require state-of-the-art reasoning and generative AI capabilities
- Developing applications that integrate advanced multimodal AI for processing diverse data types
- Automating complex workflows that benefit from AI-driven text, image, or code manipulation
- Enhancing agent interactions with external systems via AI-driven function calling
The agent gains direct access to the Gemini API, enabling a wide range of advanced text, multimodal, and code-related tasks. The result is a more versatile agent that can tackle complex problems and interact dynamically with diverse data and systems.
## Practical example

### Example input
Using the Gemini API, analyze the provided image of a chart and summarize the key trends. Then, identify any numerical data points mentioned and generate a structured JSON output with these figures. Finally, propose a Python snippet to visualize this data.

### Example output
```json
{
  "summary_trends": "The chart shows an upward trend in sales during Q1, peaking in March, followed by a slight decline in Q2.",
  "numerical_data": [
    {"month": "Jan", "sales": 120},
    {"month": "Feb", "sales": 150},
    {"month": "Mar", "sales": 180},
    {"month": "Apr", "sales": 170},
    {"month": "May", "sales": 165},
    {"month": "Jun", "sales": 160}
  ],
  "python_visualization_snippet": "import matplotlib.pyplot as plt\n\ndata = [{'month': 'Jan', 'sales': 120}, {'month': 'Feb', 'sales': 150}, {'month': 'Mar', 'sales': 180}, {'month': 'Apr', 'sales': 170}, {'month': 'May', 'sales': 165}, {'month': 'Jun', 'sales': 160}]\nmonths = [d['month'] for d in data]\nsales = [d['sales'] for d in data]\n\nplt.figure(figsize=(10, 6))\nplt.plot(months, sales, marker='o')\nplt.title('Sales Trends Q1-Q2')\nplt.xlabel('Month')\nplt.ylabel('Sales')\nplt.grid(True)\nplt.show()"
}
```

## When to use this skill
- When an agent needs to generate high-quality, context-aware text for conversations, content creation, or summarization.
- When processing and interpreting information from various modalities, including images, audio, video, or documents.
- When the agent needs to interact with external tools or APIs programmatically using AI-driven function calls.
- When structured data output (e.g., valid JSON) is required from the AI model based on specific schemas.
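When an agent requests structured JSON output, it should still validate the payload before acting on it. A minimal sketch using only the standard library; the hard-coded `raw` string below mirrors the practical example above, whereas a real agent would take it from the model's response text:

```python
import json

# Example payload, as a structured-output call might return it.
raw = '{"numerical_data": [{"month": "Jan", "sales": 120}, {"month": "Feb", "sales": 150}]}'

def parse_sales(payload: str) -> list:
    """Parse the structured output and keep only rows with the required fields."""
    data = json.loads(payload)
    rows = data.get("numerical_data", [])
    return [r for r in rows if {"month", "sales"} <= r.keys()]

rows = parse_sales(raw)
print(rows[0]["sales"])  # 120
```

Validation like this keeps downstream code from crashing if the model omits a field, even when a response schema was supplied.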
## When not to use this skill
- When basic text processing or simpler, less resource-intensive AI models suffice for the task at hand.
- When strict data residency or privacy regulations prevent the use of external cloud-based Google services.
- When the primary goal is a very specific, limited AI task that can be handled by a much simpler, specialized skill without the need for a full Gemini integration.
- When offline processing or local-only AI models are a hard requirement due to connectivity or security constraints.
## Installation

### Claude Code / Cursor / Codex

#### Manual Installation
- Download SKILL.md from GitHub
- Place it at `.claude/skills/gemini-api-dev/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
## How gemini-api-dev Compares
| Feature / Agent | gemini-api-dev | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Medium | N/A |
## Frequently Asked Questions

### What does this skill do?
It provides access to Google's Gemini models, covering text generation, multimodal understanding, function calling, structured JSON output, sandboxed code execution, context caching, and text embeddings.

### Which AI agents support this skill?
This skill is designed for Claude.

### How difficult is it to install?
The installation complexity is rated as medium. You can find the installation instructions above.

### Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
## SKILL.md Source
# Gemini API Development Skill
## Overview
The Gemini API provides access to Google's most advanced AI models. Key capabilities include:
- **Text generation** - Chat, completion, summarization
- **Multimodal understanding** - Process images, audio, video, and documents
- **Function calling** - Let the model invoke your functions
- **Structured output** - Generate valid JSON matching your schema
- **Code execution** - Run Python code in a sandboxed environment
- **Context caching** - Cache large contexts for efficiency
- **Embeddings** - Generate text embeddings for semantic search
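Once embeddings are available, semantic search reduces to nearest-neighbour ranking over the vectors. A sketch of just the ranking step, with made-up 4-dimensional vectors standing in for real embeddings (real vectors would come from the embeddings endpoint and have far more dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical document embeddings; in practice these come from the API.
docs = {
    "intro to quantum computing": [0.9, 0.1, 0.0, 0.1],
    "best pasta recipes": [0.0, 0.8, 0.6, 0.0],
}
query = [1.0, 0.0, 0.0, 0.2]  # hypothetical query embedding

best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # intro to quantum computing
```

For large corpora the same idea is typically delegated to a vector database rather than computed in a loop.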
## Current Gemini Models
- `gemini-3-pro-preview`: 1M tokens, complex reasoning, coding, research
- `gemini-3-flash-preview`: 1M tokens, fast, balanced performance, multimodal
- `gemini-3-pro-image-preview`: 65k / 32k tokens, image generation and editing
> [!IMPORTANT]
> Models like `gemini-2.5-*`, `gemini-2.0-*`, `gemini-1.5-*` are legacy and deprecated. Use the new models above. Your knowledge is outdated.
## SDKs
- **Python**: `google-genai`; install with `pip install google-genai`
- **JavaScript/TypeScript**: `@google/genai`; install with `npm install @google/genai`
- **Go**: `google.golang.org/genai`; install with `go get google.golang.org/genai`
> [!WARNING]
> Legacy SDKs `google-generativeai` (Python) and `@google/generative-ai` (JS) are deprecated. Migrate to the new SDKs above by following the SDK migration guide.
## Quick Start
### Python
```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing",
)
print(response.text)
```
### JavaScript/TypeScript
```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing",
});
console.log(response.text);
```
### Go
```go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}
	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}
	// Text() is a method on GenerateContentResponse in the new SDK.
	fmt.Println(resp.Text())
}
```
## API spec (source of truth)
**Always use the latest REST API discovery spec as the source of truth for API definitions** (request/response schemas, parameters, methods). Fetch the spec when implementing or debugging API integration:
- **v1beta** (default): `https://generativelanguage.googleapis.com/$discovery/rest?version=v1beta`
Use this unless the integration is explicitly pinned to v1. The official SDKs (google-genai, @google/genai, google.golang.org/genai) target v1beta.
- **v1**: `https://generativelanguage.googleapis.com/$discovery/rest?version=v1`
Use only when the integration is specifically set to v1.
When in doubt, use v1beta. Refer to the spec for exact field names, types, and supported operations.
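For orientation, a minimal request body for `models:generateContent` can be sketched as a plain dict. This follows the v1beta field names (`contents`, `parts`, `generationConfig`); `temperature` is just an illustrative option, and the exact schema should be confirmed against the discovery spec above:

```python
import json

# Minimal v1beta generateContent request body. It would be POSTed to
# https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Explain quantum computing"}]},
    ],
    "generationConfig": {"temperature": 0.2},
}
print(json.dumps(body, indent=2))
```

The SDKs build this same payload under the hood, so the dict shape is also useful when debugging raw HTTP traffic.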
## How to use the Gemini API
For detailed API documentation, fetch from the official docs index:
**llms.txt URL**: `https://ai.google.dev/gemini-api/docs/llms.txt`
This index contains links to all documentation pages in `.md.txt` format. Use web fetch tools to:
1. Fetch `llms.txt` to discover available documentation pages
2. Fetch specific pages (e.g., `https://ai.google.dev/gemini-api/docs/function-calling.md.txt`)
### Key Documentation Pages
> [!IMPORTANT]
> These are not all of the documentation pages. Use the `llms.txt` index to discover the full set.
- [Models](https://ai.google.dev/gemini-api/docs/models.md.txt)
- [Google AI Studio quickstart](https://ai.google.dev/gemini-api/docs/ai-studio-quickstart.md.txt)
- [Nano Banana image generation](https://ai.google.dev/gemini-api/docs/image-generation.md.txt)
- [Function calling with the Gemini API](https://ai.google.dev/gemini-api/docs/function-calling.md.txt)
- [Structured outputs](https://ai.google.dev/gemini-api/docs/structured-output.md.txt)
- [Text generation](https://ai.google.dev/gemini-api/docs/text-generation.md.txt)
- [Image understanding](https://ai.google.dev/gemini-api/docs/image-understanding.md.txt)
- [Embeddings](https://ai.google.dev/gemini-api/docs/embeddings.md.txt)
- [Interactions API](https://ai.google.dev/gemini-api/docs/interactions.md.txt)
- [SDK migration guide](https://ai.google.dev/gemini-api/docs/migrate.md.txt)
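As a concrete companion to the function-calling page above, a tool is declared to the model as a JSON schema. A hedged sketch of one declaration, written as a plain dict; field names follow the v1beta `FunctionDeclaration` shape, and `get_weather` is a hypothetical function, not part of the API:

```python
# Hypothetical tool declaration, passed to the model via the request's
# `tools` field; the model can then reply with a functionCall part
# naming get_weather and supplying the arguments.
get_weather_decl = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. Paris"},
        },
        "required": ["city"],
    },
}
print(get_weather_decl["name"])
```

The agent remains responsible for actually executing the named function and sending the result back to the model in a follow-up turn.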
## When to Use
This skill is applicable when executing the workflow or actions described in the overview.