Vertex AI Media Master
Execute automatic activation for all Google Vertex AI multimodal operations. Use when appropriate context is detected. Trigger with relevant phrases based on the skill's purpose.
Best use case
Vertex AI Media Master is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using Vertex AI Media Master should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/vertex-ai-media-master/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How Vertex AI Media Master Compares
| Feature / Agent | Vertex AI Media Master | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It activates automatically for Google Vertex AI multimodal operations: video understanding, audio generation, image creation, and marketing campaign automation with Gemini 2.5 Pro/Flash, Imagen 4, and Lyria. Trigger it with phrases that match those tasks.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# Vertex AI Media Master
## Overview
Multimodal media operations on Google Cloud Vertex AI covering video understanding, audio generation, image creation, and marketing campaign automation. This skill orchestrates Gemini 2.5 Pro/Flash, Imagen 4, and Lyria models to process, analyze, and generate rich media assets.
## Prerequisites
- Google Cloud project with Vertex AI API enabled
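- `google-cloud-aiplatform` Python SDK installed (`pip install "google-cloud-aiplatform[vision,audio]"`; quote the requirement so the shell does not expand the brackets, and confirm that your SDK version publishes these extras)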
- `GOOGLE_CLOUD_PROJECT` and `GOOGLE_APPLICATION_CREDENTIALS` environment variables set
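- Service account with the `roles/aiplatform.user` role
- Sufficient quota for target models (for example, Gemini 2.5 Pro: 2M tokens/min; Imagen 4: 100 images/min; actual limits vary by project and region, so check your project's quotas in the Google Cloud console)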
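
A quick preflight check can catch a missing environment variable before any request is sent. The sketch below uses only the standard library and the two variable names listed above; the function names `missing_env_vars` and `credentials_file_exists` are hypothetical helpers, not part of any SDK.

```python
import os
from typing import List, Mapping

# Variable names come from the prerequisites list above.
REQUIRED_ENV_VARS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_APPLICATION_CREDENTIALS")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def credentials_file_exists(env: Mapping[str, str] = os.environ) -> bool:
    """Check that GOOGLE_APPLICATION_CREDENTIALS points at an existing file."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    elif not credentials_file_exists():
        print("GOOGLE_APPLICATION_CREDENTIALS does not point to a file")
    else:
        print("Environment looks ready")
```

This only verifies local configuration. It does not confirm that the API is enabled, that the role is granted, or that quota is available.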
## Instructions
1. Initialize the Vertex AI client with the target project and region (`us-central1` recommended for model availability).
2. Select the appropriate model for the task:
   - **Video analysis**: Gemini 2.5 Pro (up to 6 hours at low resolution, 2 hours at default).
   - **Image generation**: Imagen 4 for highest quality stills; Gemini 2.5 Flash Image for interleaved text+image output.
   - **Audio generation**: Lyria for music composition and background tracks.
   - **Campaign automation**: Gemini 2.5 Pro for multi-asset generation from a single prompt.
3. Prepare input media: upload source files to Cloud Storage (`gs://` URIs) or provide local paths for smaller assets.
4. Construct the generation request with explicit parameters (aspect ratio, duration, number of outputs, style constraints).
5. Execute the request and capture response objects containing generated media bytes or analysis text.
6. Post-process outputs: save generated images/audio to the target directory, extract structured insights from video analysis, or compile campaign asset bundles.
7. Validate results against brand guidelines or schema expectations before delivery.
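
Steps 2 through 4 can be sketched as a small validation layer that runs before any SDK call. This is a minimal illustration under stated assumptions: `MediaRequest` and `build_request` are hypothetical names rather than SDK types, the model strings are the family names from step 2 (not API model IDs), and the aspect-ratio set is the one named in the Error Handling table.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Task-to-model mapping taken from step 2. These are family names, not API
# model IDs; look up the exact ID in the Vertex AI documentation.
MODEL_FOR_TASK = {
    "video_analysis": "Gemini 2.5 Pro",
    "image_generation": "Imagen 4",
    "interleaved_text_image": "Gemini 2.5 Flash Image",
    "audio_generation": "Lyria",
    "campaign_automation": "Gemini 2.5 Pro",
}

# Aspect ratios named in the Error Handling table.
SUPPORTED_ASPECT_RATIOS = {"1:1", "16:9", "9:16"}


@dataclass
class MediaRequest:
    """Hypothetical request container; not an SDK type."""
    task: str
    prompt: str
    source: str = ""  # gs:// URI or local path (step 3)
    parameters: Dict[str, Any] = field(default_factory=dict)

    @property
    def model(self) -> str:
        return MODEL_FOR_TASK[self.task]

    @property
    def source_kind(self) -> str:
        if not self.source:
            return "none"
        return "gcs" if self.source.startswith("gs://") else "local"


def build_request(task: str, prompt: str, source: str = "", **parameters: Any) -> MediaRequest:
    """Validate inputs and return a request with explicit parameters (step 4)."""
    if task not in MODEL_FOR_TASK:
        raise ValueError(f"Unknown task: {task!r}")
    if not prompt.strip():
        raise ValueError("Prompt must not be empty")
    ratio = parameters.get("aspect_ratio")
    if ratio is not None and ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {ratio!r}")
    return MediaRequest(task=task, prompt=prompt, source=source, parameters=dict(parameters))


if __name__ == "__main__":
    request = build_request(
        "image_generation",
        "Hero shot of a trail running shoe",
        aspect_ratio="16:9",
        number_of_images=4,
    )
    print(request.model, request.source_kind, request.parameters)
```

Rejecting an unsupported aspect ratio locally avoids a round trip that would otherwise end in `InvalidArgument`.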
## Output
- Generated image files (PNG/JPEG) from Imagen 4 or Gemini 2.5 Flash Image
- Audio files (WAV/MP3) from the Lyria model for music and background tracks
- Video analysis reports: scene breakdowns, key-moment timestamps, transcript text, marketing-insight summaries
- Campaign asset packages: hero images, social media graphics, ad copy, email marketing text, and video scripts
- Structured JSON metadata for each generated asset (model used, prompt, parameters, cost estimate)
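
One way to produce the per-asset metadata is a JSON sidecar file written next to each generated asset. The sketch below is an assumption about layout, not a format the skill prescribes: the `AssetMetadata` class, its field names, and the `<asset>.json` naming are all illustrative.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class AssetMetadata:
    """Fields mirror the metadata bullet above; the names are illustrative."""
    asset_path: str
    model: str
    prompt: str
    parameters: Dict[str, Any] = field(default_factory=dict)
    cost_estimate_usd: Optional[float] = None  # left as null when unknown


def to_json(meta: AssetMetadata) -> str:
    """Serialize the metadata with stable key order for easy diffing."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


def write_sidecar(meta: AssetMetadata) -> Path:
    """Write `<asset>.json` next to the asset and return the sidecar path."""
    sidecar = Path(meta.asset_path + ".json")
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    sidecar.write_text(to_json(meta), encoding="utf-8")
    return sidecar
```

Leaving `cost_estimate_usd` empty when no estimate is available keeps the record honest instead of storing a guessed figure.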
## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| `PermissionDenied` on Vertex AI API | Service account lacks the `roles/aiplatform.user` role | Grant the required IAM role to the service account |
| `ResourceExhausted` / quota exceeded | Too many concurrent requests or token limit hit | Retry with exponential backoff and batch or throttle requests; switch to Gemini 2.5 Flash for lower-cost operations |
| `InvalidArgument` on image generation | Prompt violates safety filters or unsupported aspect ratio | Revise the prompt to remove restricted content; use a supported aspect ratio (1:1, 16:9, 9:16) |
| Video processing timeout | Source video exceeds duration or resolution limits | Use low-resolution mode for videos over 2 hours; split longer videos into segments |
| Audio generation returns empty | Prompt too vague or duration parameter missing | Specify genre, tempo, mood, and an explicit duration in seconds |
| `NotFound` on model ID | Incorrect model name or model not available in region | Verify the model ID against current Vertex AI documentation; try `us-central1` |
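
For the quota row above, a common mitigation is to retry with exponential backoff and jitter. The following is a minimal sketch that runs without the SDK installed: `QuotaExceeded` is a stand-in exception and `call_with_backoff` is a hypothetical helper. With the real client you would likely pass `google.api_core.exceptions.ResourceExhausted` as `retry_on`, assuming that is how the error surfaces in your SDK version.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class QuotaExceeded(Exception):
    """Stand-in for the SDK's quota error so the sketch runs without the SDK."""


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (QuotaExceeded,),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on `retry_on` with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter of up to 50% spreads out retries from concurrent workers.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")  # loop always returns or raises
```

The `sleep` parameter is injectable so the retry logic can be tested without waiting. Only retry errors that are transient; `PermissionDenied`, `InvalidArgument`, and `NotFound` will not succeed on a second attempt.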
## Examples
**Example 1: Analyze a competitor video ad**
- Input: A 60-second competitor video uploaded to `gs://bucket/competitor-ad.mp4`.
- Action: Send to Gemini 2.5 Pro with the prompt "Extract messaging themes, calls to action, visual style, and production techniques."
- Output: Structured analysis with timestamps for key scenes, identified CTAs, and a competitive positioning summary.
**Example 2: Generate campaign assets from a product brief**
- Input: Text brief describing a new product launch with target audience and brand guidelines.
- Action: Use Imagen 4 to generate 4 hero image variations, Lyria for a 30-second background track, and Gemini 2.5 Pro for ad copy in 3 languages.
- Output: Directory containing hero images, audio file, and a campaign-copy document organized by language.
**Example 3: Repurpose a long-form video into short-form clips**
- Input: A 10-minute product demo video.
- Action: Gemini 2.5 Pro identifies the three most engaging 15-second segments with scene-boundary timestamps.
- Output: Timestamp list with suggested captions for TikTok/Reels, plus a storyboard summary for each clip.
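
For Example 3, the timestamps returned by the model are worth checking before any clip is cut. The sketch below assumes the model was asked to answer with a JSON array of objects carrying `start`, `end`, and `caption` keys in `MM:SS` form. That response shape and the helper names are assumptions for illustration, not something the model guarantees.

```python
import json
from typing import Dict, List


def to_seconds(timestamp: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" to whole seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def validate_clips(raw_json: str, video_length_s: int, clip_length_s: int = 15) -> List[Dict[str, object]]:
    """Parse the assumed response shape and keep clips that fit the constraints."""
    clips = []
    for item in json.loads(raw_json):
        start, end = to_seconds(item["start"]), to_seconds(item["end"])
        inside_video = 0 <= start < end <= video_length_s
        if inside_video and end - start == clip_length_s:
            clips.append({"start_s": start, "end_s": end, "caption": item.get("caption", "")})
    # Return clips in playback order regardless of the order the model used.
    return sorted(clips, key=lambda clip: clip["start_s"])
```

Segments that run past the end of the 10-minute source or that are not exactly 15 seconds long are dropped here. A looser tolerance may suit workflows that trim clips afterwards.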
## Resources
- Detailed model capabilities and code patterns: `${CLAUDE_SKILL_DIR}/references/core-capabilities.md`
- Vertex AI Multimodal overview: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/overview
- Imagen documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
- Video understanding guide: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/video-understanding
- GenAI for Marketing reference repo: https://github.com/GoogleCloudPlatform/genai-for-marketing
Related Skills
Google Cloud Agent SDK Master
Execute automatic activation for all Google Cloud Agent Development Kit (ADK) operations. Use when appropriate context is detected. Trigger with relevant phrases based on the skill's purpose.
firebase-vertex-ai
Execute firebase platform expert with Vertex AI Gemini integration for Authentication, Firestore, Storage, Functions, Hosting, and AI-powered features. Use when asked to "setup firebase", "deploy to firebase", or "integrate vertex ai with firebase". Trigger with relevant phrases based on skill purpose.
vertex-engine-inspector
Inspect and validate Vertex AI Agent Engine deployments including Code Execution Sandbox, Memory Bank, A2A protocol compliance, and security posture. Generates production readiness scores. Use when asked to inspect, validate, or audit an Agent Engine deployment. Trigger with "inspect agent engine", "validate agent engine deployment", "check agent engine config", "audit agent engine security", "agent engine readiness check", "vertex engine health", or "reasoning engine status".
hypermedia-link-generator
Hypermedia Link Generator - Auto-activating skill for API Development. Triggers on: hypermedia link generator. Part of the API Development skill category.
vertex-ai-pipeline-creator
Vertex AI Pipeline Creator - Auto-activating skill for GCP Skills. Triggers on: vertex ai pipeline creator. Part of the GCP Skills skill category.
vertex-ai-endpoint-config
Vertex AI Endpoint Config - Auto-activating skill for GCP Skills. Triggers on: vertex ai endpoint config. Part of the GCP Skills skill category.
vertex-ai-deployer
Vertex AI Deployer - Auto-activating skill for ML Deployment. Triggers on: vertex ai deployer. Part of the ML Deployment skill category.
vertex-agent-builder
Build and deploy production-ready generative AI agents using Vertex AI, Gemini models, and Google Cloud infrastructure with RAG, function calling, and multi-modal capabilities
vertex-infra-expert
Terraform infrastructure specialist for Vertex AI services and Gemini deployments. Provisions Model Garden, endpoints, vector search, pipelines, and enterprise AI infrastructure. Triggers: "vertex ai terraform", "gemini deployment terraform", "model garden infrastructure", "vertex ai endpoints"
yaml-master
PROACTIVE YAML INTELLIGENCE: Automatically activates when working with YAML files, configuration management, CI/CD pipelines, Kubernetes manifests, Docker Compose, or any YAML-based workflows. Provides intelligent validation, schema inference, linting, format conversion (JSON/TOML/XML), and structural transformations with deep understanding of YAML specifications and common anti-patterns.
schema-optimization-orchestrator
Multi-phase schema optimization workflow orchestrator. Creates session directories, spawns phase agents sequentially, validates outputs, aggregates results. Trigger: "run schema optimization", "optimize schema workflow", "execute schema phases"
test-skill
Test skill for E2E validation. Trigger with "run test skill" or "execute test". Use this skill when testing skill activation and tool permissions.