Vertex Media Generation

## Overview

Best use case

Vertex Media Generation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using Vertex Media Generation can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/vertex-media-generation/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/vertex-media-generation/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/vertex-media-generation/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How Vertex Media Generation Compares

| Feature / Agent | Vertex Media Generation | Standard Approach |
|-----------------|-------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

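This skill helps you build image and video generation features using Google Vertex AI through the Vercel AI SDK. It covers Imagen models for image generation and editing, and Veo models for video generation with optional audio.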

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Vertex Media Generation

## Overview

Build image and video generation features using Google Vertex AI through the Vercel AI SDK. Covers Imagen models for image generation and editing (inpainting, outpainting, background swap) and Veo models for video generation with optional audio. Uses the `@ai-sdk/google-vertex` provider with the unified `ai` SDK.

## Instructions

### Step 1: Set up the project

```bash
npm install ai @ai-sdk/google-vertex
gcloud auth application-default login
```

Use the default provider instance, which reads the project and location from the `GOOGLE_VERTEX_PROJECT` and `GOOGLE_VERTEX_LOCATION` environment variables, or create a custom one:

```typescript
import { vertex } from '@ai-sdk/google-vertex';
// Or: import { createVertex } from '@ai-sdk/google-vertex';
// const vertex = createVertex({ project: 'my-gcp-project', location: 'us-central1' });
```

### Step 2: Generate images with Imagen

Use `generateImage` from the `ai` package with a Vertex image model:

```typescript
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';

const { image } = await generateImage({
  model: vertex.image('imagen-4.0-generate-001'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
});
```

Imagen does NOT support the `size` parameter. Use `aspectRatio` instead. Supported ratios: `1:1`, `3:4`, `4:3`, `9:16`, `16:9`.
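
Because only these five ratios are accepted, callers that start from a pixel size need to map it to the nearest supported ratio first. The following is a minimal sketch of one way to do that; `closestAspectRatio` is a hypothetical helper, not part of the AI SDK, and the log-scale distance is just one reasonable choice:

```typescript
// Aspect ratios listed above as supported by Imagen.
const SUPPORTED_RATIOS = ['1:1', '3:4', '4:3', '9:16', '16:9'] as const;
type ImagenAspectRatio = (typeof SUPPORTED_RATIOS)[number];

// Hypothetical helper (not an SDK function): pick the supported ratio closest
// to a desired pixel size. Distances are compared on a log scale so that
// 2:1 and 1:2 count as equally far from 1:1.
function closestAspectRatio(width: number, height: number): ImagenAspectRatio {
  if (!(width > 0) || !(height > 0)) {
    throw new RangeError('width and height must be positive numbers');
  }
  const target = Math.log(width / height);
  let best: ImagenAspectRatio = SUPPORTED_RATIOS[0];
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const [w, h] = ratio.split(':').map(Number);
    const distance = Math.abs(Math.log(w / h) - target);
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}
```

For example, `closestAspectRatio(1920, 1080)` returns `'16:9'`, which can be passed directly as the `aspectRatio` argument.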

Available Imagen models:

| Model | Speed | Quality |
|-------|-------|---------|
| `imagen-4.0-ultra-generate-001` | Slow | Highest |
| `imagen-4.0-generate-001` | Medium | High |
| `imagen-4.0-fast-generate-001` | Fast | Good |
| `imagen-3.0-generate-002` | Medium | High |
| `imagen-3.0-fast-generate-001` | Fast | Good |

Configure generation with provider options:

```typescript
const { image } = await generateImage({
  model: vertex.image('imagen-4.0-generate-001'),
  prompt: 'Professional headshot portrait',
  aspectRatio: '1:1',
  providerOptions: {
    vertex: {
      negativePrompt: 'blurry, low-quality, distorted',
      personGeneration: 'allow_adult',
      safetySetting: 'block_medium_and_above',
      addWatermark: true,
    },
  },
});
```

Provider options: `negativePrompt` (exclude elements), `personGeneration` (`allow_adult` | `allow_all` | `dont_allow`), `safetySetting` (`block_low_and_above` | `block_medium_and_above` | `block_only_high` | `block_none`), `addWatermark` (boolean, default true), `storageUri` (GCS path).
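
Since these options are passed as plain strings, a typo such as `allow_adults` would only surface at request time. One possible safeguard is a small local type, sketched below. The type and the `imagenProviderOptions` helper are hypothetical names introduced for illustration, and the `gs://` check assumes `storageUri` takes a standard Cloud Storage URI:

```typescript
// Value sets copied from the option list above.
type PersonGeneration = 'allow_adult' | 'allow_all' | 'dont_allow';
type SafetySetting =
  | 'block_low_and_above'
  | 'block_medium_and_above'
  | 'block_only_high'
  | 'block_none';

// Hypothetical local type for compile-time checking of the options.
interface ImagenVertexOptions {
  negativePrompt?: string;
  personGeneration?: PersonGeneration;
  safetySetting?: SafetySetting;
  addWatermark?: boolean;
  storageUri?: string;
}

// Hypothetical helper: validates the storage URI (assumed to be a gs:// URI)
// and nests the options under the `vertex` key used in the examples above.
function imagenProviderOptions(
  options: ImagenVertexOptions,
): { vertex: ImagenVertexOptions } {
  if (options.storageUri !== undefined && !options.storageUri.startsWith('gs://')) {
    throw new Error(`storageUri must be a gs:// URI, got: ${options.storageUri}`);
  }
  return { vertex: { ...options } };
}
```

The result can be passed as `providerOptions`, for example `providerOptions: imagenProviderOptions({ negativePrompt: 'blurry', addWatermark: false })`.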

### Step 3: Edit images with Imagen

Use `imagen-3.0-capability-001` for inpainting, outpainting, and background swap. Provide the source image and a mask (white pixels = area to edit):

```typescript
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';
import fs from 'fs';

const sourceImage = fs.readFileSync('./photo.png');
const mask = fs.readFileSync('./mask.png');

const { images } = await generateImage({
  model: vertex.image('imagen-3.0-capability-001'),
  prompt: {
    text: 'Add a golden retriever sitting on the grass',
    images: [sourceImage],
    mask,
  },
  providerOptions: {
    vertex: {
      edit: {
        mode: 'EDIT_MODE_INPAINT_INSERTION',
        maskMode: 'MASK_MODE_USER_PROVIDED',
        baseSteps: 50,
        maskDilation: 0.01,
      },
    },
  },
});
```

Edit modes: `EDIT_MODE_INPAINT_INSERTION` (add objects), `EDIT_MODE_INPAINT_REMOVAL` (remove objects), `EDIT_MODE_OUTPAINT` (extend canvas), `EDIT_MODE_BGSWAP` (replace background), `EDIT_MODE_PRODUCT_IMAGE` (product photography), `EDIT_MODE_CONTROLLED_EDITING` (style transfer). The `baseSteps` parameter (35-75) controls quality: higher values produce better results but take longer.
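
When the edit mode or step count comes from user input, it may help to validate both before calling the API. The sketch below keeps `baseSteps` inside the 35-75 range described above and checks a string against the listed edit modes; `clampBaseSteps` and `isEditMode` are hypothetical helpers, not SDK functions:

```typescript
// Edit modes copied from the list above.
const EDIT_MODES = [
  'EDIT_MODE_INPAINT_INSERTION',
  'EDIT_MODE_INPAINT_REMOVAL',
  'EDIT_MODE_OUTPAINT',
  'EDIT_MODE_BGSWAP',
  'EDIT_MODE_PRODUCT_IMAGE',
  'EDIT_MODE_CONTROLLED_EDITING',
] as const;
type EditMode = (typeof EDIT_MODES)[number];

// Range for baseSteps as stated in the text above.
const MIN_BASE_STEPS = 35;
const MAX_BASE_STEPS = 75;

// Hypothetical helper: round to an integer and clamp into the 35-75 range.
function clampBaseSteps(steps: number): number {
  if (!Number.isFinite(steps)) {
    throw new RangeError('baseSteps must be a finite number');
  }
  return Math.min(MAX_BASE_STEPS, Math.max(MIN_BASE_STEPS, Math.round(steps)));
}

// Hypothetical type guard: narrows an arbitrary string to a known edit mode.
function isEditMode(value: string): value is EditMode {
  return (EDIT_MODES as readonly string[]).includes(value);
}
```

A request handler could then reject unknown modes early and pass `clampBaseSteps(requestedSteps)` as `baseSteps`.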

### Step 4: Generate videos with Veo

Use `experimental_generateVideo` for video generation. Video generation is asynchronous and may take several minutes:

```typescript
import { vertex } from '@ai-sdk/google-vertex';
import { experimental_generateVideo as generateVideo } from 'ai';

const { video } = await generateVideo({
  model: vertex.video('veo-3.1-generate-001'),
  prompt: 'Aerial drone shot of a coral reef with tropical fish',
  aspectRatio: '16:9',
  resolution: '1920x1080',
  duration: 8,
});
```
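
The call above passes both `aspectRatio` and `resolution`, and the two need to agree (here `1920x1080` is 16:9). A minimal sketch of a pre-flight consistency check follows; `resolutionMatchesAspectRatio` is a hypothetical helper, and whether the API itself rejects a mismatch is not covered by this document:

```typescript
// Hypothetical helper (not an SDK function): returns true when a
// "WIDTHxHEIGHT" resolution string has exactly the given "W:H" aspect ratio.
function resolutionMatchesAspectRatio(resolution: string, aspectRatio: string): boolean {
  const res = /^(\d+)x(\d+)$/.exec(resolution);
  const ratio = /^(\d+):(\d+)$/.exec(aspectRatio);
  if (!res || !ratio) {
    return false; // Malformed input never matches.
  }
  const width = Number(res[1]);
  const height = Number(res[2]);
  const ratioW = Number(ratio[1]);
  const ratioH = Number(ratio[2]);
  if (width <= 0 || height <= 0 || ratioW <= 0 || ratioH <= 0) {
    return false;
  }
  // Cross-multiply instead of dividing to avoid floating-point error.
  return width * ratioH === height * ratioW;
}
```

With this check, `resolutionMatchesAspectRatio('1920x1080', '16:9')` is `true`, while pairing `'1920x1080'` with `'9:16'` is `false`.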

Available Veo models:

| Model | Audio |
|-------|-------|
| `veo-3.1-generate-001` | Yes |
| `veo-3.1-fast-generate-001` | Yes |
| `veo-3.0-generate-001` | Yes |
| `veo-3.0-fast-generate-001` | Yes |
| `veo-2.0-generate-001` | No |
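
The table can be encoded as a lookup so that audio is only requested from models that support it. This is a sketch under the assumption that the table above is complete and current; `resolveGenerateAudio` is a hypothetical helper name:

```typescript
// Audio support per model, copied from the table above.
const VEO_AUDIO_SUPPORT = new Map<string, boolean>([
  ['veo-3.1-generate-001', true],
  ['veo-3.1-fast-generate-001', true],
  ['veo-3.0-generate-001', true],
  ['veo-3.0-fast-generate-001', true],
  ['veo-2.0-generate-001', false],
]);

// Hypothetical helper: returns the value to use for `generateAudio`,
// turning the request off when the chosen model has no audio support.
function resolveGenerateAudio(modelId: string, requested: boolean): boolean {
  const supported = VEO_AUDIO_SUPPORT.get(modelId);
  if (supported === undefined) {
    throw new Error(`Unknown Veo model: ${modelId}`);
  }
  return requested && supported;
}
```

For instance, `resolveGenerateAudio('veo-2.0-generate-001', true)` returns `false`, so the option can be set safely regardless of which model is selected.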

Configure with provider options:

```typescript
const { video } = await generateVideo({
  model: vertex.video('veo-3.1-generate-001'),
  prompt: 'Time-lapse of a flower blooming',
  aspectRatio: '16:9',
  providerOptions: {
    vertex: {
      generateAudio: true,
      personGeneration: 'allow_adult',
      negativePrompt: 'blurry, shaky, low-resolution',
      pollIntervalMs: 5000,
      pollTimeoutMs: 600000,
    },
  },
});
```

Provider options: `generateAudio` (boolean), `personGeneration`, `negativePrompt`, `gcsOutputDirectory` (GCS URI), `referenceImages` (style guidance), `pollIntervalMs` (interval between status checks, in milliseconds), `pollTimeoutMs` (maximum wait in milliseconds; default 10 minutes).
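
To get a feel for how the two polling options interact, the sketch below estimates how many status checks fit into the timeout. It assumes the SDK polls at a fixed interval until the timeout elapses, which is an assumption about its behavior rather than something stated here; `maxStatusChecks` is a hypothetical helper:

```typescript
// Hypothetical helper: upper bound on the number of status checks, assuming
// one check per `pollIntervalMs` until `pollTimeoutMs` has elapsed.
function maxStatusChecks(pollIntervalMs: number, pollTimeoutMs: number): number {
  if (!(pollIntervalMs > 0) || !(pollTimeoutMs > 0)) {
    throw new RangeError('pollIntervalMs and pollTimeoutMs must be positive');
  }
  return Math.floor(pollTimeoutMs / pollIntervalMs);
}

// Values from the provider options example above: 5 s interval, 10 min timeout.
const checks = maxStatusChecks(5000, 600000); // 120
```

A shorter interval notices completion sooner at the cost of more status requests, so 5 to 10 seconds is a reasonable starting point for jobs that take minutes.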

## Examples

### Example 1: Product photography pipeline

**User request:** "Generate product photos for an e-commerce listing of a ceramic mug"

**Actions taken:**

```typescript
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';
import fs from 'fs';

const backgrounds = [
  'Minimalist white marble countertop with soft natural lighting',
  'Cozy breakfast table with morning sunlight and croissants',
  'Modern office desk with laptop and notebook, shallow depth of field',
];

for (const [i, scene] of backgrounds.entries()) {
  const { image } = await generateImage({
    model: vertex.image('imagen-4.0-generate-001'),
    prompt: `Professional product photo of a handmade ceramic coffee mug, earth-tone glaze, ${scene}`,
    aspectRatio: '1:1',
    providerOptions: {
      vertex: {
        negativePrompt: 'text, watermark, logo, blurry, oversaturated',
        addWatermark: false,
      },
    },
  });

  fs.writeFileSync(`mug-scene-${i + 1}.png`, Buffer.from(image.base64, 'base64'));
  console.log(`Saved mug-scene-${i + 1}.png`);
}
```

**Expected output:** Three 1:1 product images saved as PNG files, each showing the mug in a different setting.

### Example 2: Video ad generation with audio

**User request:** "Create a short video ad for a hiking app launch"

**Actions taken:**

```typescript
import { vertex } from '@ai-sdk/google-vertex';
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'fs';

const { video } = await generateVideo({
  model: vertex.video('veo-3.1-generate-001'),
  prompt: `Cinematic drone shot following a solo hiker ascending a mountain trail
at golden hour. Camera starts low behind the hiker and rises to reveal a
panoramic vista of snow-capped peaks. Style: epic, aspirational, warm color
grading. Text overlay space at the top third of the frame.`,
  aspectRatio: '9:16',
  resolution: '1080x1920',
  duration: 8,
  providerOptions: {
    vertex: {
      generateAudio: true,
      negativePrompt: 'shaky camera, low quality, overexposed, urban elements',
      pollTimeoutMs: 600000,
    },
  },
});

fs.writeFileSync('hiking-app-ad.mp4', Buffer.from(video.base64, 'base64'));
console.log('Saved hiking-app-ad.mp4');
```

**Expected output:** An 8-second vertical video with generated audio, saved as MP4.

### Example 3: Image editing — background swap

**User request:** "Replace the background of this product photo with a beach scene"

**Actions taken:**

```typescript
import { vertex } from '@ai-sdk/google-vertex';
import { generateImage } from 'ai';
import fs from 'fs';

const sourceImage = fs.readFileSync('./product-original.png');
const mask = fs.readFileSync('./background-mask.png');

const { images } = await generateImage({
  model: vertex.image('imagen-3.0-capability-001'),
  prompt: {
    text: 'Sandy tropical beach at sunset with palm trees and calm ocean waves',
    images: [sourceImage],
    mask,
  },
  providerOptions: {
    vertex: {
      edit: {
        mode: 'EDIT_MODE_BGSWAP',
        maskMode: 'MASK_MODE_USER_PROVIDED',
        baseSteps: 60,
      },
    },
  },
});

fs.writeFileSync('product-beach-bg.png', Buffer.from(images[0].base64, 'base64'));
console.log('Saved product-beach-bg.png');
```

**Expected output:** The original product preserved with a new beach background.

## Guidelines

- Always use `aspectRatio` instead of `size` for Imagen models — `size` is not supported.
- Use `imagen-4.0-generate-001` as the default for new image generation. Use `imagen-3.0-capability-001` only for editing operations.
- Set `pollTimeoutMs` to at least 600000 (10 min) for Veo video generation — it can take several minutes, especially for higher resolutions or longer durations.
- Use `negativePrompt` to refine outputs: list specific artifacts to avoid (blurry, distorted, watermark) rather than vague terms.
- For production pipelines, specify `storageUri` (images) or `gcsOutputDirectory` (videos) to write directly to Cloud Storage instead of handling base64 in memory.
- Video generation with Veo is experimental (`experimental_generateVideo`). The API may change between SDK versions.
- Models with `fast` in the name trade quality for speed — use them for drafts and iteration, switch to standard models for final output.
- `personGeneration` controls whether people may appear in the output. On Vertex AI it defaults to `allow_adult`; set `allow_all` when content intentionally includes people of any age, or `dont_allow` to block people entirely.
- GCP billing applies to all Vertex AI media generation. Imagen ultra and Veo 3.1 cost more than their standard/fast counterparts.
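
The model-selection guidelines above can be captured in a small lookup so the choice is made in one place. This is a sketch only: `pickImagenModel` and the `Stage` type are hypothetical names, and the mapping simply restates the defaults recommended in this list:

```typescript
// Hypothetical workflow stages; not an SDK concept.
type Stage = 'draft' | 'final' | 'edit';

// Hypothetical helper: fast model for drafts, standard model for final
// output, and the capability model only for editing operations.
function pickImagenModel(stage: Stage): string {
  switch (stage) {
    case 'draft':
      return 'imagen-4.0-fast-generate-001';
    case 'final':
      return 'imagen-4.0-generate-001';
    case 'edit':
      return 'imagen-3.0-capability-001';
  }
}
```

The returned ID can be passed to `vertex.image(...)`, for example `vertex.image(pickImagenModel('draft'))` while iterating on a prompt.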
