AI Creative with Veo 3, Runway & Flux for Google Ads 2026

Key takeaway

AI creative generation changed the economics of Google Ads creative in 2026: the cost and time to produce a tested variant collapsed, which matters enormously for Performance Max and Demand Gen, the formats that eat creative volume to find winners. The three tools that anchor a practical pipeline are Veo 3 (Google's video model, high-coherence clips with native audio, integrates with the Google ecosystem), Runway (the stronger choice when you need editing control — motion direction, video-to-video, sequence consistency), and Flux (fast, cheap, controllable image generation at scale). They are complementary, not competing — most real pipelines use all three plus a hero asset from traditional production. The win is not any single asset's polish; it is producing the full set of aspect ratios (16:9, 1:1, 9:16) and lengths (6s, 15s, 30s) of a concept cheaply enough to fill every placement and feed the algorithm dozens of variants to test. The hard part is not generation — it is the brand brief that governs output, the human curation gate that rejects the off-brand and the glitched, and the rights/disclosure discipline around AI-generated people.

For a decade, the binding constraint on Google Ads creative was production cost. Filling a Performance Max asset group properly — multiple images in every aspect ratio, video in three orientations, several lengths — meant either a real shoot with its real budget or a long tail of asset gaps the algorithm could never test. By 2026, AI generation collapsed that constraint. Veo 3, Runway, and Flux made it cheap and fast to produce the variant volume that PMax and Demand Gen are built to consume, and that shifted the bottleneck from "can we afford to make this" to "can we curate and brief well enough to make it good."

This guide is a practical walkthrough for creative and growth teams who want to run an AI creative pipeline for Google Ads. It is hands-on rather than speculative: how the three tools compare and where each fits, the brief-to-approved-asset workflow, the exact asset specs for Demand Gen, Performance Max, and YouTube, and the cost, rights, and disclosure realities of running this at account scale. The audience is people producing creative for live campaigns, not researchers — the focus is on what ships and performs.

Volume beats polish: for the variants, not the hero

The instinct from the traditional-production era is to make one beautiful asset. That instinct is exactly wrong for the formats that now dominate Google Ads delivery. Performance Max and Demand Gen are variant-testing engines: they take many assets, combine them, serve the combinations, and concentrate spend on what works. The advertiser's job is to give the engine enough good, on-brand variation to find winners — and AI creative's superpower is producing that variation cheaply. A team that ships one polished hero and three asset-group fillers will lose to a team that ships one polished hero and forty tested AI variants, because the second team gave the algorithm a real search space. Reserve craft and human production for the hero that carries the brand; use AI to multiply the tested variants around it. The mistake is inverting that — using AI for the hero and skimping on the variant volume that actually moves performance.

Why AI creative changed Google Ads economics in 2026

To understand why AI creative matters so much for Google Ads specifically — more than for, say, a single brand film — you have to understand how modern Google Ads campaign types consume creative.

Performance Max and Demand Gen do not run "an ad." They run asset groups: collections of images, videos, headlines, descriptions, and logos that the system combines into many possible ad permutations, serves across Google inventory, measures, and optimizes toward the combinations that perform. Performance Max reaches the entire inventory (Search, Shopping, YouTube, Discover, Gmail, Display, Maps); Demand Gen concentrates on the visual, feed-style surfaces (YouTube, Discover, Gmail). The more assets you supply, and the more varied they are, the larger the search space the algorithm can explore and the better the winning combination it can find. An asset group with two images and one video gives the algorithm almost nothing to optimize; one with a dozen images across aspect ratios and several videos across orientations gives it real room to work.

In the traditional-production era, filling asset groups properly was expensive enough that most advertisers under-supplied them. A shoot produced a hero video and a handful of stills; turning those into the full matrix of 16:9, 1:1, and 9:16 video cuts and 1.91:1, 1:1, and 4:5 image crops meant more editing budget, so teams shipped partial asset groups and left performance on the table. The asset-coverage gap was a direct, measurable drag on PMax and Demand Gen results.

AI creative removes the cost barrier to filling that matrix. Generating the vertical, square, and horizontal cuts of a concept, or producing forty on-brand background variations, or making a 6-second and a 15-second cut of the same idea, now costs cents and minutes rather than hundreds of euros and days. That changes the strategy: instead of rationing creative because it is expensive, you produce abundance because it is cheap, and you let the algorithm sort it out. The economic shift is from scarcity-driven creative (make few, make them count) to abundance-driven creative (make many, test ruthlessly, keep winners).

There is a second-order effect: creative refresh cadence. Ad creative fatigues — performance decays as audiences see the same assets repeatedly. In the traditional model, refresh was slow and expensive, so creative often ran long past its prime. With an AI pipeline, refreshing the variant set is cheap enough to do on a regular cadence, which keeps fatigue at bay and gives the algorithm fresh material to test. The combination of cheap initial volume and cheap refresh is what makes AI creative an economic step-change rather than a marginal improvement for Google Ads.

None of this means polish stopped mattering. It means polish moved to where it counts — the hero asset and the brand craft — while the high-volume, multi-aspect-ratio, frequently-refreshed variant layer that PMax and Demand Gen feed on became cheap to produce at quality. That reallocation is the real story of AI creative in Google Ads.

The three tools: Veo 3, Runway, Flux compared

The three tools that anchor a practical 2026 Google Ads creative pipeline each occupy a distinct role. Understanding the division of labor matters more than picking a single "best" tool, because a real pipeline uses all three.

Veo 3 is Google's flagship video-generation model, accessible through the Gemini app, the Flow filmmaking tool, Google AI Studio, and Vertex AI for programmatic use. Its standout properties for advertising are high temporal coherence (objects and scenes stay consistent across the clip rather than morphing) and native audio generation (it can produce synchronized sound and even speech), which most competing models did not do natively. For a Google Ads team, Veo's ecosystem fit is a genuine advantage: it is the same vendor as the ad platform, the output is designed to work in YouTube and Demand Gen contexts, and the Vertex AI path supports programmatic generation at scale.

Runway (the Gen-4 generation of models and its tooling) is the control-oriented choice. Where Veo excels at "generate a coherent clip from a prompt," Runway excels at directing the generation: motion brushes to control what moves and how, camera-movement controls, video-to-video restyling (take a real clip and restyle it), and tooling for maintaining consistency across a multi-shot sequence. This is what you reach for when you need a crafted, controlled result rather than a fast variant — the shots where the director's intent matters. Runway is a production tool with AI inside, more than a prompt-to-clip box.

Flux (the Flux family of image models from Black Forest Labs) is the image-generation workhorse: fast, high-quality, controllable, and cheap per image through API providers and apps. For Google Ads, where image assets span many aspect ratios across Demand Gen and PMax, Flux's value is volume at quality with reference-conditioning support — you condition generation on approved brand imagery to keep output on-brand, and you generate the dozens of image variants and aspect-ratio crops an asset group wants for cents apiece.

The division of labor in a typical pipeline: Flux produces the image variant volume across every aspect ratio; Veo 3 produces fast, coherent base video clips with sound for the workhorse video placements; Runway handles the shots that need directorial control or restyling; and traditional production (or a high-touch Runway/Veo workflow) produces the one hero asset that anchors the brand campaign. Trying to do everything in one tool produces worse results than using each for its strength — Flux for images, Veo for fast video, Runway for controlled video.

Veo 3 for video: what it does well and badly

Veo 3 is where most Google Ads teams will produce the bulk of their video, so it pays to know its strengths and failure modes precisely.

What Veo 3 does well. Temporal coherence is its headline strength — a generated clip holds together over its duration, with objects, lighting, and scene staying consistent rather than the morphing and flicker that plagued earlier video models. Native audio is the second: Veo can generate synchronized sound effects, ambient audio, and speech, which means a clip arrives closer to finished rather than needing a separate audio pass. Prompt adherence is strong for scene composition and action. And the ecosystem integration — generating through Vertex AI for programmatic pipelines, or through Flow for a more directed filmmaking workflow — gives teams both a scale path and a craft path.

What Veo 3 does badly (the failure modes to curate against). Text rendering inside the video remains unreliable — on-screen text frequently comes out garbled, so do not rely on the model to render your tagline or product name; add text in post instead. Faces and hands, the classic generative-AI weak spots, still produce occasional uncanny or anatomically wrong results, especially in close-up and motion. Physics can glitch — objects passing through each other, implausible motion, liquids behaving wrongly — particularly in complex scenes. Precise product fidelity is hard: if your product has exact branding, proportions, or detail that must be accurate, the model will approximate it, and that approximation may not be accurate enough for a product demo. And fine-grained directorial control (exact camera moves, precise timing) is weaker than in a control-oriented tool like Runway.

The practical workflow implication: Veo 3 is excellent for generating base clips — scenes, moods, b-roll, lifestyle context — that you then finish, with text and precise product shots added in post. It is weaker as a one-shot finished-ad generator. The teams that get the most from it generate many short base clips, curate hard against the failure modes (reject the garbled-text, the wrong-hands, the physics-glitch outputs), and assemble the survivors into finished ads with text and product accuracy handled separately.

A note on the usable rate: early in a pipeline, expect roughly 1 in 5 to 1 in 10 Veo generations to be usable after curation, improving as you learn the prompt patterns that avoid the failure modes. This is normal and is why curation labor, not generation cost, dominates the budget. Build the curation step into the workflow rather than treating every generation as shippable.
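The usable rate translates directly into how many generations, and how much review time, to plan for. The sketch below is a minimal planning aid, not a measurement: the function name `plan_batch` and the 90-second review time are hypothetical placeholders, and the 1-in-5 and 1-in-10 rates are the early-pipeline figures quoted above.

```python
def plan_batch(approved_target: int, one_in_n: int, review_seconds: int) -> dict:
    """Estimate generations and review time for a target number of approved clips.

    one_in_n: expected generations per usable clip (5 for "1 in 5", 10 for "1 in 10").
    review_seconds: assumed average time to review one generation (a placeholder).
    """
    if min(approved_target, one_in_n, review_seconds) < 1:
        raise ValueError("all inputs must be positive integers")
    generations = approved_target * one_in_n
    review_hours = round(generations * review_seconds / 3600, 2)
    return {"generations": generations, "review_hours": review_hours}


# Planning range for 8 approved base clips, assuming 90 seconds per review.
optimistic = plan_batch(8, one_in_n=5, review_seconds=90)
pessimistic = plan_batch(8, one_in_n=10, review_seconds=90)
print(optimistic, pessimistic)
```

Under these assumed inputs, halving the usable rate doubles both the generation count and the review time, which is why improving prompt patterns pays back mainly in curation labor.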

One workflow tactic that lifts the usable rate quickly: generate in small batches around a single tightly-specified prompt, pick the best output, then iterate the prompt based on what failed rather than changing everything at once. Veo responds well to specific scene description (subject, action, setting, lighting, camera framing, mood) and poorly to vague creative direction, so the prompt patterns that survive into your template library tend to be concrete and structured. Keep a running note of which phrasings reliably avoid the failure modes — for instance, framing that keeps faces at mid-distance rather than extreme close-up, or scenes without on-screen text dependency — and feed those learnings back into the brief so the whole team benefits.
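One way to keep prompts concrete and to change one thing at a time is to hold the scene description as structured fields rather than free text. The following is an illustrative sketch only: the `ScenePrompt` class, its field names, and the example wording are assumptions for this guide, and the code builds a prompt string without calling any Veo API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScenePrompt:
    subject: str
    action: str
    setting: str
    lighting: str
    framing: str
    mood: str

    def render(self) -> str:
        """Join the fields into one concrete, structured prompt string."""
        return ", ".join([
            f"{self.subject} {self.action} {self.setting}",
            f"{self.lighting} lighting",
            f"{self.framing} framing",
            f"{self.mood} mood",
            "no on-screen text",  # text is added in post, not by the model
        ])


base = ScenePrompt(
    subject="a cyclist",
    action="riding uphill",
    setting="on a coastal road",
    lighting="golden-hour",
    framing="mid-distance",
    mood="calm",
)
# Iterate one field at a time so a change in output can be traced to its cause.
variant = replace(base, lighting="overcast")
print(base.render())
print(variant.render())
```

Because each variant differs from the base in a single field, a rejected output points at one likely cause, and the phrasings that survive can be copied into the team's template library as-is.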

Runway for video editing and control

Runway earns its place in the pipeline when prompt-to-clip generation is not enough — when you need to direct the result. Its toolset is built around control rather than pure generation, which makes it the right tool for the more crafted shots in a campaign.

Motion and camera control. Runway lets you specify what moves and how — motion brushes to indicate which parts of a frame should animate and in what direction, and camera controls to direct pans, zooms, and dolly moves. For an advertiser this matters when the shot has intent: a controlled push-in on a product, a specific motion that serves the concept, rather than whatever the model decides to animate. This directorial control is the difference between a generated clip and a designed one.

Video-to-video and restyling. One of Runway's most useful advertising capabilities is taking an existing clip — a real product shot, a piece of stock, a previous render — and restyling it while preserving structure and motion. This lets you maintain product accuracy (start from a real shot) while applying a consistent stylistic treatment across a campaign, sidestepping the product-fidelity weakness of pure text-to-video.

Sequence consistency. For a multi-shot piece, keeping characters, settings, and style consistent across shots is hard with one-shot generation. Runway's tooling for reference and consistency across a sequence is stronger, which makes it the better choice when a video tells a small story across several shots rather than living in a single clip.

The role Runway plays in the pipeline: it handles the shots between "fast base clip" (Veo's job) and "full traditional production" (the hero's job) — the controlled, crafted, sequence-consistent shots that need more direction than a prompt but less than a film crew. In practice, many teams use Runway for the brand-adjacent video that needs to feel designed, and Veo for the high-volume base clips. Using both deliberately — Veo for speed and volume, Runway for control and craft — produces a better result than forcing either tool to do the other's job.

The same curation discipline applies: even with more control, Runway output needs human review against the brand checklist before it ships. Control reduces the failure rate; it does not eliminate the need for a gate.

Flux for image generation at scale

Image assets are the highest-volume creative need in Google Ads — every Demand Gen and Performance Max asset group wants multiple images in several aspect ratios, and that demand multiplies across campaigns and refresh cycles. Flux is the tool that meets it economically.

Why Flux for ad images. Flux delivers high image quality with strong prompt adherence at a cost of cents per image through API providers, which is exactly the profile a high-volume image pipeline needs. It handles the things ad images require — clean compositions, product-context scenes, lifestyle imagery, backgrounds — and does so fast enough to generate dozens of variants in the time a designer would take to produce one.

Reference conditioning for brand consistency. Flux's support for reference/conditioning images is what makes it usable for brand work rather than generic stock-style output. By conditioning generation on approved brand imagery — your colour treatment, your product, your visual style — you keep output on-brand instead of relying on the model's defaults, which drift generic. This is the technical mechanism behind brand consistency: not hoping the prompt is detailed enough, but conditioning the generation on references that encode the brand visually.

Aspect-ratio coverage. The single most valuable thing Flux does for a Google Ads pipeline is cheaply produce the same concept across every required aspect ratio: the 1.91:1 landscape, the 1:1 square, the 4:5 portrait, plus the logo ratios. In traditional production, each crop is design labor; with Flux, generating or adapting a concept across ratios is a few generations and a curation pass. This is precisely the asset-coverage gap that used to drag on PMax performance, closed cheaply.
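When a concept is adapted by cropping a larger master rather than regenerated per ratio, the crop geometry is simple arithmetic. The sketch below assumes a centered crop, which is a simplification (a real crop would follow the subject), and the 2400x2400 master size is a hypothetical example. The returned (left, top, right, bottom) tuple follows the box convention used by Pillow's `Image.crop`, though the code itself needs only the standard library.

```python
from fractions import Fraction


def center_crop_box(width: int, height: int, ratio: Fraction) -> tuple:
    """Return (left, top, right, bottom) of the largest centered crop at `ratio` (w/h)."""
    if Fraction(width, height) > ratio:
        # Source is wider than the target: keep full height, trim the sides.
        new_w, new_h = int(height * ratio), height
    else:
        # Source is as tall as or taller than the target: keep full width, trim top and bottom.
        new_w, new_h = width, int(width / ratio)
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


RATIOS = {
    "landscape 1.91:1": Fraction(191, 100),
    "square 1:1": Fraction(1, 1),
    "portrait 4:5": Fraction(4, 5),
}

# Crop boxes for a hypothetical 2400x2400 master image.
boxes = {name: center_crop_box(2400, 2400, r) for name, r in RATIOS.items()}
for name, box in boxes.items():
    print(name, box)
```

A centered crop frequently cuts off the product or the subject, so every cropped result still goes through the same curation pass as a fresh generation.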

The workflow. Generate a concept batch conditioned on references, curate against the brand checklist (on-palette, product-accurate, no artifacts, no unintended likeness), then produce the aspect-ratio set for the survivors. Tag and version the approved images so you can identify winners later from the asset-performance data. The usable rate for images is typically higher than for video — image generation is more mature and the failure modes are more catchable — but the curation gate still applies. Flux turns the image side of an asset group from a budget line into a fast, cheap, on-brand production step, which is exactly what high-volume Google Ads creative needs.
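Tagging and versioning can be as light as a consistent filename scheme that encodes concept, tool, ratio, and version. The scheme below is a hypothetical convention invented for illustration, not a Google Ads requirement; the names `AssetTag` and `parse_filename` and the double-underscore separator are assumptions.

```python
import re
from dataclasses import dataclass

_PATTERN = re.compile(
    r"(?P<concept>[a-z0-9-]+)__(?P<tool>[a-z0-9-]+)__"
    r"(?P<ratio>[0-9.]+x[0-9.]+)__v(?P<version>\d+)\.[a-z0-9]+"
)


@dataclass(frozen=True)
class AssetTag:
    concept: str   # short slug for the creative concept
    tool: str      # which tool produced it, e.g. "flux"
    ratio: str     # aspect ratio as written in the spec, e.g. "1.91:1"
    version: int   # increments on each approved revision

    def filename(self, ext: str = "png") -> str:
        ratio = self.ratio.replace(":", "x")  # ":" is unsafe on some filesystems
        return f"{self.concept}__{self.tool}__{ratio}__v{self.version:02d}.{ext}"


def parse_filename(name: str) -> AssetTag:
    """Recover the tag from a filename so performance data can be grouped by concept."""
    m = _PATTERN.fullmatch(name)
    if m is None:
        raise ValueError(f"not a tagged asset filename: {name}")
    return AssetTag(m["concept"], m["tool"], m["ratio"].replace("x", ":"), int(m["version"]))


tag = AssetTag("summer-trail", "flux", "1.91:1", 3)
print(tag.filename())
```

With the concept slug recoverable from every filename, an asset-performance export can be grouped by concept rather than by individual file, which is what makes it possible to say which idea won.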

The brief-to-approved-asset workflow

The tools are not the hard part. The workflow that turns a brief into approved, on-brand, performance-ready assets is where pipelines succeed or fail. Six stages, each with a clear job.

Stage 1: The brief. Everything starts with a tight, reusable brand brief. It specifies the palette as exact hex values, the typography, logo usage and placement, the tone, the mandatory product-accuracy rules, and — critically — an explicit prohibited list: no implied endorsements, no off-palette output, no inaccurate product depiction, no unintended likeness. A vague brief produces off-brand output no matter how capable the model is; the brief is the brand guardrail. It is also reusable: written once, it governs every generation and only gets refined as you learn.

Stage 2: Reference gathering. Collect approved brand imagery to condition Flux and Runway generation. References are how brand consistency gets enforced technically rather than hoped for — the model conditions on your actual brand visuals instead of its generic defaults.

Stage 3: Generation. Produce the concept batch: image concepts in Flux conditioned on references, base video clips in Veo 3, controlled shots in Runway. For video, expect a usable rate of 1 in 5 to 1 in 10 early on, improving with learned prompt patterns; images typically run higher. Generation is cheap; this stage is fast.

Stage 4: The curation gate. The most important human step. Every generation is reviewed against a brand checklist before it advances: on palette, product-accurate, no anatomy or physics glitches, no garbled text, no unintended likeness, on tone. Reject ruthlessly — the algorithm rewards a smaller set of strong assets over a large set of mediocre ones, and shipping off-brand or glitched assets damages both performance and brand. This gate is what makes the pipeline a controllable production system rather than a vending machine.

Stage 5: Multi-aspect-ratio production. For each approved concept, produce the full matrix of aspect ratios and lengths the placements need, conditioning each cut on the same reference for consistency. This is where AI creative's economics pay off — the cuts that traditional production would bill three times for cost a few generations and a curation pass.

Stage 6: Finishing and export. Add text and precise product shots in post (do not rely on the model to render text or exact product detail), export at required resolutions, and keep provenance metadata intact. Tag and version assets so the later performance review can attribute wins to concepts.

The throughline is that human judgment moves from production (which AI now does) to briefing and curation (which AI cannot). The team's value-add shifts from making each asset by hand to governing what the model produces — a tighter brief, better references, and a stricter curation gate produce better creative than any prompt-engineering trick. Teams that internalize this run productive pipelines; teams that treat AI as a magic asset dispenser ship off-brand noise.
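The brief (Stage 1) and the curation gate (Stage 4) are easier to apply consistently when they exist as explicit data rather than as shared understanding. The sketch below is one possible shape, offered as an assumption: the class `BrandBrief`, the function `curation_gate`, and the example palette are hypothetical, while the six checklist items mirror the ones listed in Stage 4. The review answers come from a human reviewer; the code only records and enforces them.

```python
import re
from dataclasses import dataclass

_HEX = re.compile(r"#[0-9A-Fa-f]{6}")

CHECKLIST = (
    "on_palette",
    "product_accurate",
    "no_anatomy_or_physics_glitches",
    "no_garbled_text",
    "no_unintended_likeness",
    "on_tone",
)


@dataclass(frozen=True)
class BrandBrief:
    palette: tuple      # exact hex values, e.g. ("#0B3D91", "#FFFFFF")
    prohibited: tuple   # explicit prohibited list from the brief

    def __post_init__(self):
        bad = [c for c in self.palette if not _HEX.fullmatch(c)]
        if bad:
            raise ValueError(f"palette entries must be #RRGGBB hex values: {bad}")


def curation_gate(review: dict) -> tuple:
    """Return (approved, failed_items). An unanswered item counts as a failure."""
    failed = [item for item in CHECKLIST if not review.get(item, False)]
    return (not failed, failed)


brief = BrandBrief(palette=("#0B3D91", "#FFFFFF"), prohibited=("implied endorsements",))
review = {item: True for item in CHECKLIST}
review["no_garbled_text"] = False  # reviewer spotted garbled on-screen text
print(curation_gate(review))
```

Treating a missing answer as a failure is a deliberate choice in this sketch: an asset that was not checked against every item does not advance.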

The team's first month of AI creative underperformed their old hand-made ads, and they nearly abandoned it. The problem was not the tools — it was that they had no brief and no curation gate, so they were shipping whatever the model produced. We added a one-page brand brief and a five-point curation checklist, and rejected eight of every ten generations. The next month's AI variants beat the hand-made baseline, because the survivors were on-brand and there were forty of them for the algorithm to test instead of four. AI did not replace the creative team's judgment; it relocated it from the mouse to the brief and the gate.

— From a 2026 creative-pipeline build

Asset specs for Demand Gen, PMax and YouTube

Producing assets is wasted effort if they do not meet the specs the placements require. Here are the core 2026 specs to produce against, organized by what each campaign type consumes.

Image assets (Demand Gen and Performance Max). Supply, at minimum, the three workhorse aspect ratios:

Landscape 1.91:1 — 1200x628 (the classic horizontal),
Square 1:1 — 1200x1200,
Portrait 4:5 — 960x1200.

Plus logo assets in 1:1 (1200x1200) and 4:1 (1200x300). Performance Max and Demand Gen reward filling these fully with multiple images per ratio — more images give the algorithm more combinations. This multi-ratio requirement is exactly what makes Flux valuable: producing the same concept across all three ratios is cheap.
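A pre-upload check against these dimensions can catch mis-sized exports early. The sketch below is a minimal illustration: it treats the sizes listed above as the target floor and allows a 1% ratio tolerance, both of which are assumptions of this example rather than Google's stated acceptance rules, so confirm actual minimums against the official asset specifications.

```python
# Target sizes as listed in this guide (width, height in pixels).
IMAGE_SPECS = {
    "landscape 1.91:1": (1200, 628),
    "square 1:1": (1200, 1200),
    "portrait 4:5": (960, 1200),
    "logo 1:1": (1200, 1200),
    "logo 4:1": (1200, 300),
}


def matches_spec(width: int, height: int, spec: str, tolerance: float = 0.01) -> bool:
    """True if the image has the spec's aspect ratio (within tolerance) and is at least its size."""
    ref_w, ref_h = IMAGE_SPECS[spec]
    ref_ratio = ref_w / ref_h
    same_ratio = abs(width / height - ref_ratio) <= tolerance * ref_ratio
    big_enough = width >= ref_w and height >= ref_h
    return same_ratio and big_enough


print(matches_spec(2400, 1256, "landscape 1.91:1"))  # larger export, same ratio
print(matches_spec(1080, 1080, "square 1:1"))        # right ratio, below the target size
```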

Video assets (YouTube, Demand Gen, Performance Max). Cover the three orientations to reach every placement:

16:9 horizontal — in-stream and standard YouTube,
1:1 square — in-feed,
9:16 vertical — Shorts and vertical placements, increasingly the highest-volume surface.

Resolution: 1080p minimum. Lengths to produce:

6-second bumpers for reach and frequency,
15-30 seconds for the main workhorse video,
longer skippable in-stream where the story justifies it.

The 9:16 vertical cut deserves emphasis: vertical placement volume has grown to the point where a campaign without 9:16 video is leaving a large share of inventory untouched, and producing vertical from a horizontal shoot is exactly the kind of re-cut AI tools do cheaply.

Performance Max asset-group completeness. PMax specifically rewards a fully-populated asset group: multiple images in each ratio, multiple videos in each orientation, several headlines and descriptions, logos. An under-filled asset group constrains the algorithm; a full one gives it a real search space. AI creative's economics make filling the group properly affordable for the first time, which is why the pairing of AI production with PMax is so productive. Our Performance Max asset strategy guide covers how to structure asset groups and read the asset-performance labels.

Demand Gen specifics. Demand Gen (the successor to Discovery campaigns) runs across YouTube, Discover, and Gmail with a visual, feed-style format. It wants the same image ratios plus video, and it benefits from lifestyle, in-context creative rather than hard-sell product shots — which AI tools, conditioned on brand references, produce well.

The practical takeaway: build a spec sheet listing every required ratio, length, and resolution, and produce against it systematically for each concept. The AI pipeline makes hitting the full spec matrix cheap; the discipline is producing the complete set rather than the convenient subset.
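The spec sheet can be enumerated mechanically, which makes the "complete set versus convenient subset" gap visible per concept. The sketch below assumes the three video orientations, the 6s/15s/30s lengths, and the three image ratios named in this guide; the function names and the tuple layout are hypothetical.

```python
from itertools import product

VIDEO_ORIENTATIONS = ("16:9", "1:1", "9:16")
VIDEO_LENGTHS_S = (6, 15, 30)
IMAGE_RATIOS = ("1.91:1", "1:1", "4:5")


def required_deliverables(concept: str) -> set:
    """Every (concept, kind, ratio, length) cut the spec sheet asks for."""
    videos = {
        (concept, "video", orientation, f"{seconds}s")
        for orientation, seconds in product(VIDEO_ORIENTATIONS, VIDEO_LENGTHS_S)
    }
    images = {(concept, "image", ratio, "still") for ratio in IMAGE_RATIOS}
    return videos | images


def missing_deliverables(concept: str, produced) -> list:
    """Cuts still to produce, sorted for stable reporting."""
    return sorted(required_deliverables(concept) - set(produced))


# A team that shipped only the horizontal video cuts and the square image.
produced = [("summer-trail", "video", "16:9", f"{s}s") for s in VIDEO_LENGTHS_S]
produced.append(("summer-trail", "image", "1:1", "still"))
gaps = missing_deliverables("summer-trail", produced)
print(len(required_deliverables("summer-trail")), len(gaps))
```

In this example the concept needs twelve cuts and eight are still missing, including every vertical video, which is the kind of gap that is easy to overlook without a checklist.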

Cost, rights, and disclosure at account scale

Running an AI creative pipeline at account scale raises three practical concerns beyond generation quality: what it costs, who owns the rights, and what you must disclose.

Cost. The generation cost is modest and the labor cost dominates. As 2026 order-of-magnitude figures: Veo 3 video runs from a few cents to a couple of euros per usable clip depending on length, resolution, and the access path (Vertex AI per-second pricing versus consumer-tier generations); Runway sells credit-bearing subscriptions in the tens-to-low-hundreds of euros monthly; Flux images cost cents apiece through API providers. For a mid-sized account producing 40-80 variants a month, tool credits land in the low hundreds of euros — but the real cost is the creative operator's time to brief, curate (rejecting the majority of generations), and finish. Budget for curation labor explicitly; it is the line item that determines output quality, and it is far cheaper than the thousands per finished video that traditional production costs. The economic case is strong, but it is a "cheap tools, paid human judgment" model, not a free one.
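The "cheap tools, paid human judgment" split is easy to see with a small model. Every number passed in below is a made-up placeholder chosen for round arithmetic, not a quoted price or a benchmark; substitute your own per-generation cost, subscription fees, review times, and hourly rate. The function name `monthly_cost` is hypothetical.

```python
def monthly_cost(
    variants: int,               # approved variants shipped per month
    one_in_n: int,               # generations needed per approved variant
    cost_per_generation: float,  # blended tool cost per generation (currency units)
    subscriptions: float,        # fixed monthly tool subscriptions
    review_minutes: float,       # curation time per generation
    finish_minutes: float,       # finishing time per approved variant
    hourly_rate: float,          # cost of the creative operator's time
) -> dict:
    generations = variants * one_in_n
    tool_cost = generations * cost_per_generation + subscriptions
    labor_hours = (generations * review_minutes + variants * finish_minutes) / 60
    labor_cost = labor_hours * hourly_rate
    return {
        "generations": generations,
        "tool_cost": round(tool_cost, 2),
        "labor_hours": round(labor_hours, 2),
        "labor_cost": round(labor_cost, 2),
        "total": round(tool_cost + labor_cost, 2),
    }


# Placeholder inputs: 60 variants at a 1-in-5 usable rate.
estimate = monthly_cost(
    variants=60, one_in_n=5, cost_per_generation=0.5, subscriptions=100.0,
    review_minutes=1, finish_minutes=10, hourly_rate=50.0,
)
print(estimate)
```

With these placeholder inputs, labor comes out at three times the tool cost, which illustrates why the curation line item, not the credits, is the one to budget explicitly.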

Rights and ownership. Three issues to manage. First, output ownership and commercial use — the terms of each tool govern whether you can use output commercially and what rights you have; the major tools' paid/commercial tiers permit advertising use, but read the terms for your tier rather than assuming. Second, likeness and right of publicity — AI can inadvertently generate output resembling a real, identifiable person, which creates likeness and false-endorsement exposure; avoid prompting for named individuals, review generated people for unintended resemblance, and for anything depicting a real person, get explicit rights, because AI does not change likeness law. Third, training-data and IP concerns — there is ongoing legal uncertainty around generative-model training data; the practical mitigation for advertisers is to condition on your own brand references, avoid prompting in the style of specific living artists or recognizable IP, and keep generations original to your brand.

Disclosure and provenance. Synthetic-media disclosure is tightening, and advertisers must keep current with it. Key points: Google requires disclosure of synthetic content in sensitive categories (notably election advertising, where AI-altered content must be disclosed) and is broadening labeling generally; AI output increasingly carries provenance metadata (SynthID watermarking on Veo output, C2PA content credentials) that platforms read and may surface — keep this metadata intact rather than stripping it; and regulation such as the EU AI Act imposes transparency obligations around synthetic media depicting real people. The advertiser's safe posture: ensure creative is truthful (an AI-generated demo must show what the product actually does), comply with category-specific disclosure rules, preserve provenance metadata, and disclose when in doubt on a sensitive category. Truthfulness is the non-negotiable core — AI makes it easy to generate a compelling depiction of a product doing something it does not do, and that is both a policy violation and a trust risk.

Handled with discipline, an AI creative pipeline gives a Google Ads account something it could not afford before: enough on-brand, spec-complete, frequently-refreshed creative variation to actually feed Performance Max and Demand Gen the search space they need — at a fraction of traditional cost, with the human team's judgment redeployed from production to briefing, curation, and the rights and disclosure discipline that keeps it safe.

If you want a review of whether your creative is actually feeding Google's algorithms the variety they need — asset-group completeness, aspect-ratio coverage, and refresh cadence — alongside the bidding and structure analysis, SteerAds runs a free 14-day audit that includes a creative-coverage review.

For related reading, see our Performance Max asset strategy guide and our overview of YouTube and video ad formats for Google Ads.

Sources

Official and third-party sources consulted for this guide:

deepmind.google/models/veo: official Veo model documentation, capabilities, audio generation, and SynthID watermarking
cloud.google.com/vertex-ai (video generation): Veo on Vertex AI, programmatic generation, pricing model
runwayml.com: Runway Gen-4 capabilities, motion/camera control, video-to-video, and credit pricing
support.google.com (asset specs): official Google Ads image and video asset specifications for Demand Gen, Performance Max, and YouTube
blackforestlabs.ai: Flux image model documentation, capabilities, and reference conditioning

FAQ

Which AI tool is best for Google Ads video — Veo 3, Runway, or something else?

For Google Ads specifically, Veo 3 (Google's own video generation model, available through the Gemini app, Google AI Studio, and Vertex AI) is the natural starting point in 2026 because it produces high-coherence clips with native audio and integrates into the Google ecosystem, which matters when you are producing for YouTube and Demand Gen. Runway (Gen-4 and successors) is the stronger choice when you need editing control — motion brushes, camera direction, video-to-video restyling, and frame-level consistency across a sequence — which is what you want for more crafted brand spots rather than quick variants. The honest answer is that they are complementary: Veo 3 for fast, coherent base clips with sound, Runway for the shots where you need directorial control. Most teams running a real AI creative pipeline use both rather than picking one, and reserve traditional production for the hero asset that anchors the campaign.

Can I use AI-generated video and images directly in Google Ads, or are there policy restrictions?

You can use AI-generated creative in Google Ads, but three policy layers apply. First, Google Ads' general creative policies still apply — no misleading claims, no prohibited content, accurate representation of the product. Second, Google requires disclosure of synthetic content in certain sensitive categories (notably election ads, where AI-altered content must be disclosed) and is expanding labeling broadly. Third, AI-generated media often carries provenance metadata (SynthID watermarking on Veo output, C2PA content credentials) that platforms increasingly read and may surface to users. The practical rule for advertisers: AI creative is fully usable for product and brand advertising, but you must ensure the creative is truthful (an AI-generated demo must show what the product actually does), you must comply with category-specific disclosure rules, and you should keep your provenance metadata intact rather than stripping it. When in doubt on a sensitive category, disclose.

What does an AI creative pipeline actually cost to run for a Google Ads account?

Far less than traditional production, but not free, and the cost is in credits plus labor rather than studio time. As rough 2026 order-of-magnitude figures: Veo 3 video generation is priced per second of output through Vertex AI or per generation in the Gemini/Flow consumer tiers, landing in the range of a few cents to a couple of euros per usable clip depending on length and resolution; Runway sells credit packs (a typical Standard/Pro subscription runs tens to low-hundreds of euros monthly with credits included); Flux image generation is cents per image through API providers. The dominant cost is not the raw generation — it is the human time to write good briefs, curate the 1-in-5-to-1-in-10 generations that are actually usable, and edit them into finished ads. Budget for the curation labor. A realistic monthly cost for a mid-sized account producing 40-80 creative variants is a few hundred euros in tool credits plus the time of one creative operator, versus thousands per finished video in traditional production.

Is AI-generated creative good enough to beat human-made ads in Google Ads, or does performance suffer?

It depends on what you are replacing and how you use it. For high-volume variant production — the many headlines, backgrounds, and short clips that Performance Max and Demand Gen consume to find what works — AI creative routinely matches or beats hand-made variants because the win comes from volume and iteration speed, not from any single asset's polish. The algorithm tests many AI variants cheaply and surfaces the winners. For the hero brand asset that carries emotional weight and brand craft, AI in 2026 is close but not consistently better than a strong human team, and the failure modes (uncanny faces, physics glitches, text rendering) still show up. The pragmatic pattern: use AI for the long tail of variants where volume and speed win, and reserve human craft for the one or two hero assets that anchor brand campaigns. Performance suffers when teams use AI for the hero and skimp on curation; it improves when teams use AI to multiply tested variants.

What are the exact asset specs I need for Demand Gen, Performance Max, and YouTube?

The core specs in 2026: for image assets across Demand Gen and PMax, supply landscape 1.91:1 (1200x628), square 1:1 (1200x1200), and portrait 4:5 (960x1200) at minimum, plus logo assets in 1:1 and 4:1. For video, YouTube and Demand Gen want 16:9 horizontal, 1:1 square, and 9:16 vertical to cover in-stream, in-feed, and Shorts placements; aim for 1080p minimum, with vertical 9:16 increasingly the highest-volume placement. Video length: 6-second bumpers for reach, 15-30 seconds for the main workhorse, and longer for in-stream skippable where the story justifies it. Performance Max wants the full asset group filled — multiple images per aspect ratio, multiple videos per orientation, several headlines and descriptions — because more assets give the algorithm more combinations to test. The AI tooling makes filling all these aspect ratios cheap, which is precisely where it earns its place: generating the 9:16, 1:1, and 16:9 cuts of the same concept that traditional production would charge three times for.
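
A pre-upload check against these specs might look like the sketch below. It is a minimal illustration under stated assumptions: the function names and the 1% ratio tolerance are hypothetical, and treating the listed pixel sizes as a floor is one reading of "at minimum", not a rule taken from Google's documentation.

```python
from fractions import Fraction

# Image sizes from the specs above, treated here as minimum dimensions (assumption).
IMAGE_SPECS = {
    "landscape": (1200, 628),   # 1.91:1
    "square": (1200, 1200),     # 1:1
    "portrait": (960, 1200),    # 4:5
}

# Video orientations needed for in-stream, in-feed, and Shorts placements.
VIDEO_RATIOS = {
    "horizontal": Fraction(16, 9),
    "square": Fraction(1, 1),
    "vertical": Fraction(9, 16),
}


def classify_image(width: int, height: int, tolerance: float = 0.01):
    """Return the matching image slot name, or None if the asset fits no slot."""
    ratio = width / height
    for name, (min_w, min_h) in IMAGE_SPECS.items():
        target = min_w / min_h
        ratio_ok = abs(ratio - target) / target <= tolerance
        if ratio_ok and width >= min_w and height >= min_h:
            return name
    return None


def missing_video_orientations(videos):
    """Given (width, height) pairs, list the orientations not yet covered."""
    present = set()
    for width, height in videos:
        ratio = Fraction(width, height)
        for name, target in VIDEO_RATIOS.items():
            if ratio == target:
                present.add(name)
    return sorted(set(VIDEO_RATIOS) - present)


print(classify_image(1200, 628))
print(missing_video_orientations([(1920, 1080), (1080, 1920)]))
```

Run over a batch of exports before upload, a check like this surfaces the asset gaps (a missing square cut, an undersized portrait image) while they are still cheap to regenerate.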

How do I keep brand consistency when generating creative with AI across many variants?

Brand consistency under AI generation comes from three controls. First, a tight, reusable brief that specifies palette (exact hex values), typography, tone, mandatory elements (logo placement, product accuracy), and prohibited elements — the brief is your brand guardrail, and a vague brief produces off-brand output no matter how good the model is. Second, reference conditioning: Flux and Runway both accept reference images, so you condition generation on approved brand imagery rather than relying on the model's defaults, which keeps colour, style, and product likeness on-brand. Third, a curation gate: a human reviews every generation against a brand checklist before it ships, rejecting the off-palette, the anatomically wrong, and the off-tone. The mistake teams make is treating AI as a vending machine — prompt in, ad out. The teams that keep brand consistency treat it as a controllable production tool: conditioned on references, governed by a brief, and gated by human curation.
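
One way to make the brief and the curation gate concrete is to write the checklist down as data. The sketch below is hypothetical throughout: the `BrandBrief` and `Generation` types, their fields, and the idea that a reviewer tags each generation with its dominant colours and visible elements are assumptions for illustration, not features of Flux, Runway, or Veo. It records a human reviewer's judgment; it does not replace it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandBrief:
    palette: frozenset      # approved hex values, e.g. "#0a3d62"
    mandatory: frozenset    # elements every asset must show, e.g. "logo"
    prohibited: frozenset   # elements that must never appear


@dataclass
class Generation:
    dominant_colors: set    # hex values noted by the reviewer (assumed input)
    elements: set           # elements the reviewer sees in the asset


def rejection_reasons(brief: BrandBrief, gen: Generation) -> list:
    """Return the reasons to reject a generation; an empty list means it passes."""
    reasons = []
    approved = {c.lower() for c in brief.palette}
    off_palette = {c.lower() for c in gen.dominant_colors} - approved
    if off_palette:
        reasons.append(f"off-palette colours: {sorted(off_palette)}")
    missing = brief.mandatory - gen.elements
    if missing:
        reasons.append(f"missing mandatory elements: {sorted(missing)}")
    banned = brief.prohibited & gen.elements
    if banned:
        reasons.append(f"prohibited elements present: {sorted(banned)}")
    return reasons


brief = BrandBrief(
    palette=frozenset({"#0a3d62", "#ffffff"}),
    mandatory=frozenset({"logo"}),
    prohibited=frozenset({"competitor_logo"}),
)
candidate = Generation(dominant_colors={"#FF0000"}, elements={"product"})
for reason in rejection_reasons(brief, candidate):
    print(reason)
```

Keeping the brief as structured data means the same palette, mandatory, and prohibited lists can feed both the prompt and the review checklist, so the two do not drift apart across variants.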

Will AI-generated faces and people in ads cause legal or likeness problems?

Potentially, and this is the area to be most careful in. There are three risks. First, AI models can inadvertently generate output resembling a real, identifiable person, which raises right-of-publicity and likeness claims, so avoid prompting for named individuals and review generated people for unintended resemblance. Second, generating people in contexts that imply endorsement (a face that looks like a celebrity appearing to use your product) is both a likeness and a false-endorsement risk. Third, deepfake and synthetic-media regulations are tightening: the EU AI Act and various national rules require disclosure of synthetic content depicting real people in certain contexts. The safe pattern for advertisers: use AI-generated people for generic, clearly synthetic representation (a stock-style person using a product), keep provenance watermarking intact, avoid anything that could read as a specific real person or implied endorsement, and disclose synthetic content where regulation requires. For anything depicting a real person, get explicit rights; AI does not change the law on likeness.

How does AI creative fit with Performance Max and Demand Gen's own asset generation?

Google's own asset-generation features (the AI that suggests headlines, images, and video from your existing assets inside PMax and Demand Gen) and external tools like Veo, Runway, and Flux are complementary, and the right pattern uses both deliberately. Google's in-platform generation is convenient and free, good for quickly filling gaps in an asset group, but it gives you less control over brand specifics and concept and tends toward generic output. External tools give you full creative control — your brief, your references, your concept — at the cost of doing the production yourself. The pragmatic split: use external tools (Veo/Runway/Flux) to produce your considered, on-brand core assets and the high-volume variant set you actually want to test, and let Google's in-platform generation fill remaining gaps in the asset group rather than carry the creative load. Feed the algorithm your best AI-produced assets plus your hero, and let Google's generation backstop the long tail. See our [Performance Max asset strategy guide](/blog/performance-max-asset-groups-creative-strategy) for the asset-group structure.