Behind the Curtain

15 specialized agents.
5 foundation models + a 126-ad corpus.
Three.js + Remotion 4.

Q: Is it just one big prompt under the hood?

No. A single Reeve render touches 12+ specialized prompts across 4 foundation models, plus a 3D rendering pipeline (Three.js + react-three-fiber + Blender-baked GLB models), an encode pipeline (H.264 with CRF 14 supersampling, ffmpeg lanczos downscale), and a voiceover pipeline (ElevenLabs). The orchestration code is what makes it feel seamless.

People hear “AI generates the video” and picture one model with one prompt. Reeve is not that. Below is the actual engineering: every agent, every foundation model, every layer of the rendering and encode stack — plus the 126-ad competitive corpus we read before we write your hook — by name, in the order they fire when you drop your URL into the box.

The flow

~30s end-to-end

01
URL
Input
02
DNA
Brand profile
Research
Rivals
Competitor scan
Research
Ad Library
Lookup matches
Research
Strategy
Pick template
06
Matrix
Copy · Image · Story
07
3D + Audio
Three.js · ElevenLabs
08
Render
Remotion · ffmpeg

The pipeline

Eight stages.
In order.

What actually happens between the moment you paste your URL and the moment the finished video shows up in your library. Every named agent, every foundation model, every encoding step.

01
Stage 01 · ~6s
Brand DNA — read the URL like a person would
The Brand DNA Profiler agent crawls your homepage, product pages, and about page. It extracts CSS variables for your signature palette, fingerprints your typography, harvests your product imagery, and reads your tone of voice. The Color Extractor isolates your dominant hue from both stylesheets and hero photography. The Image Classifier (Claude vision) tags every scraped image as hero, lifestyle, on-model, or packaging. The Review Widget Detector sniffs out Yotpo or Judge.me embeds and pulls real customer reviews.
Agents
- Brand DNA Profiler
- Color Extractor
- Image Classifier
- Review Widget Detector
Models / Stack
- Claude Sonnet (vision)
- Perplexity (grounding)
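To make the CSS-variable step concrete, here is a minimal sketch of pulling color custom properties out of raw stylesheet text. The function name `extractCssColorVariables` and the hex-only matching are assumptions for illustration; this is not Reeve's actual Color Extractor, which also reads hero photography.

```typescript
// Hypothetical sketch: collect CSS custom properties whose values are hex colors.
// Only #rgb, #rgba, #rrggbb and #rrggbbaa are matched; rgb()/hsl() values are ignored.
function extractCssColorVariables(css: string): Map<string, string> {
  const palette = new Map<string, string>();
  const pattern =
    /(--[\w-]+)\s*:\s*(#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4}))\s*(?:;|\})/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(css)) !== null) {
    // Normalize to lowercase so "#FFB300" and "#ffb300" count as the same color.
    palette.set(match[1], match[2].toLowerCase());
  }
  return palette;
}

const sampleCss = ":root { --brand-primary: #FFB300; --radius: 8px; --ink: #111; }";
const palette = extractCssColorVariables(sampleCss);
console.log(Array.from(palette.entries()));
```

Non-color variables such as `--radius` are skipped because their values do not start with `#`.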
02
Stage 02 · ~5s
Strategy — pick the hook before writing the script
Two strategists run in parallel. The TikTok Strategist consumes the brand DNA plus your competitor ad library (scraped from Meta Ads Library, Google Ads Transparency, and TikTok Creative Center) and proposes 3 candidate hooks. For debate-format ads the Debate Strategist stages a 4–6 turn argument where one side is pre-determined to win — but both sides argue genuinely, with their own brand voice. Final hook + script is selected by the Magic Ad Director orchestrator.
Agents
- TikTok Strategist
- Debate Strategist
- Magic Ad Director
Models / Stack
- Claude Opus (reasoning)
- Claude Sonnet (drafting)
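One way to picture "two strategists run in parallel, one orchestrator selects" is the sketch below. Everything in it is hypothetical: the `HookCandidate` shape, the numeric `score`, and the highest-score rule are assumptions, since the page does not say how the Magic Ad Director actually chooses.

```typescript
interface HookCandidate {
  source: string; // which strategist proposed it
  hook: string;
  score: number;  // hypothetical 0..1 quality estimate
}

type Strategist = (brandDna: string) => Promise<HookCandidate[]>;

// Pure selection step: highest score wins, the earliest candidate wins ties.
function selectBestHook(candidates: HookCandidate[]): HookCandidate {
  if (candidates.length === 0) throw new Error("no hook candidates");
  return candidates.reduce((best, c) => (c.score > best.score ? c : best));
}

// Run all strategists concurrently, then hand the pooled candidates to the selector.
async function pickHook(brandDna: string, strategists: Strategist[]): Promise<HookCandidate> {
  const batches = await Promise.all(strategists.map((run) => run(brandDna)));
  const pooled: HookCandidate[] = [];
  for (const batch of batches) pooled.push(...batch);
  return selectBestHook(pooled);
}
```

Keeping the selection step pure makes it easy to test without calling any model.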
03
Stage 03 · ~9s
Imagery — preserve the product, restage the world
Reeve never invents your product. The pipeline picks a real product photo classified earlier, then runs it through OpenAI gpt-image-1 in image-to-image mode, the only widely available foundation model that preserves product fidelity at this quality. For lifestyle backdrops and b-roll we use Google Imagen 3 and Gemini 2.5 Flash Image for speed. Output is normalized to the target aspect (9:16, 1:1, 16:9) and color-graded against the brand palette before it ever touches the renderer.
Agents
- Image Classifier
- Magic Ad Director
Models / Stack
- OpenAI gpt-image-1
- Google Imagen 3
- Gemini 2.5 Flash Image
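Aspect normalization can be sketched as a center crop. The helper below is an assumption about how such a step might work (Reeve may pad, outpaint, or crop around the subject instead); `centerCrop` and `CropRect` are illustrative names.

```typescript
type Aspect = "9:16" | "1:1" | "16:9";

interface CropRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Largest centered rectangle with the requested aspect that fits inside the source.
function centerCrop(srcWidth: number, srcHeight: number, aspect: Aspect): CropRect {
  const [w, h] = aspect.split(":").map(Number);
  if (srcWidth * h > srcHeight * w) {
    // Source is wider than the target: keep full height, trim the sides.
    const width = Math.round((srcHeight * w) / h);
    return { x: Math.floor((srcWidth - width) / 2), y: 0, width, height: srcHeight };
  }
  // Source is taller than (or equal to) the target: keep full width, trim top and bottom.
  const height = Math.round((srcWidth * h) / w);
  return { x: 0, y: Math.floor((srcHeight - height) / 2), width: srcWidth, height };
}

console.log(centerCrop(1920, 1080, "9:16"));
```

A landscape 1920×1080 source cropped to 9:16 keeps its full height and loses most of its width, which is why the classifier's choice of source photo matters for vertical formats.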
04
Stage 04 · —
3D scene — Three.js inside Remotion, with real meshes
Phone mockups, MacBook hero shots, packaging shots, and the Phone Corkscrew rotation are real 3D scenes — not video templates. Each scene is React Three Fiber wrapped in a Remotion composition. The iPhone, MacBook, and custom packaging meshes are Blender-baked GLB files with up to 8 PBR texture channels (albedo, normal, roughness, metalness, AO, emissive, displacement, opacity). The product image gets composited onto the screen via OffthreadVideo. Background scenes use blurred backdrops from the imagery stage above.
Agents
- Magic Ad Director
Models / Stack
- Three.js + react-three-fiber
- Blender-baked GLB meshes
05
Stage 05 · ~4s
Audio — script to voiceover to scored beat
The script and on-screen text feed into ElevenLabs with a voice ID matched to your brand persona. We never use OpenAI TTS for production output — ElevenLabs is materially better at ad-grade emotional pacing. The Sound Curator agent picks a trending sound from the TikTok Creative Center catalog that fits the brand vibe and ducks under the voiceover at the right amplitude.
Agents
- Sound Curator
Models / Stack
- ElevenLabs (voiceover)
- TikTok Creative Center catalog
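Ducking means lowering the music gain while the voiceover is speaking. The sketch below shows the arithmetic only; the -12 dB default and the hard on/off switch (no fade ramps) are assumptions, not Reeve's actual mix settings.

```typescript
interface Segment {
  start: number; // seconds
  end: number;   // seconds
}

// Convert a decibel offset to a linear amplitude multiplier.
function dbToGain(db: number): number {
  return Math.pow(10, db / 20);
}

// Music gain at time t: ducked while any voiceover segment is active, full volume otherwise.
function musicGainAt(t: number, voiceover: Segment[], duckDb: number = -12): number {
  const speaking = voiceover.some((s) => t >= s.start && t < s.end);
  return speaking ? dbToGain(duckDb) : 1;
}

const voiceoverSegments: Segment[] = [{ start: 0.5, end: 3.0 }];
console.log(musicGainAt(1.0, voiceoverSegments), musicGainAt(4.0, voiceoverSegments));
```

A production mix would normally ramp the gain over a short attack and release window rather than switching instantly.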
06
Stage 06 · ~8s
Render — Chromium, WebGL, frames
Render workers run on AWS EC2 at videoapi.meetreeve.com. Each render boots a headless Chromium instance — the same browser engine you use — and lets Remotion 4 drive the composition frame-by-frame. WebGL is backed by ANGLE, the same translation layer Chrome uses on Windows. We render at 60fps native, supersampled, with H.264 + x264 at CRF 14 — visually lossless, larger files, deliberately preserved for downstream mastering.
Models / Stack
- Remotion 4
- Headless Chromium + ANGLE
- H.264 / x264 CRF 14
07
Stage 07 · ~2s
Captions + per-platform delivery
The Caption Writer agent ships per-platform copy. TikTok captions get the algorithm-friendly hashtag block. Reels captions truncate at 125 characters and lead with the hook. YouTube Shorts gets a longer description with timestamps. Each platform has its own truncation rules, hashtag conventions, and emoji placement guidance baked into the prompt.
Agents
- Caption Writer
Models / Stack
- Claude Haiku (per-platform variant generation)
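Per-platform truncation can be enforced in code after the model writes the caption. This is a hedged sketch: only the 125-character Reels limit comes from the text above, and the word-boundary rule, the ellipsis, and the name `truncateCaption` are assumptions.

```typescript
const REELS_LIMIT = 125; // from the stage description above

// Trim a caption to `limit` UTF-16 code units, cutting at the last space when possible.
function truncateCaption(text: string, limit: number): string {
  if (text.length <= limit) return text;
  const ellipsis = "…";
  const slice = text.slice(0, limit - ellipsis.length);
  const lastSpace = slice.lastIndexOf(" ");
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return cut.replace(/\s+$/, "") + ellipsis;
}

const longCaption = "word ".repeat(50);
console.log(truncateCaption(longCaption, REELS_LIMIT).length);
```

Counting UTF-16 code units is a simplification; emoji-heavy captions would need grapheme-aware counting.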
08
Stage 08 · ~3s
Master + deliver — ffmpeg lanczos to final MP4
The CRF 14 render is mastered down through ffmpeg with the lanczos resampler, a windowed-sinc filter widely used for high-quality downscaling. 4K output for the free render and Pro plan; 1080p for Starter. Final MP4 lands in S3 and the share link is ready in your library.
Models / Stack
- ffmpeg lanczos
- AWS S3
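The mastering step maps onto ffmpeg's `scale` filter with `flags=lanczos`. The sketch below builds that argument list; it assumes a 9:16 portrait output (2160×3840 for 4K, 1080×1920 for 1080p), and the function names and file paths are hypothetical. Codec and bitrate settings for the delivery file are deliberately left out.

```typescript
type Plan = "free" | "starter" | "pro";

// Short-side pixel count: 1080 for Starter, 2160 (4K) for free renders and Pro.
function shortSideFor(plan: Plan): number {
  return plan === "starter" ? 1080 : 2160;
}

// Build ffmpeg arguments that downscale a 9:16 master with the lanczos scaler.
function buildMasterArgs(input: string, output: string, plan: Plan): string[] {
  const width = shortSideFor(plan);
  const height = (width * 16) / 9;
  return ["-i", input, "-vf", `scale=${width}:${height}:flags=lanczos`, output];
}

console.log(["ffmpeg", ...buildMasterArgs("master.mp4", "final.mp4", "pro")].join(" "));
```

Returning an argument array rather than a shell string avoids quoting problems when the list is passed to a process spawner.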

The agents

Fifteen specialists,
not one generalist.

Each agent owns a single job — its own system prompt, its own evaluation rubric, its own model and temperature. Specialists outperform a monolithic prompt the same way a film crew outperforms one person with a phone.

15 agents · running in parallel

01 · Vision + reasoning
Brand DNA Profiler
Crawls the homepage, product pages, and about page. Extracts your color palette from CSS plus hero imagery, fingerprints typography, harvests product photography, and reads your tone of voice. The output is a structured DNA file every other agent consumes as a system prompt.
02 · Hook authoring
TikTok Strategist
Consumes the brand DNA plus competitor ad intel from Meta Ads Library, Google Ads Transparency, and TikTok Creative Center scraping. Proposes three candidate hooks per render, writes the script, and specifies on-screen text placement.
03 · Adversarial scriptwriting
Debate Strategist
Stages a 4–6 turn debate where one side is pre-determined to win — but both sides argue genuinely from their own brand voice. The format outperforms standard ad scripts on retention because the conflict is real, not synthetic.
04 · Per-platform copy
Caption Writer
Ships TikTok, Instagram Reels, and YouTube Shorts caption variants — each with the right truncation, hashtag conventions, and emoji placement for that platform's algorithm.
05 · Trending audio match
Sound Curator
Picks a trending sound from the TikTok Creative Center catalog that fits your brand tone, then specifies the duck-under amplitude for the voiceover mix.
06 · Orchestration
Magic Ad Director
The conductor. Picks which strategist to use, sequences the imagery + 3D + audio + render stages, handles retries when a sub-agent returns low-confidence output, and routes the final composition to the encoder.
07 · Vision tagging
Image Classifier
Reads every scraped product photo and tags it as hero, lifestyle, on-model, or packaging. Downstream stages query this index instead of guessing which image belongs in which slot.
08 · Palette mining
Color Extractor
Pulls signature brand colors from CSS variables, hero photography, and logo glyphs. Returns a structured palette that the renderer respects across every scene — your amber stays your amber.
09 · Social proof ingest
Review Widget Detector
Sniffs out Yotpo, Judge.me, and Reviews.io embeds on your site. The connectors then fetch real customer reviews so testimonial scenes use your actual voice — not invented quotes.
10 · Data fetcher
Yotpo Connector
If your site uses Yotpo, the connector authenticates with their API and pulls verified review text + ratings. Used by the testimonial scene composer.
11 · Data fetcher
Judge.me Connector
Same pattern as Yotpo — authenticated review fetch into structured JSON. Together with Yotpo this covers the majority of Shopify-based DTC review widgets.
12 · Market intel
Competitor Ad Scraper
ScreenPilot-driven scraping across Meta Ads Library, Google Ads Transparency, and TikTok Creative Center. Surfaces competitor hook patterns to the Strategist agents — so your hook doesn't accidentally rhyme with the ad you're trying to beat.
13 · Rival discovery
Competitor Finder
Identifies 3–5 named rivals from a brand URL using a multi-stage discovery pipeline: vertical classification → enumeration → validation. Multi-stage beats single-prompt because the model gets a chance to verify before it commits.
14 · Corpus retrieval
Ad Library Search
Queries Reeve's 126-ad mobile UA corpus by vertical + hook formula to surface the patterns competitors are running right now. Returns ranked ad cards with frame teardowns, hook type, and recreatability scores.
15 · Template selection
Strategy Decider
Reads the brand DNA, the matched competitor ads, and the 8 winning formula specs, then picks the template + register that wins for THIS brand. Output steers the matrix agents — copywriter, image-finder, storyboarder — toward the chosen format.
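The Magic Ad Director's "retry when a sub-agent returns low-confidence output" behavior could look roughly like the loop below. The `AgentResult` shape, the 0.7 threshold, and the three-attempt cap are all assumptions for illustration, not documented Reeve settings.

```typescript
interface AgentResult<T> {
  output: T;
  confidence: number; // hypothetical 0..1 self-reported score
}

// Call the agent until it clears the threshold or attempts run out; return the best result seen.
async function runWithRetry<T>(
  agent: () => Promise<AgentResult<T>>,
  minConfidence: number = 0.7,
  maxAttempts: number = 3
): Promise<AgentResult<T>> {
  let best: AgentResult<T> | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await agent();
    if (best === null || result.confidence > best.confidence) best = result;
    if (result.confidence >= minConfidence) break;
  }
  if (best === null) throw new Error("maxAttempts must be at least 1");
  return best;
}
```

Returning the best attempt rather than the last one means a render can still proceed with the strongest available output when every attempt falls short.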

Foundation models

No single model could do this.
So we use the right one for each job.

Vendor lock-in is a liability when your output quality depends on the frontier. Reeve routes each stage to the model that's best at it today — and swaps when something better ships.

Anthropic
- Claude Opus
- Claude Sonnet
- Claude Haiku
Primary reasoning, vision, scriptwriting, and per-platform caption generation. Opus handles strategy; Sonnet drafts and tags imagery; Haiku ships fast caption variants.

OpenAI
- gpt-image-1
Image-to-image product photography. The only widely available model that preserves real product fidelity at ad-grade quality, which is why we shipped it to production despite ~3× the cost of Gemini.

Google
- Imagen 3
- Gemini 2.5 Flash Image
- Gemini 2.5 Pro
Lifestyle imagery, b-roll, and fast iteration on generated backdrops. Cheaper and faster than gpt-image-1, and used wherever the product itself is not the subject.

ElevenLabs
- Multilingual v2
- Voice library
Production voiceovers. Voice IDs are matched to your brand persona (onyx maps to a Brian voice, for example). We do not use OpenAI TTS for production; ElevenLabs is materially better at ad-grade emotional pacing.

Brave Search
- Image API
- Web API
Independent web index for brand image discovery and competitor intel. Bypasses bot-walls on retailer sites that block direct scraping (Kiehl's, Aesop, Ulta): Brave already crawled them, so we ride the index instead of fighting Cloudflare.

Tavily
- Search API
Agent-native search for high-signal brand sources (official sites, editorial press, beauty trade publications). Returns image URLs alongside web results, with better signal-to-noise than generic search for DNA enrichment.

Perplexity
- Sonar Pro
Search-grounded brand research. When the homepage doesn't surface enough context (small DTC brands, niche B2B), Perplexity grounds the DNA build with live web search and citations.

ScrapeCreators
- Meta Ad Library API
Pulls live competitor creative from Meta's Ad Library across 18+ top advertisers (Royal Match, Bumble, Calm, Temu, etc.) and 8 verticals. Refreshes Reeve's 126-ad corpus weekly so the strategy agent always reads what's actually running right now.

Reeve Ad Corpus
- 126 ads · 11,684 frames
- 18 advertisers · 8 verticals
- 8 winning formulas
Reeve's in-house competitive intelligence: 126 mobile UA ads teardown'd at 0.5s frame granularity, tagged by hook formula, vertical, and recreatability. The strategy agent queries this corpus before it picks your template, so every Reeve render is informed by what's already winning.

Routed through internal model gateway · health-aware fallback · per-tenant key isolation
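"Health-aware fallback" can be read as: try providers in preference order and skip any that are currently marked unhealthy. The sketch below shows only that selection rule; the `Provider` shape, the boolean health flag, and the provider names are hypothetical, and how the gateway actually tracks health is not described here.

```typescript
interface Provider {
  name: string;
  healthy: boolean; // hypothetical flag, e.g. set by a periodic health check
}

// Return the first healthy provider in preference order, or null when none are healthy.
function routeModel(preferenceOrder: Provider[]): Provider | null {
  for (const provider of preferenceOrder) {
    if (provider.healthy) return provider;
  }
  return null;
}

const route = routeModel([
  { name: "primary", healthy: false },
  { name: "secondary", healthy: true },
]);
console.log(route ? route.name : "no healthy provider");
```

Returning `null` instead of throwing leaves the caller free to decide whether to queue the job or fail the render.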

3D + Render

Real meshes.
Real WebGL.
Real ffmpeg.

The phone in your ad isn't a stock template — it's a 3D mesh with multi-channel PBR materials, lit and rendered per-frame. The encoder isn't a black box — it's CRF 14 H.264 with a lanczos master. Every step is auditable.

Composition

Remotion 4

React-based video composition. Every scene is a JSX tree with timeline-aware components. Frames are drawn deterministically: same input, same output, every time. This is what makes redos with feedback actually work: the model can re-render with one prop changed.
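Deterministic rendering comes down to deriving every animated value from the frame number instead of wall-clock time. The sketch below does that for a corkscrew-style phone rotation; `corkscrewPose` and its parameters are hypothetical, and in a Remotion component the frame value would come from `useCurrentFrame()`.

```typescript
interface Pose {
  rotationY: number; // radians
  liftY: number;     // scene units
}

// Pure function of the frame number: identical inputs always give an identical pose.
function corkscrewPose(
  frame: number,
  fps: number,
  turnsPerSecond: number = 0.5,
  liftPerTurn: number = 0.2
): Pose {
  const turns = (frame / fps) * turnsPerSecond;
  return { rotationY: turns * 2 * Math.PI, liftY: turns * liftPerTurn };
}

console.log(corkscrewPose(120, 60));
```

Because the pose depends only on its arguments, changing one prop such as `turnsPerSecond` and re-rendering alters that motion and nothing else.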

3D layer

Three.js + react-three-fiber

Three.js for the WebGL primitives, react-three-fiber for the declarative scene graph. The iPhone Hero, Phone Corkscrew, MacBook, and custom packaging shots are all 3D scenes — not video templates. Camera moves, lighting rigs, and mesh rotations are all parameterized.

Geometry

Blender-baked GLB

iPhone, MacBook, and custom packaging meshes are sculpted in Blender, UV-unwrapped, baked, and exported as GLB. Up to 8 PBR texture channels per mesh — albedo, normal, roughness, metalness, ambient occlusion, emissive, displacement, opacity.

Image surfaces

OffthreadVideo + img-to-img

Product images are composited onto the 3D screen surfaces using OffthreadVideo so playback stays smooth even when 4K still imagery sits on a curved phone screen. Backdrops are blurred and color-graded against the brand palette before they reach the renderer.

GPU

Headless Chromium + ANGLE

The render workers boot a headless Chromium instance — same browser engine you're reading this in. WebGL is backed by ANGLE, the OpenGL-to-system-graphics translation layer Chrome ships on Windows. This means the preview you see in the browser and the final render are pixel-equivalent.

Encode

H.264 / x264 · CRF 14 · supersampled

60fps native render at 2× supersampling. H.264 with x264 at CRF 14 — visually lossless, intentionally large. We keep the master fat so the next stage has room to work.

Master

ffmpeg · lanczos

The supersampled CRF 14 master is downscaled to delivery resolution with the lanczos resampler — the same algorithm used in film mastering. 4K for free renders and Pro plan, 1080p for Starter. Aliasing artifacts that survive cheaper resamplers are gone.

Infra

AWS EC2 · S3 · RDS

Render workers on EC2 at videoapi.meetreeve.com. Brand profiles and render history in RDS PostgreSQL. Final video artifacts in S3 with signed share-link URLs. Auth0 for sessions, Stripe for billing, Vercel for the Next.js frontend.

See it live

Don't take our word for it.
Watch it work on your URL.

Paste your domain. We'll do the brand-DNA scan, find your competitors, surface what they're running, pick the template that wins, and ship the ad. Live, in under 90 seconds. First one's free.

Try it on my URL
no signup · 1 free render per IP

By the numbers

Specifics, on the record.

15 · Specialist agents
7 · Models + search APIs
126 · Competitive ads
11,684 · Frames teardown'd
8 · Winning formulas
48% · Recreatable share
8 · PBR texture channels
60fps · Native render rate
CRF 14 · Master encode
30s · End-to-end render

Frequently asked

Quick answers,
no hand-waving.

Does Reeve look at competitor ads when generating mine?

Yes. Reeve continuously scrapes the Meta Ad Library across 18+ top advertisers (Royal Match, Bumble, Calm, Temu, etc.) and 8 verticals. When you paste your URL, the strategy agent reads what your closest competitors are actually running right now — UGC skits, fail-state hooks, direct-product spotlights — and picks the format that's winning for brands like yours, then steers the matrix agents toward that template. You see every step on screen as it happens.

Which AI generates the videos?

Reeve doesn't use one model; it orchestrates a dozen. Anthropic Claude (Opus, Sonnet, Haiku) handles brand reasoning, scripting, and vision. OpenAI gpt-image-1 generates product photography from your real assets. Google Imagen 3 and Gemini 2.5 Flash Image cover lifestyle and background generation. ElevenLabs handles voiceovers. Final video composition runs on Remotion 4 with Three.js for 3D scenes.

How does Reeve handle product imagery?

We never make up your product. Reeve scrapes the actual photography on your site, classifies it (hero, lifestyle, on-model, packaging) with a vision model, then uses image-to-image generation through OpenAI gpt-image-1 to extend it — preserving the real product while restaging the scene. Output passes through ffmpeg lanczos for clean downscaling to delivery resolution.

Can it match my brand voice?

Yes. The Brand DNA Profiler agent reads your homepage, about page, and product pages, then extracts your tone, vocabulary, sentence rhythm, and signature phrases. The TikTok Strategist and Caption Writer agents both consume that DNA file as a system prompt — so the script and captions sound like you, not like generic AI copy.

Why so many agents instead of one big prompt?

Single-prompt video generation collapses under constraint count. Brand voice, platform format, hook structure, on-screen text rules, voiceover pacing, and 3D scene composition each have hard rules that conflict in subtle ways. We isolate each into a specialist agent with its own system prompt, evaluation rubric, and temperature setting. Specialist agents outperform a monolithic prompt the same way a film crew outperforms one person with a phone.

Is it just one big prompt under the hood?

No. A single Reeve render touches 20+ specialized prompts across 7 stack providers (Anthropic, OpenAI, Google, ElevenLabs, Brave, Tavily, Perplexity), plus a 3D rendering pipeline (Three.js + react-three-fiber + Blender-baked GLB models), an encode pipeline (H.264 with CRF 14 supersampling, ffmpeg lanczos downscale), and a voiceover pipeline (ElevenLabs). The orchestration code is what makes it feel seamless.

What's the difference vs ChatGPT video or Sora?

ChatGPT video and Sora are general-purpose text-to-video models. Reeve is a verticalized brand-ad pipeline: it ingests your real URL, scrapes your real product photos, matches your real brand colors and voice, and renders a finished, captioned, voiced 9:16 ad ready for TikTok, Reels, or Shorts in 30 seconds. The output is deterministic where it needs to be (your logo, your product, your colors) and creative where it should be (script, scene direction, b-roll).

How does Reeve do 3D phone mockups?

Three.js and react-three-fiber inside a Remotion composition. The iPhone, MacBook, and packaging meshes are Blender-baked GLB files with multi-channel PBR texture maps — albedo, normal, roughness, metalness, ambient occlusion. The Chromium headless renderer uses ANGLE-backed WebGL on the EC2 render workers; final composites are OffthreadVideo + img-to-img surfaces with blurred backdrops, encoded at 60fps native then mastered down via ffmpeg lanczos.

Does Reeve write the captions and pick the trending sound?

Yes — both. The Caption Writer agent ships per-platform copy (TikTok, IG Reels, YouTube Shorts) with the right hashtag conventions and truncation rules for each platform's algorithm. The Sound Curator agent picks a trending sound that fits your brand tone from the TikTok Creative Center catalog.

What infrastructure does Reeve run on?

AWS end-to-end: EC2 for the headless Chromium render workers, RDS PostgreSQL for brand profiles and render history, S3 for asset storage and final video artifacts. Auth0 for authentication, Stripe for billing, Vercel for the Next.js frontend. Render endpoint is videoapi.meetreeve.com.

Do I own what Reeve makes?

Yes. You own every render — including the free trial render. No watermark on any plan. The brand DNA file is yours; you can export it. Cancel anytime and your library stays accessible for 90 days.

See it run

Drop your URL. Watch the stack work.

Every component above runs on your free render. No credit card. No watermark. Thirty seconds.

Make my ads
See pricing →
