
How viral AI video effects actually work

Every viral AI video effect this year — cakeify, squish, dancing figurines — runs on one four-step pipeline. Learn the loop once and ship any new trend.

You scroll past three new AI video effects before lunch. Each one looks like a different app — a cakeify app, a squish app, a Ghibli app. None of them exist. They're all the same four steps in a row, and once you can see the loop, you can cook any new trend the same afternoon it lands.

Highlights

  • Every viral AI video effect — cakeify, squish, Ghibli, dancing figurines — runs on the same image-to-video pipeline, not a different tool each time.

  • The skill isn't the model. It's the start frame and the transformation verb. A clean still with one subject animates clean; a busy frame turns to mush halfway through.

  • Motion control is what makes the figurine trend scale: one dance reference clip + your character image = thousands of near-identical clips in a single week.

Open Instagram or TikTok this week and your feed is wallpapered with the same handful of AI video effects. A knife slices a sneaker and reveals sponge cake inside. A bell pepper slowly compresses like memory foam. A boxed action figure of someone you follow climbs out of its pack and starts dancing. The names change every week — cakeify, squish, inflate, the figurine dance — but the workflow does not.

The trick is that these aren't tools. They're recipes. Four ingredients, always in the same order: a start image, an image-to-video model (Kling 2.6 Pro, Runway Gen-4, Hailuo 2.3, Veo 3, Magnific — pick one), a tight transformation prompt, and an upscale pass before posting. The viral name is just packaging wrapped around the loop.

Sources: Raxxo's 2026 viral-effects breakdown and last week's recipe, "Make your AI action figure look real."

The following images were generated using Nano Banana 2:

[Image: product squish, seamless studio, bright high-key, white + glossy red + coral. A single hand presses a red bell pepper flat on a marble board, with foam-like deformation.]

How to ship any viral AI video effect

  1. Start with one clean subject on a plain background. Every flaw in your start frame gets amplified the second the clip starts moving. A tiny artifact on a still becomes a flickering mess in motion. Generate four options, pick the cleanest, and frame it vertically from the very first render. Keep the action in the middle third of the frame: a perfect reveal that happens behind a follow button is the worst kind of wasted clip.

  2. Use a physical verb the model actually understands. "Slice." "Compress." "Inflate." "Crush." "Melt." Vague verbs like "transform" or "change into" give vague results. The transformation needs a real-world action with weight behind it — that's what cues the model to apply physics instead of a generic morph.

  3. Lock the prompt skeleton — subject, transformation, camera, physics word. Same order every time. For cakeify: "a realistic [object] on a plain plate, a knife slices through it revealing moist sponge cake layers inside, slow motion, macro, satisfying." For squish: "a [object] made of soft memory foam, a hand presses down and it compresses then slowly springs back, studio light, macro." Swap the object, keep the structure.

  4. Pull the motion-amount slider on purpose. Most tools default it to medium. Crank it for a dramatic transformation, pull it back for a subtle one — but never leave it on default. That's the first slider to touch, before you even fix the prompt. It's also why two creators running the same prompt get wildly different clips.

  5. Render three times, upscale the winner. The model is a slot machine: roughly one pull in four is good, two are usable, and one is cursed, so never settle for a single render. Run the same prompt with fresh seeds, pick the best take, then upscale to 4K in Magnific before posting. Every platform recompresses your upload on the way in, and a soft clip turns to porridge in the feed. A crisp 4K source survives the squeeze.
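Steps 4 and 5 can be sketched as a small batch plan. This is a minimal illustration, not any tool's real API: `RenderJob`, `plan_renders`, and the 0-to-1 `motion` scale are hypothetical names invented for this example, and treating 0.5 as the "medium" default is an assumption.

```python
import random
from dataclasses import dataclass

DEFAULT_MOTION = 0.5  # assumption: stand-in for a tool's "medium" default


@dataclass(frozen=True)
class RenderJob:
    prompt: str
    seed: int
    motion: float  # hypothetical scale: 0.0 (still) to 1.0 (maximum motion)


def plan_renders(prompt: str, motion: float, takes: int = 3, rng_seed: int = 0) -> list[RenderJob]:
    """Build `takes` jobs that share a prompt and motion value but use distinct seeds."""
    if not 0.0 <= motion <= 1.0:
        raise ValueError("motion must be between 0.0 and 1.0")
    if motion == DEFAULT_MOTION:
        raise ValueError("choose a motion amount on purpose instead of the default")
    rng = random.Random(rng_seed)
    seeds = rng.sample(range(1, 2**31), k=takes)  # sample() never repeats a value
    return [RenderJob(prompt, seed, motion) for seed in seeds]


squish = (
    "a red bell pepper made of soft memory foam, a hand presses down and it "
    "compresses then slowly springs back, studio light, macro"
)
for job in plan_renders(squish, motion=0.8):
    print(job.seed, job.motion)
```

Each job would then be submitted to whichever image-to-video tool you use. Only the seed changes between takes, so any difference between clips comes from the draw, not from the prompt.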

Worked example A: cakeify a wristwatch

The "is it cake" effect has been viral on and off since 2022, and the AI version is the cleanest payoff yet. The trick is the start frame — a hero shot of one object, centered, on a plain backdrop. Then describe the slice in physical terms.

[Image: product cakeify variant, seamless studio, bright high-key, white + chestnut + cream. A luxury leather wristwatch sliced to reveal vanilla sponge with strawberry jam, on a marble surface.]

Prompt:

A luxury silver wristwatch on a plain marble surface, soft studio light, hero shot. A small kitchen knife slowly slices through the watch face diagonally, the cross-section revealing moist vanilla sponge cake layers with thin strawberry jam between them, leather strap intact on either side. Slow motion. Macro lens. Satisfying. Sharp focus. No motion blur on the knife.

One subject. Clean background. Real-world verb ("slices"). Camera direction ("macro," "slow motion"). Physics cue ("satisfying"). Render three times with fresh seeds, pick the cleanest take, and upscale before export.
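That checklist can be turned into a quick lint you run before spending a render. This is a rough sketch: `lint_prompt` is a hypothetical helper, and the word lists are illustrative assumptions drawn from the verbs and cues mentioned in this post, not a complete vocabulary.

```python
import re

# Illustrative word lists (assumptions), taken from the examples in this post.
PHYSICAL_VERBS = {"slice", "slices", "compress", "compresses", "inflate", "inflates",
                  "crush", "crushes", "melt", "melts", "press", "presses"}
VAGUE_PHRASES = ("transform", "change into")
CAMERA_CUES = ("macro", "slow motion")
PHYSICS_CUES = ("satisfying",)


def lint_prompt(prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the prompt passes."""
    text = prompt.lower()
    words = set(re.findall(r"[a-z]+", text))
    problems = []
    if not words & PHYSICAL_VERBS:
        problems.append("no physical verb")
    for phrase in VAGUE_PHRASES:
        if phrase in text:
            problems.append(f"vague phrase: {phrase!r}")
    if not any(cue in text for cue in CAMERA_CUES):
        problems.append("no camera direction")
    if not any(cue in text for cue in PHYSICS_CUES):
        problems.append("no physics cue")
    return problems


# Shortened version of the wristwatch prompt, for illustration only.
watch = ("A small kitchen knife slowly slices through the watch face. "
         "Slow motion. Macro lens. Satisfying.")
print(lint_prompt(watch))
print(lint_prompt("a shoe transforms into cake"))
```

A keyword check like this only catches missing ingredients. It cannot tell you whether the model will actually render the physics well, so it complements the three-take render rather than replacing it.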

Worked example B: make the figurine dance

This is the most technically interesting effect of 2026 because it has two stages. First, generate the boxed figure (we covered that recipe last week). Then use motion control — Kling's headline feature this year — to copy a dance reference clip onto your character. That's how a single dance take fuels thousands of near-identical clips in a single week.

[Image: product figurine motion, seamless studio, bright high-key, mint + coral + yellow pastel gradient. A vinyl collectible action figure of a woman in her 20s mid-dance, with the blister pack ripped open behind her.]

Prompt (for the motion stage, after the figurine image is locked):

Reference video: a short clip of a person doing a simple two-step dance, full body, centered. Character image: boxed action figure of the same person. Generate: the action figure breaks out of the blister pack and performs the dance from the reference clip, keeping the figure's plastic look and joint articulation. 3-second loop. Studio light. Macro.

Two ingredients: the reference and the character. The model handles the in-between. Render three times, upscale the winner.
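The two-ingredient structure can be written down as plain data before you open any tool. The sketch below is not Kling's API or any vendor's request format: `MotionJob`, `build_motion_job`, and the file names are hypothetical, chosen only to show that the motion stage needs both inputs present before it can run.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MotionJob:
    reference_video: str  # path to the dance reference clip
    character_image: str  # path to the locked figurine still
    instruction: str
    duration_s: int = 3   # matches the 3-second loop in the prompt above
    takes: int = 3        # render three times, keep the winner


def build_motion_job(reference_video: str, character_image: str) -> MotionJob:
    """Bundle both ingredients; refuse to build a job if either is missing."""
    if not reference_video or not character_image:
        raise ValueError("motion control needs both a reference clip and a character image")
    instruction = (
        "The action figure breaks out of the blister pack and performs the dance "
        "from the reference clip, keeping the figure's plastic look and joint "
        "articulation. Studio light. Macro."
    )
    return MotionJob(reference_video, character_image, instruction)


# Hypothetical file names for illustration.
job = build_motion_job("dance_reference.mp4", "boxed_figure.png")
print(asdict(job))
```

Keeping the reference clip fixed and swapping only `character_image` is the mechanism that lets one dance take produce many near-identical clips.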

Viral effects expire in days, not weeks. The creators who win are not the ones with a secret app — they're the ones who recognize the recipe the morning it lands, batch five clips while the trend is still fresh, and move on before it goes stale. Learn the four-step loop once and every new effect becomes a variation you already know how to cook. Sit on it and the next two trends will be over before you've shipped your first attempt.

The bottom line

The viral AI video effect of the week is never a new app. It's the same four steps (start image, image-to-video model, transformation prompt, upscale) wrapped in new packaging. The creators who keep landing on the For You page have the loop memorized, and they're already on the next trend by the time their last one peaks. If you don't have a face you can drop into every effect, that's the first thing to fix. Build your AI avatar in 90 seconds.

FAQs

Q: Which model should I use for viral AI video effects?

A: For most transformations — cakeify, squish, inflate — Kling 2.6 Pro and Runway Gen-4 both handle physics cleanly. For motion control (the figurine dance), Kling's motion-control feature is what creators are leaning on this year. Magnific is the all-in-one if you don't want to shuttle clips between four tools.

Q: How long should the clip be?

A: Three to five seconds is the sweet spot. The transformation is the whole point, so land on it fast and loop clean. Longer clips give the model room to wander: the object drifts and hands sprout extra fingers. If you need more time, cut two short takes together instead of asking for one long render.
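One way to cut two short takes together is ffmpeg's concat demuxer. The sketch below only builds the list file and the command without running anything. It assumes ffmpeg is installed and that both takes share the same codec, resolution, and frame rate, which stream copy (`-c copy`) requires. The helper name `concat_plan` and the file names are hypothetical.

```python
import shlex


def concat_plan(takes: list[str], output: str, list_path: str = "takes.txt") -> tuple[str, list[str]]:
    """Return the concat list-file text and the ffmpeg argument list."""
    if len(takes) < 2:
        raise ValueError("need at least two takes to join")
    # Each line names one input; a single quote inside a path is written as '\''.
    lines = ["file '{}'".format(path.replace("'", "'\\''")) for path in takes]
    list_text = "\n".join(lines) + "\n"
    command = ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path, "-c", "copy", output]
    return list_text, command


list_text, command = concat_plan(["take_a.mp4", "take_b.mp4"], "joined.mp4")
print(list_text, end="")
print(shlex.join(command))
```

Write `list_text` to `takes.txt`, then run the printed command. If the takes differ in encoding settings, stream copy will not join them cleanly and you would need to re-encode instead.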

Q: Why does my effect look fine in the model but bad on Instagram?

A: Platform recompression. Every social platform re-encodes your upload aggressively, and a soft 1080p clip turns to porridge in the feed. Upscale to 4K before posting — even though the platform will downscale it back, the crisp source survives the squeeze.
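For a vertical clip, the upscale target is simple arithmetic. This sketch assumes "4K" means UHD (3840 × 2160) turned on its side, so the long edge becomes 3840 pixels. `upscale_target` is a hypothetical helper, not part of any upscaler.

```python
def upscale_target(width: int, height: int, long_edge: int = 3840) -> tuple[int, int, float]:
    """Scale so the longer side reaches `long_edge`, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    factor = long_edge / max(width, height)
    # Round to even numbers, which common video encoders expect.
    new_w = round(width * factor / 2) * 2
    new_h = round(height * factor / 2) * 2
    return new_w, new_h, factor


print(upscale_target(1080, 1920))  # vertical 1080p source
print(upscale_target(720, 1280))   # vertical 720p source
```

A vertical 1080p render needs a 2× upscale and a 720p render needs 3×, which is worth knowing before you choose an upscaler preset.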

Q: Do I need a new prompt for every trend?

A: No. The prompt skeleton — subject, transformation, camera, physics word — covers most effects. Swap the object and the transformation verb; keep the structure. Save your three best skeletons and reuse them.
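Saved skeletons can be as simple as a dictionary of templates. The two entries below are the cakeify and squish skeletons quoted earlier in this post. The `SKELETONS` and `fill` names are hypothetical.

```python
SKELETONS = {
    "cakeify": (
        "a realistic {object} on a plain plate, a knife slices through it revealing "
        "moist sponge cake layers inside, slow motion, macro, satisfying"
    ),
    "squish": (
        "a {object} made of soft memory foam, a hand presses down and it compresses "
        "then slowly springs back, studio light, macro"
    ),
}


def fill(effect: str, obj: str) -> str:
    """Swap the object into a saved skeleton, keeping the structure fixed."""
    if effect not in SKELETONS:
        raise KeyError(f"no saved skeleton for {effect!r}")
    return SKELETONS[effect].format(object=obj)


print(fill("cakeify", "sneaker"))
print(fill("squish", "bell pepper"))
```

Adding a new trend means adding one entry with a new transformation verb. The subject, camera, and physics slots stay where they are.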

Q: What's the single biggest mistake people make?

A: Skipping the start frame. A busy background or a soft subject ruins the take before the model starts moving. One clean subject, plain background, vertical framing — that's the whole game on the input side.
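The framing part of that checklist can be checked mechanically once you know where the subject sits. This sketch assumes you already have a bounding box for the subject, whether measured by hand or from any detector. It also interprets "middle third" as the middle third of the frame's height, which is an assumption. `Box` and `check_start_frame` are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def check_start_frame(width: int, height: int, subject: Box) -> list[str]:
    """Return framing problems; an empty list means the frame passes."""
    problems = []
    if height <= width:
        problems.append("frame is not vertical")
    center_y = (subject.top + subject.bottom) / 2
    if not height / 3 <= center_y <= 2 * height / 3:
        problems.append("subject center is outside the middle third")
    return problems


print(check_start_frame(1080, 1920, Box(340, 700, 740, 1200)))
print(check_start_frame(1920, 1080, Box(0, 0, 100, 100)))
```

This covers framing only. A busy background or a soft subject still needs a human eye.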
