Using Ref-to-Video With Vidu
Disclosure: I'm in the Vidu Artist's Program
This is a long one, so have a listen while you read:
One of the ongoing limitations of genAI is consistency. Image prompting and img-2-vid start frames are helpful, but there's always a part of the character not shown that the AI has to make up.
Vidu addresses this problem through the use of ref-to-video/ref-to-image.
Either a single-use image or a profile of up to three images plus a descriptive prompt takes the place of a noun in your video prompt. It's that thing I've wanted since 2023! Neat!
This method was used to make the vast majority of my Radio Free Ultramerica videos.
I've covered how to use the basic functions of ref-to-vid in other posts. So this will be more about how to make a good character reference.
Quick Tips:
Keep your character prompts backed up in a separate document. Whenever you change the images in the Create Reference menu, the system auto-generates an updated prompt, so backing up your edited ones is essential.
Avoid transparent PNGs. If there's any white haloing around the clipping, it can cause artifacts.
More reference is good reference. A single reference image can work depending on the project, but ideally, you want at least a front view, a side or back view, and a face close-up.
Fortunately, if we've got one piece of reference, we can use it to make others. So, let's build us a monster. A robot monster! (no, not that one)
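For the prompt-backup tip above, here's a minimal sketch of one way to do it with a script instead of a hand-edited document. It assumes you copy the prompt text out of Vidu yourself; it doesn't talk to Vidu at all, and the file layout and function names (backup_prompt, latest_prompt) are my own invention, not anything Vidu provides.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def backup_prompt(store: Path, character: str, prompt: str) -> dict:
    """Append a timestamped copy of a character prompt to a JSON file."""
    # Load earlier backups if the file already exists.
    data = json.loads(store.read_text(encoding="utf-8")) if store.exists() else {}
    # Keep every version, so an auto-generated overwrite never costs you an edit.
    data.setdefault(character, []).append(
        {"saved_at": datetime.now(timezone.utc).isoformat(), "prompt": prompt}
    )
    store.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    return data


def latest_prompt(store: Path, character: str) -> str:
    """Return the most recently saved prompt for a character."""
    data = json.loads(store.read_text(encoding="utf-8"))
    return data[character][-1]["prompt"]
```

Call backup_prompt every time you finish editing a character prompt, then paste latest_prompt's result back in if the system regenerates it on you.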
This is my Thanorite, a water heater with legs I doodled for a pulp RPG campaign that never quite materialized. Sadly, I never scanned his raw lines, and that black is a pain to remove manually, so it's off to (shudder) ChatGPT to get an animation-cel-style interpretation. As loath as I am to admit it, ChatGPT/Sora is currently the best at 'give me this in this style' without massive reinterpretations.
It also, however, sees through an amber-colored lens. Everything ChatGPT generates is more yellowed and darker than it ought to be, so it's off to Photoshop. And while I'm at it, let's get a more interesting color scheme, restore the plasma bubbles, and remove his gun, since asymmetric limbs require a lot more trial and error.
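If you'd rather script a first pass at the yellow cast, here's a rough sketch using the gray-world assumption: scale each channel so the image's average color comes out neutral. To be clear, this is not what I did (I fixed it by hand), it only tackles the color cast and not the darkness, and it works on a plain list of (R, G, B) tuples that you'd have to load with an image library of your choice.

```python
def gray_world_balance(pixels):
    """Scale each channel so the image's average color becomes neutral gray."""
    n = len(pixels)
    if n == 0:
        return []
    # Per-channel means; a yellow cast shows up as a low blue mean.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    target = sum(means) / 3
    # Leave an all-black channel alone rather than divide by zero.
    gains = [target / m if m > 0 else 1.0 for m in means]
    return [
        tuple(min(255, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]
```

Gray-world will over-correct any image that's supposed to be dominated by one color, so treat the result as a starting point and eyeball it.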
Next, I bring the Thanorite into Vidu and set up a ref-2-vid instructing the character to turn around and describing what their back should look like. I'm doing this as a video rather than an image because Vidu's ref-2-img is strongly biased toward the input image and, in its current version, is reluctant to turn characters around.
That part about 'full body, head and feet fully visible'? Very important. Otherwise it will tend toward a mid-closeup.
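If you reuse this kind of turnaround prompt a lot, a tiny template keeps the framing clause from getting dropped. This is only a sketch: the wording apart from the quoted framing phrase is illustrative rather than my exact prompt, and reference_tag is a stand-in for however you name the reference in your own prompt.

```python
# The framing clause quoted above; without it, shots drift toward a mid-closeup.
FRAMING = "full body, head and feet fully visible"


def turnaround_prompt(reference_tag: str, back_description: str) -> str:
    """Build a turnaround prompt that always carries the framing clause."""
    return (
        f"{reference_tag} slowly turns around in place until facing away "
        f"from the camera, {FRAMING}. From behind: {back_description}."
    )


# Hypothetical usage; swap in your own reference name and back description.
print(turnaround_prompt("[Thanorite]", "<what the back should look like>"))
```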
Vidu's got a built-in frame-cap for just this purpose, so I yank some frames, clean them up, and we're good to go:
Tutorial continues under fold:



















