Google researchers find novel way of turning a single photo of a human into AI-generated video good enough to make you think 'this might go badly'
Don't think about the repercussions. Don't think about the repercussions. Don't think about the repercussions.
Google researchers have found a way to generate video of a person from just a single still image. This enables things like generating a video of someone speaking from input text, or changing a person's mouth movements to match an audio track in a different language to the one originally spoken. It also feels like a slippery slope into identity theft and misinformation, but what's AI without a hint of frightening consequences?
The tech itself is rather interesting: the Google researchers who published the paper call it Vlogger. In it, the authors (Enric Corona et al.) offer up various examples of how the AI takes a single input image of a human (in this case, I believe mostly AI-generated humans) and, given an audio file, produces both facial and bodily movements to match.
That's just one of a few potential use cases for the tech. Another is editing video, specifically a video subject's facial expressions. In one example, the researchers show various versions of the same clip: one has a presenter speaking to camera, another has the presenter's mouth closed in an eerie fashion, and another has their eyes closed. My favourite is the video of the presenter with their eyes artificially held open by the AI, unblinking. Huge serial killer vibes. Thanks, AI.
The most useful feature, in my opinion, is the ability to swap a video's audio track for a dubbed foreign-language version and have the AI lip-sync the person's facial movements to the new audio.
It works in two stages: "1) a stochastic human-to-3d-motion diffusion model, and 2) a novel diffusion based architecture that augments text-to-image models with both temporal and spatial controls. This approach enables the generation of high quality videos of variable length, that are easily controllable through high-level representations of human faces and bodies," the GitHub page says.
Admittedly, the tech isn't perfect. In the examples given, the mouth movements have the slightly unnatural quality common to AI-generated video. It's also pretty creepy at times, as noted by users responding to a thread about the technology by EyeingAI on X. But Vlogger doesn't need to fool everyone, or even fool anyone at all, to have some use. Then again, if it were a more polished technology, it'd be even more worrying to think about how it could be used to create deepfakes, spread misinformation, or steal identities. We'll get there one day, and I for one hope we have a better handle on how to deal with this stuff by then.

Jacob earned his first byline writing for his own tech blog, before graduating into breaking things professionally at PCGamesN. Now he's managing editor of the hardware team at PC Gamer, and you'll usually find him testing the latest components or building a gaming PC.

