Nvidia has come a long way since the horror of the infamous AI-generated pictures of cats a few years back. For the European Conference on Computer Vision (ECCV) this year, Nvidia has pulled up with significant advancements for its open-source image translator: COCO-FUNIT, or—if you're after a tongue twister—Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder.
'Our architecture achieves photo-realistic translation,' the accompanying paper claims.
The still-developing tech allows you to merge images to create new and fantastic, chimeric cross-breeds of animals (and apparently motorbikes). Its previous iteration, as demonstrated in the GANimal application, was apparently infused with some nightmarish existential torment tech... fair warning, creepy example coming up.
The general idea, if you weren't aware, is that you feed it an input (content) image so the AI can apply different skins (styles) to it, while still preserving the pose (structure) of the original image. So you could apply a polar bear skin to your pug or make your canary resemble the mighty eagle. It seems to work best for blending animal friends, but it’s much more entertaining to feed it human faces and have the AI turn your colleagues into furry abominations.
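That content/style split can be sketched in a few lines of toy code. To be clear, this is not Nvidia's actual model (which is a trained GAN); it's just an illustration of the idea of keeping the content image's structure while borrowing the style image's look, using brightness as a stand-in for "pose" and colour statistics as a stand-in for "skin":

```python
import numpy as np

def translate(content, style):
    """Toy sketch of the content/style split: keep the content image's
    spatial structure (its luminance) and borrow the style image's
    colour statistics. A real FUNIT-style model learns this mapping
    with a GAN; this is only an illustration of the input/output idea."""
    # Structure: per-pixel brightness of the content image.
    structure = content.mean(axis=-1, keepdims=True)      # (H, W, 1)
    # Style: global mean colour and contrast of the style image.
    style_mean = style.mean(axis=(0, 1))                  # (3,)
    style_std = style.std(axis=(0, 1)) + 1e-8             # (3,)
    # Re-colour the content's structure with the style's statistics.
    norm = (structure - structure.mean()) / (structure.std() + 1e-8)
    return np.clip(norm * style_std + style_mean, 0.0, 1.0)

content = np.random.rand(8, 8, 3)   # hypothetical "pug" photo
style = np.random.rand(8, 8, 3)     # hypothetical "polar bear" photo
out = translate(content, style)
print(out.shape)  # same height and width as the content image
```

The output keeps the content image's layout while taking on the style image's overall palette, which is the contract the real architecture is trained to honour far more convincingly.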
The original baseline FUNIT architecture (the tech that generated the above monstrosity) had been running into a lot of 'content loss' issues, as you can probably tell. Animals merging into the background and inconsistencies in the scaling and alignment of body parts all combine to offend your senses and drag you into the abyss of the uncanny valley.
The more recent iteration of the tech, however, addresses these issues with its 'novel network architecture', the content-conditioned style encoder. It 'takes both content and style image as input', meaning more information for the AI to work with, resulting in fewer anomalies and more lifelike results.
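The architectural change boils down to a signature change on the style encoder, which can be sketched like so. The weights here are random stand-ins, not the paper's trained networks; the point is only the difference in what each encoder gets to see:

```python
import numpy as np

rng = np.random.default_rng(0)

def style_encoder(style_feat):
    """Baseline FUNIT-style encoder (sketch): the style code is
    computed from the style image alone, so it can carry details
    that clash with the content image's pose."""
    w = rng.standard_normal((style_feat.size, 8))
    return style_feat.ravel() @ w

def coco_style_encoder(content_feat, style_feat):
    """Content-conditioned encoder (sketch): the style code is a
    function of BOTH images, giving the network a chance to suppress
    style information that doesn't fit the content's structure."""
    joint = np.concatenate([content_feat.ravel(), style_feat.ravel()])
    w = rng.standard_normal((joint.size, 8))
    return joint @ w

content_feat = rng.standard_normal((4, 4))  # stand-in content features
style_feat = rng.standard_normal((4, 4))    # stand-in style features
code_old = style_encoder(style_feat)
code_new = coco_style_encoder(content_feat, style_feat)
print(code_old.shape, code_new.shape)  # both produce a style code
```

Conditioning the style code on the content is, per the paper, what cuts down on the background-merging and misaligned-limb artefacts of the baseline.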
Though still teetering on the uncanny ridge, the newer tech seems a cut above the amorphous blobs and faceless nightmares of yesteryear. If you want to learn more about the latest advancements, grab yourself a strong coffee and take a look at the COCO-FUNIT paper (pdf warning), which outlines more of the theory and methods behind the architecture. The open-source code is coming soon, so look out for more strange iterations across the web.