InstantID generates reproductions from a single face image

January 31, 2024

AI tools can create images of personalized digital identities, but getting good results usually involves fine-tuning a LoRA. InstantID is a zero-shot plugin that lets generative AI models make consistent images from a single reference face image.

To get a generative model to create consistent, coherent images of a specific person, you generally need to use a LoRA.

LoRA, short for Low-Rank Adaptation, is a technique used to adapt image generation models without fully retraining them. If you wanted to make your model really good at making images of Taylor Swift, you’d create a LoRA fine-tuned on a bunch of images of her.
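To make the "without fully retraining" part concrete, here is a minimal plain-Python sketch of the low-rank idea behind LoRA: the original weight matrix W stays frozen, and only two small matrices B and A are trained, whose product is added to W. The function names, matrix values, and layer sizes below are hypothetical and chosen purely for illustration. They are not taken from InstantID or from any particular LoRA library.

```python
# Illustrative only: a plain-Python sketch of the low-rank update behind LoRA.
# All names, values, and sizes are hypothetical, not from InstantID or any library.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix."""
    full = d_out * d_in            # retraining the whole matrix
    lora = rank * (d_out + d_in)   # training only B (d_out x rank) and A (rank x d_in)
    return full, lora

def apply_lora(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) as a new matrix, leaving the frozen W untouched."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# A frozen 3x3 base weight (identity) and a rank-1 update.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0], [0.0], [2.0]]   # 3 x 1
A = [[0.5, 0.0, -1.0]]      # 1 x 3
W_adapted = apply_lora(W, B, A)

# Compare trainable parameters for a hypothetical 768 x 768 layer at rank 8.
full, lora = lora_param_counts(d_out=768, d_in=768, rank=8)
print(full, lora)
```

Even with so few trainable numbers, a LoRA still has to be trained on many images of the person, which is where the time and compute go.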

Creating the LoRA takes time, a lot of reference images, and plenty of processing resources. InstantID changes all of that and could spell the end of LoRAs for a lot of applications.

The InstantX Team created InstantID, a zero-shot method that requires no per-person training or fine-tuning. With a single face image as a reference, InstantID enables a text-to-image Stable Diffusion model like SD1.5 or SDXL to create more images of that person.

It uses an IdentityNet component that focuses strongly on the facial features in the reference image rather than on anything else in the picture.

One of the big benefits of InstantID is consistent character generation. Let’s say you wanted to generate images of a character in a game or graphic novel you were making. It’s extremely difficult to get an AI image generator to maintain consistency in the character’s facial features.

InstantID enables an AI image generator to keep its stylistic and other generative capabilities while preserving high-fidelity facial features.

Examples of images generated from a single reference image. Source: arXiv

InstantID introduces huge risks too. LoRAs are a big feature on controversial sites like Civitai, where users use them to create AI-generated porn. The site is littered with them, but it takes work and expertise to make a decent LoRA.

InstantID is likely to open the AI fake floodgates because you no longer need a LoRA or access to loads of cloud computing power to create a realistic image of a specific person. One photo is all it takes.

In a case of unfortunate irony, the paper used Taylor Swift in a number of its example images. The flurry of fake NSFW Taylor Swift images that subsequently did the rounds this week is likely a sign of things to come.

The InstantX Team noted that InstantID enables “the potential creation of offensive or culturally inappropriate imagery.”

Eugene van der Watt

Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.


