2023-10-28

Artists who want to share their artwork often face a tough choice: keep it offline or post it on social media and risk having it used to train data-hungry AI image generators.

But a new tool may soon be able to help artists stop AI companies from using their artwork without permission.

It’s called “Nightshade” and was developed by a team of researchers at the University of Chicago. According to MIT Technology Review, it “poisons” an artist’s creation by subtly altering the pixels of the image so that AI models can no longer accurately determine what the image represents.

While the human eye is not able to detect these small changes, they are intended to cause the machine-learning model to mislabel the image as something other than what it is. Since these AI models rely on precise data, this “poisoning” process will essentially render the image useless for training purposes.

If too many of these “poisoned” images are scraped from the web and used to train an AI image generator, the AI model itself may no longer be able to produce accurate images.

For example, researchers fed 50 “poisoned” images of dogs to Stable Diffusion, an AI image generator, and to an AI model they created themselves, then asked them to create new images of dogs. According to MIT Technology Review, the generated images showed animals with too many limbs or cartoon-like faces that only somewhat resembled a dog.

After researchers fed Stable Diffusion 300 “poisoned” images of dogs, it eventually started producing images of cats. Stable Diffusion did not respond to CNBC Make It’s request for comment.

On the surface, AI art generators appear to create images out of thin air based on a prompt given by someone.

But helping these generative AI models create realistic-looking images of pink giraffes or underwater castles isn’t magic — it’s training data, and a lot of it.

AI companies train their models on huge sets of data, helping the models determine which images are associated with which words. In order for an AI model to correctly generate an image of a pink giraffe, it would need to be trained to correctly identify images of giraffes and the color pink.

Much of the data used to train many generative AI systems is scraped from the web. Although it is legal in the US for companies to collect data from publicly accessible websites and use it for a variety of purposes, it becomes complicated when it comes to works of art, because artists usually own the copyrights to their pieces and sometimes don’t want their art used to train AI models.

While artists can sign up to “opt-out lists” or use “do-not-scrape instructions,” it is often difficult to force companies to comply with them, Glaze at UChicago, the team of researchers who created Nightshade, said in an Oct. 24 thread on X, formerly known as Twitter.

“None of these mechanisms are enforceable or even verifiable. Companies have shown that they can ignore opt-outs without thinking,” the team said in the thread. “But even if they agree but act otherwise, no one can verify or prove it (at least not today). These tools are toothless.”

Ultimately, the researchers hope Nightshade will help artists protect their art.

The researchers have not yet released their Nightshade tool to the public, but they have submitted their work for peer review and hope to make it available soon, Glaze at UChicago said on X.

