OpenAI is fighting lawsuits from artists, writers, and publishers who allege it inappropriately used their work to train the algorithms behind ChatGPT and other AI systems. On Tuesday the company announced a tool apparently designed to appease creatives and rights holders, by granting them some control over how OpenAI uses their work.
The company says it will launch a tool in 2025 called Media Manager that allows content creators to opt their work out of the company’s AI development. In a blog post, OpenAI described the tool as a way to allow “creators and content owners to tell us what they own” and specify “how they want their works to be included or excluded from machine learning research and training.”
OpenAI said that it is working with “creators, content owners, and regulators” to develop the tool and intends it to “set an industry standard.” The company did not name any of its partners on the project or make clear exactly how the tool will operate.
Open questions about the system include whether content owners will be able to make a single request to cover all their works, and whether OpenAI will allow requests related to models that have already been trained and launched. Research is underway on machine “unlearning,” a process that adjusts an AI system to retrospectively remove the contribution of one part of its training data, but the technique has not yet been perfected.
Ed Newton-Rex, CEO of the startup Fairly Trained, which certifies AI companies that use ethically sourced training data, says OpenAI’s apparent shift on training data is welcome but that the implementation will be critical. “I’m glad to see OpenAI engaging with this issue. Whether or not it will actually help artists will come down to the detail, which hasn’t been provided yet,” he says. The first major question on his mind: Is this simply an opt-out tool that leaves OpenAI continuing to use data without permission unless a content owner requests its exclusion? Or will it represent a larger shift in how OpenAI does business? OpenAI did not immediately return a request for comment.
Newton-Rex is also curious to know if OpenAI will allow other companies to use its Media Manager so that artists can signal their preferences to multiple AI developers at once. “If not, it will just add further complexity to an already complex opt-out environment,” says Newton-Rex, who was formerly an executive at Stability AI, developer of the Stable Diffusion image generator.
OpenAI is not the first to look for ways for artists and other content creators to signal their preferences about use of their work and personal data for AI projects. Other tech companies, from Adobe to Tumblr, also offer opt-out tools regarding data collection and machine learning. The startup Spawning launched a registry called Do Not Train nearly two years ago and creators have already added their preferences for 1.5 billion works.
Jordan Meyer, CEO of Spawning, says the company is not working with OpenAI on its Media Manager project, but is open to doing so. “If OpenAI is able to make registering or respecting universal opt-outs easier, we’ll happily incorporate their work into our suite,” he says.
Like Newton-Rex, though, Meyer worries that the creation of many different opt-out systems will be too onerous for artists and creators. “A proliferation of disparate opt-out tools built by AI giants is exactly what we need to avoid,” he says. “Opting out should be simple and universal, which we believe requires an open system built by a third party.”
Past attempts by major AI developers to give people more control over how their data is used haven’t always run smoothly. Last year, Meta launched a way for people who didn’t want their personal data used to train Meta’s AI to request that it be deleted. Many artists interpreted this as a way to ask for their work to be opted out of Meta’s AI projects, and were frustrated when the company declined to process their requests. (Meta, for its part, told WIRED that it was not an opt-out feature.)
Meanwhile, a rising movement that objects to AI’s approach to training data is advocating for a far more radical change: Switching to a regime where AI companies only train algorithms on data with explicit permission from creatives and rights holders. “When it comes to companies that are looking to turn massive profits and to disrupt industries, I don’t think opt-out works at all,” says concept artist and illustrator Reid Southen, who often writes about AI and art. “Opt-in is the only feasible way forward.”
Southen and Newton-Rex both say that opt-out tools can put an undue burden on creatives, especially if these tools require them to submit requests for each individual work they want to exclude from training. “Imagine if you’re a photographer with thousands and thousands of images,” Southen says. “There’s no way.”
Source: Wired