OpenAI says that it has a new tool in the works that will allow content creators to control how their content is being used to train generative AI. The tool, dubbed Media Manager, will enable creators and content owners to identify their works to OpenAI and specify how they want their content to be used for AI training and research.
OpenAI says it aims to have the tool in place by next year, and that it is working with “creators, content owners, and regulators” toward a standard, possibly through the industry steering committee it recently joined.
“This will require cutting-edge machine learning research to build a first-ever tool of its kind to help us identify copyrighted text, images, audio, and video across multiple sources and reflect creator preferences,” OpenAI wrote in a blog post. “Over time, we plan to introduce additional choices and features.”
OpenAI Has Been Criticised For Its Approach To Developing AI
Media Manager, whatever form the final product takes, appears to be the company’s response to growing criticism of its approach to developing AI, which depends heavily on scraping publicly available data from the internet. Most recently, eight prominent US newspapers, including the Chicago Tribune, sued OpenAI for copyright infringement over its use of their content in generative AI. The lawsuit accuses the company of feeding their articles to its AI models without compensation or consent.
Generative AI models, including OpenAI’s, can analyze and generate text, images, video, and more. They are trained on large numbers of examples, typically taken from publicly available data sets and websites.
OpenAI and other generative AI vendors argue that fair use, the legal doctrine that permits using copyrighted works to make a secondary creation as long as it is transformative, protects their practice of collecting public data and using it for model training.
OpenAI Says It’s Impossible To Create AI Models Without IP Material
OpenAI recently argued that it is not possible to develop a useful AI model without copyrighted material. But in an effort to answer critics and head off future lawsuits, the company has taken steps to meet content creators halfway.
In 2024, OpenAI enabled artists to “opt out” of and remove their work from the data sets that the company uses to train its image-generating AI models. The company also lets website owners indicate through the robots.txt standard, which gives web-crawling bots instructions about a website, whether content on their site can be scraped to train AI models.
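For site owners wondering what such a robots.txt opt-out might look like in practice, here is a minimal sketch using Python's standard library. It assumes the crawler identifies itself as GPTBot, the user-agent token OpenAI has documented for its crawler; the article itself does not name it. The robots.txt content, the example.com URL, and the `may_crawl` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: block one AI crawler
# (assumed here to announce itself as "GPTBot") and allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/story.html"
    for agent in ("GPTBot", "SomeOtherBot"):
        verdict = "allowed" if may_crawl(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent}: {verdict}")
```

Note that robots.txt is a voluntary convention: it only has an effect if the crawler chooses to honor it, and it does nothing about content that was collected before the rule was added.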
The company continues to sign licensing deals with big content owners, including news organizations, stock media libraries, and other websites.