OpenAI sent ripples across the AI industry when it announced Sora, its video AI, earlier this year. However, the company still hasn’t said whether YouTube videos were used to train the model.
During an interview at the Bloomberg Technology Summit, OpenAI’s COO Brad Lightcap talked about the potential business applications of artificial intelligence, and Sora is clearly among them. However, when asked whether YouTube videos were used to train Sora, Lightcap did not give a clear answer.
Did OpenAI Use YouTube Videos To Train Sora?
When asked directly to “clear up once and for all” if the company uses YouTube to train its video AI Sora, Lightcap replied:
"Yeah, I mean look, the conversation around data is really important. We need to know where that data comes from. We just put out a post this week actually about this exact topic which is basically that there needs to be a content ID system for AI that lets creators understand as they create stuff where it’s going, who’s training on it, being able to opt in and out of training, being able to opt in and out of use. Also like, on the other side of that, being able to actively allow your content to be put into a model or to be accessed by a model because there may be this other economic opportunity on the other end of this.
And that’s something we’re exploring too, how do you go create an entirely different social contract with the web, with creators, with publishers, where, as these models go off in the world and do useful things, create value, to the extent they’re able to like, reference and incorporate content from the web, there should be some sort of way that people can kind of get a benefit from that.
So, yeah, we’re looking at this problem, it’s really hard. We don’t have all the answers yet.”
It seems like an impressive non-answer, as Lightcap did not mention YouTube even once in the reply.
OpenAI Confirms Sora’s Availability Timeline
Recently, the company published a post on “understanding the source of what we see and hear online.” The post also made no mention of YouTube. Instead, it discussed how OpenAI is working towards a standard for content authenticity and looking for new ways to identify content produced with OpenAI’s tools.
An earlier report said that OpenAI used “over a million” hours of YouTube content to train GPT-4, and that Google did the same for its Gemini. OpenAI has confirmed that Sora will be available later this year.