Microsoft AI CEO: Everything on the Open Web Is Fair Use for AI Training

To write, run marketing campaigns, and perform other tasks, AI needs training material. ChatGPT needed about 300 billion words to get going and continues to learn from users’ interactions with the system.

But humans don’t get credit or compensation for creating the content that AI gobbles up. Authors, artists, and news organizations have filed countless copyright lawsuits against AI giants like OpenAI and Microsoft because they’ve discovered that AI bots can talk about their copyrighted works “too accurately” — indicating that those works are in the AI’s training data.

That is why Mustafa Suleyman, the CEO of Microsoft AI, was asked at the Aspen Ideas Festival at the end of June whether AI companies had essentially stolen the world’s intellectual property.

Suleyman’s response? Almost all content on the web, with one possible exception, is fair game for training AI.

“I think when it comes to content that’s already out there on the open web, the social contract around that content since the 1990s is that it’s fair use,” Suleyman said.

Suleyman said “anyone” can copy or reproduce content on the open web.

“That has been freeware, if you like,” he said. “That’s been the understanding.”

The possible exception: some news sites and publishers have explicitly asked that their content not be scraped or crawled.

“That’s a grey area, and I think it’s going to work its way through the courts,” Suleyman said.

Mustafa Suleyman. Photographer: Stefan Wermuth/Bloomberg via Getty Images

Suleyman leads Microsoft’s AI division at a time when the company is investing billions in the technology. His stance on what is and isn’t fair use suggests how AI companies may defend themselves against intellectual property claims in court.

For example, OpenAI allegedly used over a million hours of YouTube videos to train ChatGPT. When asked whether OpenAI’s Sora video generator was created using YouTube or social media videos, the company’s chief technology officer, Mira Murati, said, “We used publicly available and licensed data,” and didn’t provide further details.

AI also appears to be absorbing work generated by other AIs, which leads to lower-quality output. Experts estimate that 90% of online content will be generated by artificial intelligence within the next two years.
