Reddit is licensing its content to Google to help train its AI models

The deal is reportedly valued at $60 million a year.

Google has struck a deal with Reddit that will allow the search engine maker to train its AI models on Reddit’s vast catalog of user-generated content, the two companies announced. Under the arrangement, Google will get access to Reddit’s Data API, which will help the company “better understand” content from the site.

The deal also provides Google with a valuable source of content it can use to train its AI models. “Google will now have efficient and structured access to fresher information, as well as enhanced signals that will help us better understand Reddit content and display, train on, and otherwise use it in the most accurate and relevant ways,” the company said in a statement.

Access to Reddit’s data became a hot-button issue last year when the company announced it would start charging developers to use its API. The changes resulted in the shuttering of many third-party Reddit clients and a sitewide protest in which thousands of subreddits temporarily “went dark.” Reddit justified the changes, in part, by saying that large AI companies were scraping its data without paying. In a statement, Reddit noted that the new arrangement with Google “does not change Reddit's Data API Terms or Developer Terms” and that “API access remains free for non-commercial usage.”

The deal comes as Reddit is expected to go public in the coming weeks. Neither Google nor Reddit disclosed the terms of their arrangement, but Bloomberg reported last week that Reddit had struck a licensing deal with a “large AI company” valued at “about $60 million” a year. That amount was also confirmed by Reuters, which was first to report Google’s involvement.