Google fights deepfakes by releasing 3,000 deepfakes

Researchers can use the database to make automated detection tools better at spotting doctored videos.

Google has released a large dataset of deepfake videos in an effort to support researchers working on detection tools. Deepfake videos look and sound so authentic that they could be used for highly convincing disinformation campaigns in the upcoming elections. They could also cause countless problems for individuals, such as celebrities whose faces can be used to create fake pornographic videos that look authentic.

The tech giant filmed actors in a variety of scenes and then used publicly available deepfake generation methods to create a database of around 3,000 deepfakes. Researchers can now use that dataset to train automated detection tools and make them as effective and accurate as possible when it comes to spotting AI-synthesized images. Google promises to add more videos to the database in hopes that it can keep up with rapidly evolving deepfake generation techniques. The company said in its announcement:

"Since the field is moving quickly, we'll add to this dataset as deepfake technology evolves over time, and we'll continue to work with partners in this space. We firmly believe in supporting a thriving research community around mitigating potential harms from misuses of synthetic media, and today's release of our deepfake dataset in the FaceForensics benchmark is an important step in that direction."

Google isn't the only tech company that's contributing to the fight against deepfakes. Facebook and Microsoft are also involved in an industry-wide initiative to create a set of open source tools that companies, governments and media organizations can use to detect doctored videos. Facebook plans to release a similar database by the end of the year, and we doubt Google would mind -- the more samples there are, after all, the better detection tools can become.