The database was originally published in 2016, described by Microsoft as the largest publicly available facial recognition data set in the world, and used to train facial recognition systems by global tech firms and military researchers. The people whose photos appear in the set were not asked for consent, but as the individuals were considered celebrities (hence the set's name), the images were pulled from the internet under a Creative Commons license.
However, MS Celeb -- which was uncovered by Berlin-based researcher Adam Harvey -- also contained images of what FT.com calls "arguably private individuals" such as security journalists and authors. Speaking to FT.com, Harvey -- who runs a project called Megapixels, which reveals details on such data sets -- also said that even though MS Celeb has been deleted, its contents are still being shared around the web. "You can't make a data set disappear. Once you post it, and people download it, it exists on hard drives all over the world," he said.
The deletion comes not long after FT.com ran an in-depth investigation into facial recognition technology and Microsoft's role in it. However, Microsoft explained the data set's deletion to FT.com as a simple matter of protocol. "The site was intended for academic purposes," it said. "It was run by an employee that is no longer with Microsoft." This might explain why the company hasn't been particularly vocal about the move -- internal procedure can't really be considered the same as a bold gesture of public goodwill. Nonetheless, it demonstrates that Microsoft is as committed to legislative adherence as it wants everyone else to be.