Each record contained data scraped from influencers' accounts. In some cases, the accounts' locations, as well as the owners' email addresses and phone numbers, were listed. Each record also contained an estimated worth of the account, based on followers, engagement and reach.
"We're looking into the issue to understand if the data described -- including email and phone numbers -- was from Instagram or from other sources," an Instagram spokesperson told Engadget initially. "We're also inquiring with Chtrbox to understand where this data came from and how it became publicly available." However, Instagram has since confirmed that while some phone numbers and email addresses were exposed, that data was not scraped from Instagram's database. Security researcher Anurag Sen found the database, and TechCrunch reportedly traced it back to Chtrbox, a Mumbai-based social media marketing firm.
When TechCrunch contacted the company, the database was removed, but Chtrbox did not respond to TechCrunch's request for comment. Instagram says that the information in Chtrbox's database came from multiple sources, but that any data gathered from Instagram was collected in violation of Instagram's policies. As such, Instagram has since revoked Chtrbox's access to the platform.
Rather than being scraped from Instagram's database, the contact data was gathered in one of three ways. Chtrbox users in India who signed up directly with Chtrbox would have shared their phone numbers and email addresses. Chtrbox also collected phone numbers and email addresses when they were shared publicly on someone's Instagram profile. Finally, some Chtrbox team members collected contact information through other unspecified online research, offline marketing and in-person meetups in India.
According to Facebook, scraping data of any kind is prohibited on Instagram, but it's still unclear how the data was obtained or how it may have been used. In the past, we have seen hackers try to sell celebrity data scraped from Instagram, and the platform has faced its own security issues -- such as storing passwords in plain text and a bug that exposed some users' passwords. As Facebook works to emphasize privacy, it will have to address Instagram's vulnerabilities as well.
Update, 5/23/19, 5:30PM ET: Since this story was originally published, Instagram confirmed to Engadget that any phone numbers that may have been exposed in the database did not come from Instagram's API. Furthermore, Instagram confirmed that about 350,000 accounts were accessed, not the originally reported 49 million. This story has been updated with those details and additional information provided by Instagram.