A security researcher has discovered a huge cache of data on millions of Instagram accounts that was publicly accessible to anyone. The cache contained sensitive information that would be useful to cyberstalkers, among others.
The researcher, who goes by Anurag Sen on Twitter, found the database hosted on Amazon Web Services. It held more than 49 million records when it was discovered and was still growing before it was deleted.
The Instagram data included user bios, profile photos, follower counts, and locations. That information is already visible online. What is more surprising is that, according to TechCrunch, which broke the story, it also contained the email addresses and phone numbers used to set up the accounts.
Reporters identified the owner of the database as Chtrbox, a social media marketing company based in Mumbai. It pays social media influencers to publish sponsored content through their accounts. The database has since disappeared from Amazon.
Chtrbox took issue with the press coverage of the leaked records and sent Naked Security the following statement:
The reports about a leak of private data are not accurate. A certain database for limited influencers was inadvertently exposed for approximately 72 hours. This database did not contain sensitive personal data and only contained available information from the public domain or self-reported by influencers.
We also want to confirm that no personal data has been obtained through unethical means by Chtrbox. Our database is for internal research purposes only, we have never sold individual data or our database and we have never purchased hacked data that is the result of breaches of the social media platform. Our use of our database is limited to help our team connect with the right influencers to help influencers make money with their online presence and to help brands create great content.
How can anyone compile a huge database of Instagram information?
The company would not answer any further questions, so it's hard to know for sure. Usernames, profile images and follower counts are publicly available and can be collected by screen scraping. Screen scrapers use automated scripts to visit websites and copy the information they find there.
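To make the idea concrete, here is a minimal sketch of what a scraper does once it has fetched a page: it walks the HTML and copies out the fields it wants. Everything specific here is an assumption for illustration. The sample markup, the class names (username, bio, location) and the helper scrape_profile are hypothetical and do not reflect Instagram's real pages or whatever tooling was used to build the database. The sketch parses a hard-coded string rather than visiting any site.

```python
from html.parser import HTMLParser

# Hypothetical profile markup; a real site's structure will differ.
SAMPLE_PAGE = """
<html><body>
  <h1 class="username">example_user</h1>
  <p class="bio">Coffee, travel &amp; photos</p>
  <span class="location">Mumbai, India</span>
</body></html>
"""


class ProfileScraper(HTMLParser):
    """Collects the text of elements whose class names a wanted field."""

    WANTED = {"username", "bio", "location"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.WANTED:
            self._current = cls
            self.fields[cls] = ""

    def handle_data(self, data):
        # Only keep text that sits inside a wanted element.
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def scrape_profile(html):
    """Return the wanted fields found in one page of HTML."""
    parser = ProfileScraper()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.fields.items()}


profile = scrape_profile(SAMPLE_PAGE)
print(profile)
```

A real scraper would run the same extraction in a loop over many fetched pages, which is why publishers notice the load on their servers.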
Companies use scraped data for all kinds of purposes, such as price comparisons and sentiment analysis. Many publishers consider scraping harmful and try to block it, because scrapers take the publishers' own data and also drain their server resources.
We have seen people scrape Instagram before. Redditors once tried to archive every image from the site that they could.
But it can get you in trouble. Authorities in Nova Scotia, Canada, arrested a 19-year-old for scraping about 7,000 freedom-of-information releases from a public website there, and called him a hacker. They later dropped the charges.
What is generally not public is the phone number and email address used to create an account, which TechCrunch says were included in some of the records. Facebook made this information available via the Instagram API, even for accounts that had not made it public. It had to disable the feature in September 2017 after it found people downloading celebrities' contact details.