Microsoft’s AI team accidentally exposes 38TB of private data via Azure

Microsoft's AI research team accidentally exposed 38TB of sensitive data via Azure, a report says. Microsoft has acknowledged the issue but says no customer data was exposed.

Published By: Shweta Ganjoo | Published: Sep 19, 2023, 09:01 PM (IST)

Highlights

Microsoft's AI team reportedly leaked 38TB of sensitive data accidentally.
The exposure happened through an Azure Storage access token and dates back to 2020.
Microsoft says no customer data was exposed.

Microsoft data leak: Microsoft AI researchers accidentally leaked dozens of terabytes of data, including private keys and passwords, while sharing a bucket of open-source training data on GitHub. What’s worrisome is that the data had been exposed since July 2020 and the company fixed the issue only recently.

According to a report by security research firm Wiz, Microsoft’s AI research team, while publishing a bucket of open-source training data on GitHub, accidentally exposed 38 terabytes of private data, including secrets, private keys, passwords, over 30,000 internal Microsoft Teams messages and disk backups of two employees’ workstations.

What exactly happened?

Wiz researchers found a GitHub repository belonging to Microsoft’s AI research division. The repository had been created to provide open-source code and AI models for image recognition. Anyone who wanted to use the models was instructed to download them from an Azure Storage URL. That URL included something called a ‘Shared Access Signature (SAS) token’, which grants access to data in an Azure Storage account. For the unversed, the access level of a SAS token can be customised by whoever creates it, with permissions ranging from read-only to full control. The expiry time is also completely customisable, allowing the creator to issue tokens that effectively never expire.
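To make the two settings mentioned above more concrete, here is a minimal sketch in Python of how a shared link could be checked before it is published. It relies on the fact that a SAS URL carries its permissions in the `sp` query field and its expiry time in the `se` field. The helper name `audit_sas_url`, the seven-day limit, the example account name and the example URLs are all assumptions made for illustration; they are not taken from the Wiz report or from Microsoft's tooling.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

# Permission letters that go beyond read/list access in the `sp` field.
RISKY_PERMISSIONS = {"w": "write", "d": "delete", "a": "add", "c": "create"}


def audit_sas_url(url, max_lifetime_days=7, now=None):
    """Return a list of warnings for a SAS URL (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlparse(url).query)
    warnings = []

    # `sp` holds the signed permissions, e.g. "r" or "racwdl".
    permissions = params.get("sp", [""])[0]
    for letter, name in RISKY_PERMISSIONS.items():
        if letter in permissions:
            warnings.append(f"grants {name} permission")

    # `se` holds the signed expiry time in ISO 8601 form (UTC).
    expiry_raw = params.get("se", [""])[0]
    if not expiry_raw:
        warnings.append("no expiry field found")
    else:
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > timedelta(days=max_lifetime_days):
            warnings.append(
                f"expires {expiry.date()}, more than {max_lifetime_days} days away"
            )
    return warnings


# Made-up example: a link with broad permissions and a far-future expiry.
example_url = (
    "https://exampleaccount.blob.core.windows.net/models"
    "?sv=2021-08-06&sp=racwdl&se=2050-01-01T00:00:00Z&sig=REDACTED"
)
for warning in audit_sas_url(example_url):
    print(warning)
```

A read-only link that expires within a few days would come back with no warnings. The sketch only reads what the URL itself declares; it cannot tell how much of the storage account the token covers, which was the other half of the problem in this case.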

According to the Wiz report, the Azure Storage URL not only gave developers access to the open-source models, but also granted them permissions on the entire storage account. “In addition to the overly permissive access scope, the token was also misconfigured to allow “full control” permissions instead of read-only. Meaning, not only could an attacker view all the files in the storage account, but they could delete and overwrite existing files as well,” Wiz wrote in a blog post.

“Our scan shows that this account contained 38TB of additional data — including Microsoft employees’ personal computer backups. The backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees,” the company added.

The silver lining in the matter is that the Azure storage account wasn’t directly exposed to the public. Instead, it was a private storage account. Wiz researchers reported the issue to Microsoft in June 2023, and it was resolved within two days.

What is Microsoft saying and how can I safeguard myself?

Microsoft, in a post on its MSRC blog, confirmed the issue. It also said that no customer data was put at risk and that users don’t need to take any action on their part. “No customer data was exposed, and no other internal services were put at risk because of this issue. No customer action is required in response to this issue,” Microsoft wrote.