Microsoft’s legal department allegedly silenced an engineer who raised concerns about DALL-E 3

Microsoft and OpenAI told Engadget the technique in question didn’t bypass their safety filters.

Contributing Reporter

Updated Tue, Jan 30, 2024, 8:41 PM·5 min read

A Microsoft manager claims OpenAI’s DALL-E 3 has security vulnerabilities that could allow users to generate violent or explicit images (similar to those that recently targeted Taylor Swift). GeekWire reported Tuesday the company’s legal team blocked Microsoft engineering leader Shane Jones’ attempts to alert the public about the exploit. The self-described whistleblower is now taking his message to Capitol Hill.

“I reached the conclusion that DALL·E 3 posed a public safety risk and should be removed from public use until OpenAI could address the risks associated with this model,” Jones wrote to US Senators Patty Murray (D-WA) and Maria Cantwell (D-WA), Rep. Adam Smith (D-WA 9th District), and Washington state Attorney General Bob Ferguson (D). GeekWire published Jones’ full letter.

Jones claims he discovered an exploit allowing him to bypass DALL-E 3’s security guardrails in early December. He says he reported the issue to his superiors at Microsoft, who instructed him to “personally report the issue directly to OpenAI.” After doing so, he claims he learned that the flaw could allow the generation of “violent and disturbing harmful images.”

Jones then attempted to take his cause public in a LinkedIn post. “On the morning of December 14, 2023 I publicly published a letter on LinkedIn to OpenAI’s non-profit board of directors urging them to suspend the availability of DALL·E 3),” Jones wrote. “Because Microsoft is a board observer at OpenAI and I had previously shared my concerns with my leadership team, I promptly made Microsoft aware of the letter I had posted.”

AI-generated image of a teacup with a violent wave inside of it. A storm brews from behind the window sill behind it. — *A sample image (a storm in a teacup) generated by DALL-E 3* (OpenAI)

Microsoft’s response was allegedly to demand he remove his post. “Shortly after disclosing the letter to my leadership team, my manager contacted me and told me that Microsoft’s legal department had demanded that I delete the post,” he wrote in his letter. “He told me that Microsoft’s legal department would follow up with their specific justification for the takedown request via email very soon, and that I needed to delete it immediately without waiting for the email from legal.”

Jones complied, but he says the more fine-grained response from Microsoft’s legal team never arrived. “I never received an explanation or justification from them,” he wrote. He says further attempts to learn more from the company’s legal department were ignored. “Microsoft’s legal department has still not responded or communicated directly with me,” he wrote.

An OpenAI spokesperson wrote to Engadget in an email, “We immediately investigated the Microsoft employee’s report when we received it on December 1 and confirmed that the technique he shared does not bypass our safety systems. Safety is our priority and we take a multi-pronged approach. In the underlying DALL-E 3 model, we’ve worked to filter the most explicit content from its training data including graphic sexual and violent content, and have developed robust image classifiers that steer the model away from generating harmful images.

“We’ve also implemented additional safeguards for our products, ChatGPT and the DALL-E API – including declining requests that ask for a public figure by name,” the OpenAI spokesperson continued. “We identify and refuse messages that violate our policies and filter all generated images before they are shown to the user. We use external expert red teaming to test for misuse and strengthen our safeguards.”

Meanwhile, a Microsoft spokesperson wrote to Engadget, “We are committed to addressing any and all concerns employees have in accordance with our company policies, and appreciate the employee’s effort in studying and testing our latest technology to further enhance its safety. When it comes to safety bypasses or concerns that could have a potential impact on our services or our partners, we have established robust internal reporting channels to properly investigate and remediate any issues, which we recommended that the employee utilize so we could appropriately validate and test his concerns before escalating it publicly.”

“Since his report concerned an OpenAI product, we encouraged him to report through OpenAI’s standard reporting channels and one of our senior product leaders shared the employee’s feedback with OpenAI, who investigated the matter right away,” wrote the Microsoft spokesperson. “At the same time, our teams investigated and confirmed that the techniques reported did not bypass our safety filters in any of our AI-powered image generation solutions. Employee feedback is a critical part of our culture, and we are connecting with this colleague to address any remaining concerns he may have.”

Microsoft added that its Office of Responsible AI has established an internal reporting tool for employees to report and escalate concerns about AI models.

The whistleblower says the pornographic deepfakes of Taylor Swift that circulated on X last week are one illustration of what similar vulnerabilities could produce if left unchecked. 404 Media reported Monday that Microsoft Designer, which uses DALL-E 3 as a backend, was part of the deepfakers’ toolset that made the video. The publication claims Microsoft, after being notified, patched that particular loophole.

“Microsoft was aware of these vulnerabilities and the potential for abuse,” Jones concluded. It isn’t clear if the exploits used to make the Swift deepfake were directly related to those Jones reported in December.

Jones urges his representatives in Washington, DC, to take action. He suggests the US government create a system for reporting and tracking specific AI vulnerabilities — while protecting employees like him who speak out. “We need to hold companies accountable for the safety of their products and their responsibility to disclose known risks to the public,” he wrote. “Concerned employees, like myself, should not be intimidated into staying silent.”

Update, January 30, 2024, 8:41 PM ET: This story has been updated to add statements to Engadget from OpenAI and Microsoft.

Engadget
ISPs are fighting to raise the price of low-income broadband
Internet service providers are objected to the lower rates they need to offer lower income customers if they want to obtain government funds from a new Internet access program.
Engadget
Amazon is giving The Boys the prequel treatment
The cast and crew of Amazon's The Boys announced a bunch of new spinoffs for the supe action series.
Engadget
You can date everything in Date Everything!
Date Everything! is an upcoming dating sim game that lets you date evert
Engadget
The Bioshock movie is still happening but with a reduced budget
The Bioshock movie is still happening, but with steep budget cuts. It’s being reconfigured to become a ‘more personal’ film.
Engadget
Warner Bros. Discovery sues the NBA in a last-ditch effort to block Amazon’s new streaming package
Warner Bros. Discovery followed through on its threat to “take appropriate action” against the NBA for rejecting its broadcasting rights offer. On Friday, the media company sued the league after the NBA turned down its bid to match Amazon’s streaming package.
Engadget
Apple’s M3 MacBook Air with 16GB of RAM is $200 off right now
Apple’s M3 MacBook Air combines Apple’s lightest and thinnest laptop design with the cutting-edge horsepower of the latest Apple silicon chip. You can get the 2024 model on sale for $200 off right now.
Engadget
Here's how to stop Grok's AI models using your tweets for training
X automatically opted users into letting Grok's AI models train on their tweets and interactions with the chatbot. Here's how to opt out.
Engadget
The 10th-generation iPad is back down to $300, plus the rest of this week's best tech deals
The week after Amazon's Prime Day can be a bit sleepy for deals, but we still found a few decent discounts on gear we've tested and recommend.
Engadget
The 65-inch LG C3 OLED TV is nearly half off for today only
The 65-inch LG C3 OLED TV is nearly half off for today only. That brings the set down to a record low of $1,300.
Engadget
NASA's Perseverance rover found a rock on Mars that could indicate ancient life
A Martian rock sample collected by Perseverance contains "chemical signatures and structures" that could've been formed by ancient microbial life from billions of years ago.
Engadget
Apple agrees to stick by Biden administration's voluntary AI safeguards
Apple has joined more than a dozen other tech companies in signing up for the Biden administration's voluntary AI code of practice.
Engadget
North Korean who used ransomware to attack US healthcare providers has been indicted
A grand jury in Kansas City has indicted Rim Jong Hyok, a North Korean intelligence operative who allegedly used ransomware to attack health providers' systems in the US.
Engadget
Samsung Galaxy Ring review: A bit basic, a bit pricey
The Galaxy Ring is comfortable and seemingly basic, but actually delivers detailed insight on your sleep, walks and runs.
Engadget
Apple's 14-inch MacBook Pro laptop with an M3 Pro chip is $300 off at Amazon
Apple's well-specked 14-inch MacBook Pro with an M3 Pro chip, 18GB of memory and 512GB of storage is on sale for the lowest price we've seen yet at Amazon.
Engadget
Gran Turismo 7's more realistic physics update is launching cars into orbit
Gran Turismo 7's latest update is causing some bizarre problems, making cars bounce violently or launch completely into the air.
Engadget
The Morning After: OpenAI reveals its AI-powered search engine, SearchGPT
The biggest news stories this morning: AI video startup Runway reportedly trained on ‘thousands’ of YouTube videos without permission, The best cameras for 2024, WhatsApp hits 100 million monthly active US users.
Engadget
The best fitness trackers for 2024
Here's a list of the best fitness trackers you can buy, as chosen by Engadget editors.
Engadget
The best cameras for 2024
Here's a list of the best cameras you can buy, as chosen by Engadget editors.
Engadget
X's Grok chatbot is misleading voters about the presidential election
Grok's AI chatbot claims that President Biden's name must stay on the ballot in nine states, a claim that is categorically false.
Engadget
Comic-Con leak sparks rumors of two remastered Soul Reaver games
A photo from Comic-Con has leaked possible remasters of two Soul Reaver games from Crystal Dynamics.

Microsoft’s legal department allegedly silenced an engineer who raised concerns about DALL-E 3

Microsoft and OpenAI told Engadget the technique in question didn’t bypass their safety filters.

Latest Stories

ISPs are fighting to raise the price of low-income broadband

Amazon is giving The Boys the prequel treatment

You can date everything in Date Everything!

The Bioshock movie is still happening but with a reduced budget

Warner Bros. Discovery sues the NBA in a last-ditch effort to block Amazon’s new streaming package

Apple’s M3 MacBook Air with 16GB of RAM is $200 off right now

Here's how to stop Grok's AI models using your tweets for training

The 10th-generation iPad is back down to $300, plus the rest of this week's best tech deals

The 65-inch LG C3 OLED TV is nearly half off for today only

NASA's Perseverance rover found a rock on Mars that could indicate ancient life

Apple agrees to stick by Biden administration's voluntary AI safeguards

North Korean who used ransomware to attack US healthcare providers has been indicted

Samsung Galaxy Ring review: A bit basic, a bit pricey

Apple's 14-inch MacBook Pro laptop with an M3 Pro chip is $300 off at Amazon

Gran Turismo 7's more realistic physics update is launching cars into orbit

The Morning After: OpenAI reveals its AI-powered search engine, SearchGPT

The best fitness trackers for 2024

The best cameras for 2024

X's Grok chatbot is misleading voters about the presidential election

Comic-Con leak sparks rumors of two remastered Soul Reaver games

About

Sections

Contribute

Buying Guides