Microsoft Copilot Bug Exposes Confidential Emails To AI Tool

A coding error inside Microsoft 365 Copilot briefly allowed the AI tool to read and summarise emails that businesses had explicitly marked as confidential.

A Safeguard That Didn’t Hold

In January, Microsoft detected an issue inside the “Work” tab of Microsoft 365 Copilot Chat. The problem, tracked internally as CW1226324, meant Copilot could process emails stored in users’ Sent Items and Drafts folders, even when those messages carried sensitivity labels designed to block AI access.

Inbox folders appear to have remained protected. The weakness sat in a specific retrieval path affecting Drafts and Sent Items.

Microsoft confirmed the bug was first identified on 21 January 2026. A server-side fix began rolling out in early February and is still being monitored across enterprise tenants.

The company said in a statement:

“We identified and addressed an issue where Microsoft 365 Copilot Chat could return content from emails labelled confidential, authored by a user and stored within their Draft and Sent Items in Outlook desktop.”

It added:

“This did not provide anyone access to information they weren’t already authorised to see. While our access controls and data protection policies remained intact, this behaviour did not meet our intended Copilot experience, which is designed to exclude protected content from Copilot access.”

That distinction matters. Microsoft’s position is that no unauthorised user gained access to restricted data. The issue was about Copilot processing information it was supposed to ignore.

How Did This Happen?

Copilot relies on what is known as a retrieve-then-generate model. It first pulls relevant content from emails, documents or chats. It then feeds that material into a large language model to produce summaries or answers.

The enforcement point is the retrieval stage. If protected content is fetched at that stage, the AI will use it.

In this case, a code logic error meant sensitivity labels and data loss prevention policies were not correctly enforced for Drafts and Sent Items. Emails marked confidential were picked up and summarised inside Copilot’s Work chat.

That creates obvious concerns. Draft folders often contain unfinalised legal advice, internal assessments or sensitive negotiations. Sent Items frequently hold commercially sensitive exchanges.

Even if summaries stayed within the same user’s workspace, the principle of exclusion had failed.

Why The Timing Is Awkward

Microsoft has been aggressively positioning Microsoft 365 Copilot as a secure enterprise AI assistant. Businesses pay a premium licence fee on top of their Microsoft 365 subscriptions. The selling point is productivity without compromising governance.

This incident seems to undermine that message.

It also comes amid heightened scrutiny of AI tools in regulated environments. The European Parliament recently banned AI tools on some worker devices over cloud data concerns. Regulators are watching closely.

Industry analysts have long warned that the rapid rollout of enterprise AI features increases the likelihood of control gaps and configuration errors. As vendors compete to embed generative AI deeper into core productivity tools, governance frameworks are often forced to catch up. This incident reinforces a wider concern that AI functionality can move faster than internal compliance oversight.

Security researchers have previously highlighted vulnerabilities in retrieval augmented generation systems, including those used by Copilot. The lesson is consistent. If policy enforcement fails at retrieval, downstream safeguards cannot fully compensate.

What This Means For Microsoft And Its Rivals

Copilot sits at the centre of Microsoft’s enterprise AI strategy, so any weakness in its data controls lands hard. Businesses are being asked to trust an assistant that can read across emails, documents and internal chats. That trust is commercial currency.

In Microsoft's defence, the company moved quickly to contain the issue. The fix was applied server-side, so customers did not need to install patches, and the company says it is contacting affected tenants while monitoring the rollout. From a technical response standpoint, the reaction has been swift.

Microsoft has yet to publish tenant-level figures or detailed forensic logs showing exactly which confidential items were processed during the exposure window. For organisations with regulatory obligations, reassurance alone will not be enough. They will want clear evidence of what was accessed, when and under what controls.

Rivals will also be paying attention. Google Workspace with Gemini, Salesforce’s AI integrations and other embedded assistants rely on similar retrieval architectures. The risk exposed here is not unique to one vendor. It reflects a broader design challenge facing every platform embedding generative AI into live corporate data environments.

What Does This Mean For Your Business?

If your organisation is using Microsoft 365 Copilot, this is a governance story, not a crisis story.

Microsoft insists no unauthorised access took place and there is no evidence of data being exposed outside permitted user boundaries. That matters. Yet the episode highlights something more structural. AI controls can fail quietly inside systems businesses assume are ring-fenced.

Copilot is not a standalone chatbot. It operates across your email, documents and collaboration tools. It reads broadly. It summarises intelligently. It relies on retrieval rules working exactly as designed. When those rules misfire, even briefly, sensitive material can be processed in ways you did not intend.

That is why access decisions matter. Embedding AI into legal, HR, finance or executive workflows is not simply a productivity choice. Draft emails often contain unfiltered strategy, regulatory advice or negotiation positions. Those are precisely the communications organisations most want tightly controlled.

This is also a moment to test assumptions. Sensitivity labels and data loss prevention policies are only effective if they behave as expected under real conditions. Enabling new AI features should trigger validation, not blind trust.
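One way to make that validation concrete is a post-retrieval audit run against seeded test data. The sketch below is hypothetical: the record format, label names and the idea of auditing retrieval output this way are assumptions for illustration, not a real Copilot or Purview API.

```python
# Hypothetical audit check a tenant might run after enabling a new
# AI feature: seed the system with labelled test items, capture what
# retrieval returns, and fail loudly if any labelled item got through.
BLOCKED = {"Confidential", "Highly Confidential"}

def audit_retrieval(results):
    """Raise if any retrieved record carries a blocked sensitivity label."""
    leaked = [r for r in results if r.get("label") in BLOCKED]
    if leaked:
        raise AssertionError(
            f"{len(leaked)} labelled item(s) reached retrieval: "
            + ", ".join(r["id"] for r in leaked)
        )
    return True

# Simulated retrieval responses for the two cases.
ok = [{"id": "m1", "label": None}, {"id": "m2", "label": "General"}]
bad = ok + [{"id": "m3", "label": "Confidential"}]

audit_retrieval(ok)        # passes silently
try:
    audit_retrieval(bad)
except AssertionError as err:
    print(err)             # the leaked item is named in the error
```

A check like this catches exactly the failure described above: controls that look correct in configuration but misbehave on a specific code path only show up when you test the output, not the policy.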

Copilot can deliver genuine efficiency gains. Faster document drafting, quicker retrieval of buried information and less manual searching all translate into time saved. The value is real. Yet tools with that level of visibility into your data estate deserve the same scrutiny you would apply to any system handling commercially sensitive information.

Businesses that combine productivity ambition with disciplined oversight will benefit. Those that treat embedded AI as frictionless and risk-free may find the learning curve steeper than expected.