A new cross-national audit found that nine out of ten popular AI-powered browser assistants silently funnel private data—including social security numbers, medical records, and bank details—from authenticated sessions to remote servers, often without clear user consent. The study, conducted by researchers from University College London and Italian institutions, tested ten widely used generative-AI browser extensions and integrations across routine browsing tasks, from public web searches to password-protected portals for health, banking, and taxes. Their findings upend the assumption that logging into a site or using private browsing modes stops these tools from harvesting sensitive page content.

How the audit worked: simulating real-world use

The researchers built a controlled testing environment that mimicked typical user behavior. They installed each assistant on a Chromium-based browser and performed a series of tasks: visiting public shopping sites, then logging into authenticated university health portals, bank accounts, and a U.S. tax preparation website. After each task, they prompted the assistant with contextual questions—like “What was the purpose of the current medical visit?”—and decrypted all outgoing network traffic between the browser, the assistant’s servers, and third-party trackers.

The methodology combined traffic analysis with a repeatable prompting framework, allowing the team to observe exactly what data left the device. This approach exposed a stark gap between what users likely expect and what is actually transmitted.
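
The traffic-analysis step can be illustrated with a small sketch. It assumes the tester has planted known "canary" values (a dummy name, a dummy SSN) in the test accounts and has already exported the decrypted requests from an intercepting proxy. The `CapturedRequest` shape, the `findLeaks` function, and the example hosts are hypothetical and are not taken from the researchers' tooling.

```typescript
// Hypothetical shape of one decrypted outbound request exported from a proxy.
interface CapturedRequest {
  url: string;
  body: string;
}

interface Leak {
  canary: string; // label of the planted value, e.g. "ssn"
  host: string;   // destination host that received it
}

// Report every (canary, host) pair where a planted value appears in a
// request body, checking both the raw body and its percent-decoded form.
function findLeaks(
  requests: CapturedRequest[],
  canaries: Record<string, string>,
): Leak[] {
  const leaks: Leak[] = [];
  for (const req of requests) {
    const host = new URL(req.url).hostname;
    let decoded = req.body;
    try {
      decoded = decodeURIComponent(req.body);
    } catch {
      // Body is not valid percent-encoding; keep the raw text only.
    }
    for (const label of Object.keys(canaries)) {
      const value = canaries[label];
      if (req.body.indexOf(value) !== -1 || decoded.indexOf(value) !== -1) {
        leaks.push({ canary: label, host });
      }
    }
  }
  return leaks;
}

// Example with made-up traffic: the first request carries both canaries.
const exampleLeaks = findLeaks(
  [
    { url: "https://api.assistant.example/v1/context", body: "name=Jane%20Canary&ssn=000-12-3456" },
    { url: "https://analytics.example/collect", body: "event=page_view" },
  ],
  { name: "Jane Canary", ssn: "000-12-3456" },
);
console.log(exampleLeaks);
```

A real audit would also have to handle compressed, base64-encoded, or chunked payloads, which this sketch ignores.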

The alarming results: page contents, form inputs, and profiling

Every assistant tested except Perplexity AI showed evidence of transmitting private content to first-party servers during the experiments. Several assistants routinely sent the full page content, including HTML DOMs or large extracts of visible text, from authenticated pages. This means everything a user saw on a protected health portal or online banking page was uploaded to the assistant’s backend.

Merlin, a popular Chrome extension, went further: it recorded form-input values. During the test, Merlin captured a social security number entered on a U.S. tax portal, along with similarly sensitive inputs on bank and health pages. The researchers also found that Sider and TinaMind sent user prompts and identifiers—including IP addresses and chat session IDs—to third-party analytics platforms like Google Analytics, enabling potential cross-site tracking and ad targeting. Several assistants, including ChatGPT integrations, Microsoft Copilot, Monica, and Sider, built persistent profiles that inferred users’ age, gender, income level, and interests, then used those attributes to tailor responses across browsing sessions.

Copilot and some ChatGPT integrations kept complete chat histories in the browser's background storage after sessions ended, suggesting data footprints that outlive user interactions. The study's authors note that these behaviors were consistent across repeated tests, and outlets including Euronews and The Register independently reported the core findings.

Technical breakdown: why this happens

Most AI browser assistants rely on server-side inference rather than running models locally on the device. To provide features like page summarization, question answering, or cross-tab context, they inject a small content script into every web page. That script has full access to the Document Object Model (DOM) and any visible text. When triggered—sometimes automatically on page load—the script sends page content to a remote server for processing. The audit found this to be the dominant architecture: heavy lifting happens in the cloud, not on your machine.

Background service workers can activate without an explicit user click, forwarding content silently. This is by design to enable seamless experiences, but it also means data can leave your browser before you realize it. The inclusion of third-party analytics endpoints compounds the problem—when raw prompts, timestamps, or IP addresses are shared with trackers like Google Analytics, those events become linkable to a wider advertising graph, heightening the risk of surveillance.
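
To make the first-party versus third-party distinction concrete, here is a minimal sketch that sorts observed destination hosts into the two groups. It assumes the auditor already knows which registrable domains belong to the assistant's vendor. The function names and the `assistant.example` domain are hypothetical.

```typescript
type Party = "first-party" | "third-party";

// A host is first-party if it equals a vendor domain or is a subdomain of one.
function classifyHost(host: string, vendorDomains: string[]): Party {
  const h = host.toLowerCase();
  const isFirstParty = vendorDomains.some((domain) => {
    const d = domain.toLowerCase();
    const suffix = "." + d;
    return h === d || h.slice(-suffix.length) === suffix;
  });
  return isFirstParty ? "first-party" : "third-party";
}

// Group the unique destination hosts of a set of request URLs by party.
function summarizeDestinations(
  urls: string[],
  vendorDomains: string[],
): Record<Party, string[]> {
  const result: Record<Party, string[]> = { "first-party": [], "third-party": [] };
  for (const u of urls) {
    const host = new URL(u).hostname;
    const party = classifyHost(host, vendorDomains);
    if (result[party].indexOf(host) === -1) {
      result[party].push(host);
    }
  }
  return result;
}

// Example with made-up traffic from one browsing session.
console.log(
  summarizeDestinations(
    [
      "https://api.assistant.example/chat",
      "https://www.google-analytics.com/g/collect",
      "https://cdn.assistant.example/widget.js",
    ],
    ["assistant.example"],
  ),
);
```

Matching on a domain suffix is a simplification: a vendor may operate infrastructure under unrelated domains, so a real classification would need a curated list.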

Legal exposure: HIPAA and GDPR

The study flags potential violations of both U.S. and EU privacy laws. Transmitting protected health information (PHI) from authenticated health portals to unvetted third-party servers could violate HIPAA where covered entities or business associates are involved. Although the tests were conducted in a lab setting, the researchers argue that the observed patterns are likely inconsistent with the GDPR, particularly its rules on cross-border data flows, profiling without a clear lawful basis, and data minimization. The paper stops short of declaring outright violations, since that would require formal regulatory proceedings, but it signals significant legal risk for vendors.

European regulators have been intensifying scrutiny of AI and data processing. The audit’s documented instances of sensitive data sent outside the EU, combined with opaque profiling, align with enforcement priorities under the GDPR. In the U.S., state attorneys general and sectoral regulators may also take interest, especially given the explicit capture of social security numbers and financial data.

What vendors say—and what their policies conceal

The study juxtaposed network activity with published privacy policies. Merlin’s EU/UK policy, for example, broadly states it may collect names, credentials, transaction history, and typed inputs for personalization—meaning the form grabbing observed in the audit is technically disclosed, albeit in dense legalese most users never read. Sider’s policy similarly admits to using partners like Google and Microsoft for service operation, which aligns with the detected analytics traffic. OpenAI, meanwhile, has acknowledged that UK/EU user data may be stored outside those regions, and CEO Sam Altman publicly warned that conversations with ChatGPT lack therapist–client legal confidentiality.

These disclosures often exist, but the gap between what is written and what users understand is immense. Users rarely grasp that “we collect data to improve our services” translates to “your banking page content and medical visit details may be sent to our servers and processed by third parties.”

The value: why millions install these assistants anyway

Despite the privacy concerns, these tools deliver real productivity gains. Server-side models produce fast, accurate summaries and answers that slash research time. Conversational interfaces lower barriers for users with disabilities or those overwhelmed by complex documents. Cross-tab memory and task automation turn assistants into lightweight personal agents, streamlining workflows for knowledge workers. These benefits drive rapid adoption, and they partly justify the centralized architecture that vendors prefer.

The hidden costs: beyond privacy to real-world harms

When sensitive data is coupled with persistent profiling, the risks multiply. Scammers could use inferred financial profiles to craft highly targeted phishing attacks. Legal exposure for users is real: as Altman noted, AI chats aren’t privileged, so uploaded medical or financial details could surface in litigation. Vendors also face regulatory fines, breach notification costs, and reputational damage. A false sense of security in private browsing modes adds to the danger. Chromium-based browsers disable extensions in incognito windows by default, but once a user allows an extension to run there, incognito mode does nothing to stop its content scripts from reading the page.

Caveats: what the audit doesn’t prove

The researchers acknowledge important limitations. Tests were performed on specific extension versions at a point in time; behavior can change with updates. The audit shows what data left the browser, not that it was misused or retained beyond product operations. Legal conclusions are expert risk assessments, not court rulings. These nuances matter, but they don’t diminish the urgency of the central finding: many assistants default to overcollection.

Protecting yourself: a practical guide for Windows and browser users

Immediate steps can reduce your exposure without abandoning AI assistants entirely.

  • Audit your extensions: Open your browser’s extensions page and remove or disable any AI assistants you don’t actively use. Revoke site access permissions for those you keep, especially on banking, health, and tax sites.
  • Use site-specific controls: Set extensions to “on click” or “only on specific sites” so content scripts aren’t injected automatically on sensitive pages.
  • Prefer privacy-respecting alternatives: Perplexity stood out as significantly more privacy-conscious in the audit. Look for assistants that avoid uploading page content or that operate without access to authenticated sessions.
  • Never paste sensitive data into prompts: Do not type passwords, SSNs, medical records, or full account numbers into a general chat prompt.
  • Use dedicated apps for critical tasks: Access your bank via its official app without extensions; keep AI assistants closed during authenticated health portal sessions.
  • Consider local LLMs: For high-sensitivity workflows, run an on-device or on-premises model to keep data off third-party servers.
  • Read policies critically: Focus on how data is shared, not just whether it’s “sold.” Look for explicit statements about analytics sharing and data retention.
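
For readers comfortable inspecting an extension's manifest, the following sketch flags extensions that request access to every site. It assumes a Manifest V3 extension, where host access is declared under `host_permissions` and in each content script's `matches` list. The `ManifestSubset` type, the function name, and the sample manifest are hypothetical.

```typescript
// Only the manifest fields this check inspects (Manifest V3 layout assumed).
interface ManifestSubset {
  name: string;
  host_permissions?: string[];
  content_scripts?: { matches: string[] }[];
}

// Match patterns that grant access to essentially every website.
const BROAD_PATTERNS = ["<all_urls>", "*://*/*", "http://*/*", "https://*/*"];

// Return the distinct broad patterns the extension declares, if any.
function broadAccessPatterns(manifest: ManifestSubset): string[] {
  const declared: string[] = [...(manifest.host_permissions ?? [])];
  for (const script of manifest.content_scripts ?? []) {
    declared.push(...script.matches);
  }
  const found: string[] = [];
  for (const pattern of declared) {
    if (BROAD_PATTERNS.indexOf(pattern) !== -1 && found.indexOf(pattern) === -1) {
      found.push(pattern);
    }
  }
  return found;
}

// Example: a made-up assistant that injects a script into every page.
const sampleManifest: ManifestSubset = {
  name: "Example Assistant",
  host_permissions: ["https://api.assistant.example/*"],
  content_scripts: [{ matches: ["<all_urls>"] }],
};
console.log(broadAccessPatterns(sampleManifest));
```

A non-empty result only shows what the extension asks for. The browser's site-access setting ("on click" or "on specific sites") can still narrow what it actually receives.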

What browser makers and vendors should do

The study lays out a clear remediation path:

  • Data minimization by default: Stop full DOM or form-input collection. Request only the minimal text needed for each feature, and obtain explicit, contextual consent.
  • On-device inference options: Move privacy-sensitive features to client-side models where possible, with a clear “local-only” mode.
  • Transparent disclosures: When uploading content, show a one-click explanation of exactly what is sent, where it’s stored, and for how long.
  • Opt-in profiling and deletion controls: Profiling must be opt-in, with easy deletion of all associated data.
  • Independent auditing: Commission third-party privacy audits and publish methodologies to allow validation by researchers and regulators.
  • Separate analytics from content: Avoid linking raw prompts and chat IDs to broad analytics services that enable cross-site tracking.
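
One way a vendor might approach data minimization is sketched below. The idea is to prefer the text the user explicitly selected, never include form-input values, mask obvious identifiers, and cap the payload size. This is an assumption-laden illustration, not the study's proposal in code: the types, the function name, the two regular expressions, and the 2,000-character cap are all hypothetical, and pattern-based redaction will miss many kinds of sensitive data.

```typescript
// Hypothetical snapshot of what a content script could see on a page.
interface PageSnapshot {
  visibleText: string;
  formValues: Record<string, string>;
}

interface UploadPayload {
  text: string;
}

// Illustrative patterns only: U.S. SSN format and long digit runs
// (account or card-like numbers).
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;
const LONG_DIGIT_RUN = /\b\d{9,19}\b/g;

function minimizeForUpload(
  snapshot: PageSnapshot,
  selection: string | null,
  maxChars: number = 2000,
): UploadPayload {
  // Prefer the user's explicit selection over the whole page text.
  const source =
    selection !== null && selection.trim() !== "" ? selection : snapshot.visibleText;
  const redacted = source
    .replace(SSN_PATTERN, "[REDACTED-SSN]")
    .replace(LONG_DIGIT_RUN, "[REDACTED-NUMBER]");
  // snapshot.formValues is deliberately never copied into the payload.
  return { text: redacted.slice(0, maxChars) };
}

// Example with made-up page content.
const payload = minimizeForUpload(
  {
    visibleText: "SSN 000-12-3456 account 123456789012 balance 42",
    formValues: { ssn: "000-12-3456" },
  },
  null,
);
console.log(payload.text);
```

Running the redaction in the browser, before any network request is made, is the design choice that matters here: data that never leaves the device cannot be logged, profiled, or forwarded to analytics partners.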

The road ahead: policy and market pressure

Regulators are already watching. The European Data Protection Board and national authorities have made AI-driven data processing a priority. The U.S. Federal Trade Commission and state enforcers are increasingly active on health and financial data. The patterns revealed by this audit—especially the capture of social security numbers and protected health information—are exactly the kind that trigger investigations. Companies that ignore these warnings risk not just fines but mandated product changes.

At the same time, market pressure can drive change. Users who demand transparency and migrate to privacy-focused alternatives will influence development roadmaps. The audit shows that a more privacy-respecting model is technically feasible; Perplexity’s behavior in these tests suggests that an assistant can deliver value without hoovering up every scrap of sensitive page content.

The bottom line

The generative-AI browser assistant audit is a wake-up call for everyone who assumed their personal data remained walled off inside password-protected portals. Server-side inference powers impressive features, but it also exposes the most intimate corners of your online life to third parties. The good news is that the risks are addressable through better architecture, transparent policies, and user awareness. Until vendors adopt these measures, treat every AI browser extension as a potential data pipeline and act accordingly.