AI Platforms Don’t Leak Sensitive User Information, New Search Atlas Study Finds
New York, United States – March 21, 2026 / Search Atlas /
Search Atlas, a leading SEO and digital intelligence platform, today released findings from a controlled study examining what actually happens to sensitive information entered into major AI platforms. The study evaluated six leading large language model (LLM) platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) through two controlled experiments designed to simulate worst-case data exposure scenarios.
The results offer meaningful reassurance for businesses and individuals concerned about confidential information shared with AI tools. Across all six platforms tested, researchers found 0% data leakage of user-provided sensitive information.
The full study is available here.
Key Findings:
- LLMs don’t retain or replay user-provided sensitive information (0% data leakage across all platforms tested)
- Retrieved facts vanish when search is off (no evidence of short-term retention or leakage)
- Users risk AI hallucinations, not data exposure
1. LLMs don’t retain or replay user-provided sensitive information – 0% data leakage across all platforms tested
The study tested whether AI models would repeat private information after being directly exposed to it. Researchers created 30 question-and-answer pairs built entirely from non-public information, with no search indexing, no online references, and no presence in known training data.
Each model went through a three-step process:
- The questions were asked with no prior context
- Researchers then provided the correct answers
- The same questions were asked again to see whether the models would repeat the newly introduced information
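For illustration, a minimal sketch of that three-step probe might look like the following. The `ask_model` helper and the contents of `QA_PAIRS` are assumptions introduced here for clarity; they are not part of the published study.

```python
# A minimal sketch of the three-step leakage probe described above, not the
# study's actual harness. `ask_model(platform, messages)` is a hypothetical
# wrapper around each platform's chat interface, and QA_PAIRS stands in for
# the 30 synthetic, non-public question/answer pairs.

QA_PAIRS = [
    {"question": "What is the internal codename of the Acme rebrand?", "answer": "Bluefin"},
    # ...29 more synthetic pairs with no online or training-data presence
]

def run_leakage_probe(platform, ask_model):
    results = []
    for pair in QA_PAIRS:
        # Step 1: ask the question with no prior context
        before = ask_model(platform, [{"role": "user", "content": pair["question"]}])

        # Step 2: expose the model to the correct answer
        ask_model(platform, [
            {"role": "user", "content": pair["question"]},
            {"role": "assistant", "content": before},
            {"role": "user", "content": f"For the record, the correct answer is: {pair['answer']}."},
        ])

        # Step 3: re-ask the same question in a fresh session and check whether
        # the injected answer is reproduced (leakage) or not
        after = ask_model(platform, [{"role": "user", "content": pair["question"]}])
        results.append({"question": pair["question"],
                        "leaked": pair["answer"].lower() in after.lower()})
    return results
```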
Across all six platforms tested, none reproduced a single correct answer after exposure. Models that initially declined to answer continued to decline, while those prone to hallucinating answers continued to generate incorrect responses rather than repeating the injected facts. In short, model behavior remained essentially unchanged before and after exposure.
This setup simulated a worst-case scenario in which a user enters proprietary or sensitive information into an AI system. Under these conditions, the study found no evidence that the information carries over into future responses.
The experiment also revealed behavioral differences across platforms. Models from OpenAI, Perplexity, and Grok tended to respond with uncertainty when reliable information was unavailable, resulting in more “I don’t know” responses. Gemini, Copilot, and Google AI Mode were more likely to generate confident but incorrect answers. However, none of those incorrect responses matched the previously provided private information. The findings highlight a key distinction: hallucination (making up incorrect information) and leakage are separate failure modes, and this study found only the former.
2. Retrieved facts vanish when search is off – no evidence of short-term retention or leakage
The second experiment tested whether information retrieved through live web search would remain and reappear in a model’s responses once search access was turned off.
To isolate this effect, researchers selected a real-world event that occurred after the training cutoff of all models tested. This ensured that any correct answers during the experiment could only come from live web retrieval, not from the models’ existing knowledge.
When search was enabled, the models answered the vast majority of questions correctly. However, when search was immediately disabled and the same questions were asked again, those correct answers largely disappeared.
The only questions that models could still answer correctly without search were ones whose answers could reasonably be inferred from pre-existing training data or general knowledge, rather than from information retrieved moments earlier.
In short, the results showed no evidence that models retained or carried forward information retrieved through live search. Once retrieval access was removed, the information no longer appeared in responses, suggesting that the systems do not store or pass along facts obtained during a prior interaction.
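For readers who want to picture the setup, here is one way the search-toggle check could be expressed. The `ask_model` wrapper and its `web_search` flag are hypothetical stand-ins, not any platform’s real API.

```python
# A minimal sketch of the search-toggle check described above. The same
# post-cutoff questions are asked with live web search enabled and then,
# immediately afterward, with search disabled; any correct answer in the
# second pass would indicate short-term retention of retrieved facts.

def search_toggle_check(platform, questions, answers, ask_model):
    retained = []
    for question, answer in zip(questions, answers):
        with_search = ask_model(platform, question, web_search=True)       # retrieval on
        without_search = ask_model(platform, question, web_search=False)   # retrieval off
        if answer.lower() in without_search.lower():
            retained.append(question)  # fact reappeared after search was disabled
    return retained  # the study reports this list stayed effectively empty
```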
3. Users risk AI hallucinations, not data exposure
One of the study’s most practical findings is the clear distinction between hallucination and data leakage. The platforms that showed lower accuracy (Gemini, Copilot, and Google AI Mode) did not err by repeating information they had previously been given. Instead, their errors came from generating confident, plausible-sounding answers that were simply incorrect. OpenAI (ChatGPT) and Perplexity demonstrated the lowest hallucination levels.
This distinction matters when evaluating AI risk. A common concern is that an AI system might expose sensitive information from one user to another. In this study, researchers found no evidence supporting that scenario.
The more consistently observed issue was hallucination (models filling gaps in their knowledge with invented facts). While this does not involve the sharing of private information, it introduces a different challenge: people and organizations must ensure that AI-generated responses are reviewed and verified, particularly in contexts where accuracy is critical.
What This Means
For businesses and privacy-conscious users, the results offer reassuring news. If you share sensitive information with an AI model in a single session, such as a proprietary business strategy or private detail, the model does not appear to absorb that information into a lasting memory that could be surfaced to other users. Instead, the data functions more like temporary “working memory” used to generate a response within that interaction.
For researchers and fact-checkers, these findings also highlight an important limitation. You cannot expect an LLM to “learn” from a correction provided in an earlier conversation and carry it into a later one. If a model contains an error in its underlying training data, it may continue to repeat that mistake in future sessions unless the model itself is retrained or the correct source is supplied again.
For developers and AI builders, the study reinforces the importance of retrieval-based systems. Approaches such as Retrieval-Augmented Generation (RAG), which connect models to live databases or search systems, remain the most reliable way to keep AI responses accurate for current events, proprietary information, or frequently updated data. Without retrieval, the model has no built-in mechanism to retain facts discovered during earlier interactions.
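As an illustration only, a bare-bones version of that retrieval-grounding pattern might look like this; the `search_index` and `generate` callables are hypothetical placeholders, not references to any specific vector store or model API.

```python
# A minimal, illustrative RAG loop (not the study's code): fetch relevant
# documents at query time and pass them to the model as context, so accuracy
# depends on live retrieval rather than on anything the model "remembers".

def answer_with_rag(question, search_index, generate, top_k=3):
    # Retrieve the most relevant passages for this specific question
    passages = search_index(question, limit=top_k)

    # Ground the model in the retrieved text; without this step the model
    # has no mechanism to recall facts found in earlier interactions
    context = "\n\n".join(passages)
    prompt = (
        "Answer using only the context below. If the context is insufficient, "
        "say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```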
“A lot of the anxiety around enterprise AI adoption comes from a reasonable but untested assumption that if you put sensitive information into one of these systems, it will somehow find its way out,” said Manick Bhan, Founder of Search Atlas. “What we wanted to do was actually test that assumption under controlled conditions rather than speculate. Across every platform we evaluated, the data didn’t support it. That doesn’t mean AI is risk-free; hallucination is a real and documented problem. But the specific fear that your data gets leaked to the next user isn’t something we found any evidence for. We hope that gives people and organizations the confidence to engage with these tools more clearly, and to focus their attention on the risks that are actually there.”
Methodology
The study, conducted by Search Atlas, put six major LLM platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) through a rigorous, multi-stage experiment to determine whether they retain or leak information provided during a session. The process followed three steps.
First, researchers introduced unique, non-public facts into each model through two methods: direct user prompts and simulated web search results. The facts were entirely synthetic information that did not exist anywhere online and had no presence in known training data, ensuring that any correct answer produced by a model could only be explained by retention of what it had been shown.
Next, after each model was exposed to this private data, researchers tested whether it could be triggered into revealing those facts in a fresh interaction, with no search access and no contextual references to the original exposure. This isolated session design was intended to replicate the realistic concern: that information shared with an AI in one conversation might surface for another user later.
Finally, the team measured two metrics across all platforms before and after exposure: the True Response Rate, meaning how often a model correctly recalled the private fact, and the Hallucination Rate, meaning how often it produced a confident but incorrect answer instead. Comparing these figures before and after data exposure allowed researchers to determine whether models were genuinely retaining new information or simply behaving as they always had. Across all six platforms, the answer was the latter.
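For clarity, the two metrics can be pictured as simple proportions over labeled responses. The sketch below assumes an illustrative labeling scheme (“correct”, “hallucinated”, “declined”) that may differ from the study’s internal schema.

```python
# A sketch of how the two reported metrics could be computed from labeled
# responses. Labels and field names are illustrative assumptions, not the
# study's published schema.

def response_rates(records):
    total = len(records)
    # True Response Rate: share of answers that correctly recalled the private fact
    true_response_rate = sum(r["label"] == "correct" for r in records) / total
    # Hallucination Rate: share of confident but incorrect answers
    hallucination_rate = sum(r["label"] == "hallucinated" for r in records) / total
    return true_response_rate, hallucination_rate

# Synthetic example: two post-exposure responses, neither repeating the
# injected fact, one a refusal and one a confident wrong guess
example = [
    {"question": "Q1", "label": "declined"},
    {"question": "Q2", "label": "hallucinated"},
]
print(response_rates(example))  # (0.0, 0.5)
```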
Contact Information:
Search Atlas
368 9th Ave
New York, NY 10001
United States
Manick Bhan
+1-212-203-0986
https://searchatlas.com