Simple Rephrases Can Confuse Large Language Models

February 16th, 2026

Written by Buyun Liang, PhD Student in Computer and Information Science

People naturally express the same idea in different ways. A student might ask a math question, then restate it to be clearer. A doctor might explain a diagnosis using simpler language. For us, changing the wording doesn’t change the meaning. 

For large language models (LLMs), however, that isn’t always the case. For example, the figure above shows a simple math problem that the LLM answers correctly in its original form, but incorrectly after the question is rewritten.

Researchers in Prof. René Vidal’s group at the University of Pennsylvania’s GRASP Laboratory explored this problem in their recent publication, “SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations,” which was accepted to the 2025 Annual Conference on Neural Information Processing Systems (NeurIPS 2025). The team comprises Buyun Liang (PhD, CIS), Liangzu Peng (PhD, ESE), Jinqi Luo (PhD, CIS), Darshan Thaker (PhD, CIS), Kwan Ho Ryan Chan (PhD, ESE), and Prof. René Vidal.

The researchers found that an LLM will often answer a question correctly, yet produce a confident and completely wrong response when that same question is reworded. Nothing about the meaning changes; only the wording does. Yet that change can lead the model to “hallucinate” incorrect information.

In everyday use, this may seem like a minor glitch. But in high-stakes settings such as healthcare, finance, education, or legal assistance, the consequences could be serious. If an LLM flips its answer just because a question is phrased differently, it raises major concerns about reliability. 

Traditionally, researchers “red-team” language models by deliberately trying to fool them with misleading or unnatural inputs to uncover weaknesses. This work takes a different approach. Instead of using unrealistic prompts, it asks: What happens when the same idea is expressed differently?

To systematically investigate this question, the researchers developed a new method called SECA. It generates natural rewordings of the same question and tests how the LLM responds. In many cases, SECA can find a simple rephrasing that causes the LLM to fail, even though the meaning remains unchanged.
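The core testing loop described above can be sketched in a few lines. This is a minimal illustration of the idea, not the authors’ implementation: the `query_llm` argument and the `toy_llm` stub are hypothetical stand-ins for a real model call, and in the actual SECA method the rephrasings are generated and constrained to be semantically equivalent and coherent rather than supplied by hand.

```python
# Sketch of the rephrasing-robustness idea behind SECA (illustrative only):
# query a model with the original question and with semantically equivalent
# rewordings, and flag any rewording whose answer diverges.

def find_failing_rephrase(question, rephrasings, query_llm):
    """Return the first rephrasing that changes the model's answer, or None."""
    reference = query_llm(question)
    for candidate in rephrasings:
        if query_llm(candidate) != reference:
            return candidate
    return None

# Toy stand-in for a real LLM call: this stub answers short phrasings
# correctly but fails on longer ones, mimicking the observed failure mode.
def toy_llm(prompt):
    return "12" if len(prompt) < 40 else "15"

original = "What is 3 times 4?"
variants = [
    "What is the product of 3 and 4?",
    "If you multiply three by four, what number do you get in total?",
]
print(find_failing_rephrase(original, variants, toy_llm))
```

Here the longer, more detailed variant flips the stub’s answer, mirroring the paper’s finding that slightly longer rephrasings were often enough to trigger failures in real models.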

Across a wide range of LLMs, both open-source and commercial, the pattern was consistent. Slightly longer or more detailed rephrasings were often enough to trigger incorrect answers, even though the underlying question had not changed.

This work highlights the importance of everyday language variation. If LLMs are going to be trusted in the real world, they need to handle language the way people do — understanding that meaning stays the same even when the words change. The researchers see this as a necessary shift in how future LLMs are designed, moving toward robustness built on a deeper understanding of language, rather than just relying on rigid safeguards.

For further details, see the original paper: https://arxiv.org/abs/2510.04398.