Prompt Repetition Improves Non-Reasoning LLMs
<https://arxiv.org/pdf/2512.14982>
When not using reasoning, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and DeepSeek) without increasing the number of generated tokens or latency.
- Prompt Repetition: LLMs are often trained as causal language models, i.e. past tokens cannot attend to future tokens. Therefore, the order of the tokens in a user’s query can affect prediction performance. For example, a query of the form “⟨question⟩ ⟨options⟩” often performs differently from a query of the form “⟨options⟩ ⟨question⟩” (see options-first vs. question-first in Figure 1). The authors propose to repeat the prompt, i.e. transform the input from “⟨prompt⟩” to “⟨prompt⟩ ⟨prompt⟩”. This enables each prompt token to attend to a copy of every other prompt token, addressing the ordering issue. When not using reasoning, prompt repetition improves the performance of LLMs (Figure 1) without increasing the lengths of the generated outputs or latency.
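The transformation itself is trivial to apply before calling any chat API. Below is a minimal sketch; the `repeat_prompt` helper name and the separator string are our assumptions for illustration, not details from the paper, which simply concatenates a second copy of the prompt.

```python
def repeat_prompt(prompt: str, separator: str = "\n\n") -> str:
    """Return the prompt followed by a full copy of itself.

    Under causal attention, tokens in the second copy can attend to
    every token of the first copy, so each prompt token effectively
    "sees" the whole query regardless of its original position.
    """
    return prompt + separator + prompt


# Example: an options-first multiple-choice query.
query = "Options: (A) Mars (B) Jupiter (C) Venus\nQuestion: Which planet is largest?"
repeated = repeat_prompt(query)
print(repeated)  # send `repeated` to the model instead of `query`
```

Since the duplication happens in the input only, the model's output length (and hence decoding latency) is unchanged; only the prompt-processing cost grows.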