Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the answers directly from online sources rather than deriving those answers through a "reasoning" process.
Scale AI computer scientists Ziwen Han, Meher Mankikar, Julian Michael, and Zifan Wang refer to the phenomenon as "Search-Time Data Contamination," which they describe in a paper published to the AI data provider's website.
On their own, AI models suffer from a significant limitation: They're trained at a specific point in time on a limited set of data and thus lack information about anything that happened after their training data cut-off date.
So to better handle inquiries about current events, firms like Anthropic, Google, OpenAI, and Perplexity have integrated search capabilities into the