You know all of those reports about artificial intelligence models successfully passing the bar or achieving Ph.D.-level intelligence? Looks like we should start taking those degrees back. A new study from researchers at the Oxford Internet Institute suggests that most of the popular benchmarking tools used to test AI performance are often unreliable and misleading.
Researchers looked at 445 different benchmark tests used by industry and academic outfits to test everything from reasoning capabilities to performance on coding tasks. Experts reviewed each benchmarking approach and found indications that the results these tests produce may not be as accurate as they have been presented, due in part to vague definitions for what a benchmark is attempting to

Gizmodo
