AI4Bharat, a research lab at the Indian Institute of Technology (IIT) Madras, has introduced a new, open-source benchmark designed to assess the performance of large language models (LLMs) on Indian languages, as well as on Indian context and safety.

Developed with support from Google Cloud, the Indic LLM-Arena benchmark is a crowd-sourced platform that evaluates LLMs on the basis of votes cast by thousands of anonymous users. The models are then ranked on a “human-in-the-loop” leaderboard, AI4Bharat said in a blog post on Monday, November 10.
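The post does not detail how the crowd-sourced votes are aggregated into rankings, but arena-style leaderboards of this kind commonly use an Elo-style rating that is updated after each anonymous pairwise vote. The sketch below illustrates that general approach; the Elo scheme, the model names, and the parameters are illustrative assumptions, not AI4Bharat's published method.

```python
from collections import defaultdict

K = 32             # Elo K-factor (update step size); an assumed value
BASE_RATING = 1000  # assumed starting rating for every model

ratings = defaultdict(lambda: BASE_RATING)

def expected_score(r_a, r_b):
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def record_vote(model_a, model_b, winner):
    """Update both models' ratings after one user vote.

    winner: "a", "b", or "tie".
    """
    e_a = expected_score(ratings[model_a], ratings[model_b])
    s_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    ratings[model_a] += K * (s_a - e_a)
    ratings[model_b] += K * ((1 - s_a) - (1 - e_a))

# Hypothetical votes from three anonymous users
record_vote("model-x", "model-y", "a")
record_vote("model-y", "model-z", "tie")
record_vote("model-x", "model-z", "a")

# Rank models by rating to produce a leaderboard
for rank, (model, rating) in enumerate(
        sorted(ratings.items(), key=lambda kv: kv[1], reverse=True), start=1):
    print(f"{rank}. {model}: {rating:.0f}")
```

In a human-in-the-loop setup like the one described, ratings of this sort would be recomputed continuously as new votes arrive, so the leaderboard reflects aggregate human preference rather than a fixed test set.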

Currently, Indic LLM-Arena supports only text-based inputs across multiple Indian languages and code-mixed scenarios. However, AI4Bharat said it plans to expand the benchmark to cover omni models with vision and audio capabilities.
