---
title: BhashaBench Leaderboard
emoji: 🏆
colorFrom: blue
colorTo: red
sdk: docker
hf_oauth: true
pinned: true
license: apache-2.0
duplicated_from: open-llm-leaderboard/open_llm_leaderboard
short_description: Evaluating LLMs on BhashaBench tasks
tags:
  - leaderboard
  - modality:text
  - submission:manual
  - test:public
  - judge:function
  - eval:generation
  - language:English, Hindi
  - domain:Ayur, Krishi, Finance, Legal
---

https://arxiv.org/abs/2510.25409

# Open LLM Leaderboard

Modern React interface for comparing Large Language Models (LLMs) in an open and reproducible way.

## Features

- 📊 Interactive table with advanced sorting and filtering
- 🔍 Semantic model search
- 📌 Pin models for comparison
- 📱 Responsive and modern interface
- 🎨 Dark/Light mode
- ⚡️ Optimized performance with virtualization
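
Pinning interacts with sorting: a pinned model is expected to stay visible whatever sort is active. The following is a minimal Python sketch of that idea, not the leaderboard's actual React implementation. The `Row` type, the `sort_with_pins` helper, and the sample scores are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical leaderboard row: a model name and one score."""
    model: str
    score: float


def sort_with_pins(rows, pinned, descending=True):
    """Sort rows by score while keeping pinned models at the top."""
    by_score = sorted(rows, key=lambda r: r.score, reverse=descending)
    # sorted() is stable, so this second pass moves pinned rows to the
    # front without disturbing the score order inside each group.
    return sorted(by_score, key=lambda r: r.model not in pinned)


# Made-up sample data, not real benchmark results.
rows = [Row("model-a", 0.5), Row("model-b", 0.9), Row("model-c", 0.7)]
ordered = sort_with_pins(rows, pinned={"model-a"})
```

Here `ordered` lists `model-a` first because it is pinned, followed by `model-b` and `model-c` in descending score order.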

## Architecture

The project is split into two main parts:

### Frontend (React)

```
frontend/
├── src/
│   ├── components/  # Reusable UI components
│   ├── pages/       # Application pages
│   ├── hooks/       # Custom React hooks
│   ├── context/     # React contexts
│   └── constants/   # Constants and configurations
├── public/          # Static assets
└── server.js        # Express server for production
```

### Backend (FastAPI)

```
backend/
├── app/
│   ├── api/           # API router and endpoints
│   │   └── endpoints/ # Specific API endpoints
│   ├── core/          # Core functionality
│   ├── config/        # Configuration
│   └── services/      # Business logic services
│       ├── leaderboard.py
│       ├── models.py
│       ├── votes.py
│       └── hf_service.py
└── utils/             # Utility functions
```

## Technologies

### Frontend

- React
- Material-UI
- TanStack Table & Virtual
- Express.js

### Backend

- FastAPI
- Hugging Face API
- Docker

## Development

The application is containerized using Docker and can be run using:

```bash
docker-compose up
```
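
Newer Docker installations ship Compose as a plugin invoked as `docker compose`, while older ones provide the standalone `docker-compose` binary used above. As a hedged sketch, a small Python helper could pick whichever one is available before starting the stack. The helper name `compose_command` is hypothetical and not part of this repository.

```python
import shutil


def compose_command(which=shutil.which):
    """Return the argv prefix for an available Compose CLI, or None.

    `which` is injectable so the lookup can be exercised without Docker.
    """
    if which("docker-compose"):
        # Standalone binary, matching the command shown above.
        return ["docker-compose"]
    if which("docker"):
        # Assumption: the Compose plugin is installed with the Docker CLI.
        # Finding `docker` on PATH does not by itself guarantee that.
        return ["docker", "compose"]
    return None
```

Passing the result to `subprocess.run([*cmd, "up"], check=True)` from the repository root would then start the containers, provided `cmd` is not `None`.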

## Citation

Please cite our benchmark if used in your work:

```bibtex
@misc{devane2025bhashabenchv1comprehensivebenchmark,
      title={BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains},
      author={Vijay Devane and Mohd Nauman and Bhargav Patel and Aniket Mahendra Wakchoure and Yogeshkumar Sant and Shyam Pawar and Viraj Thakur and Ananya Godse and Sunil Patra and Neha Maurya and Suraj Racha and Nitish Kamal Singh and Ajay Nagpal and Piyush Sawarkar and Kundeshwar Vijayrao Pundalik and Rohit Saluja and Ganesh Ramakrishnan},
      year={2025},
      eprint={2510.25409},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.25409},
}
```