BhashaBench-Krishi
ModelScope community (魔搭社区) · Updated 2025-12-05 · Indexed 2025-08-16
Download link:
https://modelscope.cn/datasets/bharatgenai/BhashaBench-Krishi
<div align="center">
<img src="https://huggingface.co/bharatgenai/Param-1-2.9B-Instruct/resolve/main/BharatGen%20Logo%20(1).png" width="60%" alt="BharatGen" />
</div>
<hr>
<div align="center">
<a href="https://arxiv.org/abs/2510.25409" style="margin: 4px;">
<img alt="Paper" src="https://img.shields.io/badge/arXiv-2510.25409-b31b1b?style=flat" />
</a>
<a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" style="margin: 4px;">
<img alt="License" src="https://img.shields.io/badge/License-CC--BY--4.0-blue.svg" />
</a>
<a href="https://bharatgen.com/bhashabench-blog/" target="_blank" style="margin: 4px;">
<img alt="Blog" src="https://img.shields.io/badge/Blog-Read%20More-brightgreen?style=flat" />
</a>
</div>
# BhashaBench-Krishi (BBK): Benchmarking AI on Indian Agricultural Knowledge
<div style="display: flex; gap: 5px;">
<a href="https://github.com/BharatGen-IITB-TIH/BhashaBench-Krishi"><img src="https://img.shields.io/badge/GITHUB-black?style=flat&logo=github&logoColor=white" alt="GitHub"></a>
<a href="https://arxiv.org/abs/2510.25409"><img src="https://img.shields.io/badge/arXiv-2510.25409-b31b1b?style=flat" alt="ArXiv"></a>
<a href="https://creativecommons.org/licenses/by/4.0/"><img src="https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg" alt="CC BY 4.0"></a>
</div>
## Overview
BhashaBench-Krishi (BBK) is the first large-scale, authentic benchmark designed to rigorously evaluate AI models on Indian agricultural knowledge. Tailored for India’s diverse agro-ecological zones, crops, languages, and farming practices, BBK draws from 55+ official government agricultural exams to assess models' ability to provide precise, region-aware, policy-relevant, and actionable agricultural advice.
## Key Features
- **Languages**: English and Hindi (with plans for more Indic languages)
- **Exams**: 55+ unique agricultural government and institutional exams across India
- **Domains**: 25+ agricultural and allied science domains, spanning over 270 topics
- **Questions**: 15,405 rigorously validated, exam-based questions
- **Difficulty Levels**: Easy (6,754), Medium (6,941), Hard (1,710)
- **Question Types**: Multiple Choice, Assertion-Reasoning, Match the Column, Rearrange the Sequence, Fill in the Blanks
- **Focus**: Practical, context-rich, region-specific agricultural knowledge essential for Indian farmers
## Dataset Statistics
| Metric | Count |
| ------------------------ | ------------------------- |
| Total Questions | 15,405 |
| English Questions | 12,648 |
| Hindi Questions | 2,757 |
| Subject Domains | 25+ |
| Government Exams Covered | 55+ |
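The language and difficulty breakdowns can be cross-checked against the reported total with simple arithmetic. The following is a minimal sketch that uses only the numbers stated in this README; the variable and function names are illustrative and are not part of the dataset.

```python
# Counts copied from the Key Features list and the statistics table.
TOTAL_QUESTIONS = 15_405
by_language = {"English": 12_648, "Hindi": 2_757}
by_difficulty = {"Easy": 6_754, "Medium": 6_941, "Hard": 1_710}


def shares(counts):
    """Return each category's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Both breakdowns should add up to the reported total.
assert sum(by_language.values()) == TOTAL_QUESTIONS
assert sum(by_difficulty.values()) == TOTAL_QUESTIONS

language_shares = shares(by_language)
difficulty_shares = shares(by_difficulty)
print(language_shares)
print(difficulty_shares)
```

English accounts for roughly 82% of the questions, so an aggregate score across both languages is dominated by English performance; reporting the two languages separately avoids masking the Hindi results.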
## Dataset Structure
### Test Set
The test set is the BhashaBench-Krishi (BBK) benchmark itself: 15,405 exam-style questions in two languages, English (12,648) and Hindi (2,757).
We will add support for more Indic languages in upcoming versions.
### Subjects spanning BBK
| Subject Domain | Count |
|----------------------------------|-------|
| Agri-Environmental & Allied Disciplines | 176 |
| Agricultural Biotechnology | 524 |
| Agricultural Chemistry & Biochemistry | 281 |
| Agricultural Economics & Policy | 627 |
| Agricultural Engineering & Technology | 244 |
| Agricultural Extension Education | 774 |
| Agricultural Microbiology | 111 |
| Agriculture Communication | 254 |
| Agriculture Information Technology | 190 |
| Agronomy | 5078 |
| Animal Sciences | 148 |
| Crop Sciences | 549 |
| Dairy & Poultry Science | 89 |
| Entomology | 696 |
| Fisheries and Aquaculture | 34 |
| General Knowledge & Reasoning | 661 |
| Genetics and Plant Breeding | 389 |
| Horticulture | 2070 |
| Natural Resource Management | 193 |
| Nematology | 184 |
| Plant Pathology | 397 |
| Plant Sciences & Physiology | 129 |
| Seed Science and Technology | 202 |
| Soil Science | 1357 |
| Veterinary Sciences | 48 |
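The domain distribution is heavily skewed, which matters when interpreting a single overall accuracy figure. As a rough sketch, using only the counts from the table above (the variable names are illustrative), the following checks that the per-domain counts add up to the reported total and measures how much of the benchmark the three largest domains cover.

```python
# Per-domain question counts copied from the subject table.
domain_counts = {
    "Agri-Environmental & Allied Disciplines": 176,
    "Agricultural Biotechnology": 524,
    "Agricultural Chemistry & Biochemistry": 281,
    "Agricultural Economics & Policy": 627,
    "Agricultural Engineering & Technology": 244,
    "Agricultural Extension Education": 774,
    "Agricultural Microbiology": 111,
    "Agriculture Communication": 254,
    "Agriculture Information Technology": 190,
    "Agronomy": 5078,
    "Animal Sciences": 148,
    "Crop Sciences": 549,
    "Dairy & Poultry Science": 89,
    "Entomology": 696,
    "Fisheries and Aquaculture": 34,
    "General Knowledge & Reasoning": 661,
    "Genetics and Plant Breeding": 389,
    "Horticulture": 2070,
    "Natural Resource Management": 193,
    "Nematology": 184,
    "Plant Pathology": 397,
    "Plant Sciences & Physiology": 129,
    "Seed Science and Technology": 202,
    "Soil Science": 1357,
    "Veterinary Sciences": 48,
}

total = sum(domain_counts.values())

# Sort domains from largest to smallest and take the top three.
ranked = sorted(domain_counts.items(), key=lambda item: item[1], reverse=True)
top3 = ranked[:3]
top3_share = round(100 * sum(n for _, n in top3) / total, 1)

print(len(domain_counts), total)
print([name for name, _ in top3], top3_share)
```

Agronomy, Horticulture, and Soil Science together make up a little over half of all questions, while the smallest domain (Fisheries and Aquaculture) has only 34. Per-domain accuracy on the small domains will therefore be noisier than on the large ones.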
## Usage
This is a gated dataset. Once your access request has been accepted, set your Hugging Face token:
```bash
export HF_TOKEN=YOUR_TOKEN_HERE
```
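In a script or notebook it can help to fail early with a readable message when the token is missing, rather than hitting an authentication error partway through a download. The helper below is a hypothetical convenience function, not part of the `datasets` library or of this repository; it only reads the `HF_TOKEN` environment variable set above.

```python
import os


def require_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in `env_var`, or raise a clear error if it is unset or empty."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set. Request access to the gated dataset, "
            f"then export {env_var} before loading it."
        )
    return token
```

Calling `require_hf_token()` once at the top of an evaluation script surfaces a missing token before any network request is made.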
To load the BBK dataset for a given language:
```python
from datasets import load_dataset
language = 'Hindi'
# Use 'test' split for evaluation
split = 'test'
language_data = load_dataset("bharatgenai/BhashaBench-Krishi", data_dir=language, split=split, token=True)
print(language_data[0])
```
## Evaluation Results Summary
- **29+ models evaluated**, including GPT-4o, Qwen3-235B, and various open-source LLMs.
- **Top accuracy:**
  - **English:** 70%+ by best models
  - **Hindi:** 60–65%, indicating room for improvement
- **Strong domains:**
  - Agricultural Biotechnology, Plant Sciences, Veterinary Sciences (~80% accuracy)
- **Weak domains:**
  - Agri-Environmental Sciences, Nematology, Regional Crop Management (<50%)
- **Challenges:**
  - Hard questions and non-MCQ formats remain challenging across models
For detailed results and analysis, please refer to our [blog](https://bharatgen.com/bhashabench-blog/).
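Breakdowns such as the per-domain and per-language figures summarized above are obtained by grouping graded answers on a metadata field. The sketch below is a minimal illustration of that grouping, not the authors' evaluation harness. The record keys `subject_domain`, `answer`, and `prediction` are hypothetical names chosen for the example and may not match the dataset's actual columns.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Compute exact-match accuracy for each distinct value of `group_key`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        # A prediction counts as correct only if it equals the gold answer exactly.
        if rec["prediction"] == rec["answer"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy records with hypothetical field names; these are not real benchmark rows.
toy_records = [
    {"subject_domain": "Agronomy", "answer": "A", "prediction": "A"},
    {"subject_domain": "Agronomy", "answer": "B", "prediction": "C"},
    {"subject_domain": "Nematology", "answer": "D", "prediction": "D"},
]

per_domain = accuracy_by_group(toy_records, "subject_domain")
print(per_domain)
```

The same function can group by any other field present in the records (for example a difficulty or question-type column, if the dataset exposes one) by passing a different `group_key`.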
## Citation
Please cite our benchmark if used in your work:
```bibtex
@misc{devane2025bhashabenchv1comprehensivebenchmark,
title={BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains},
author={Vijay Devane and Mohd Nauman and Bhargav Patel and Aniket Mahendra Wakchoure and Yogeshkumar Sant and Shyam Pawar and Viraj Thakur and Ananya Godse and Sunil Patra and Neha Maurya and Suraj Racha and Nitish Kamal Singh and Ajay Nagpal and Piyush Sawarkar and Kundeshwar Vijayrao Pundalik and Rohit Saluja and Ganesh Ramakrishnan},
year={2025},
eprint={2510.25409},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.25409},
}
```
## License
This dataset is released under the [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) license.
## Contact
For any questions or feedback, please contact:
- Vijay Devane (vijay.devane@tihiitb.org)
- Mohd. Nauman (mohd.nauman@tihiitb.org)
- Bhargav Patel (bhargav.patel@tihiitb.org)
- Kundeshwar Pundalik (kundeshwar.pundalik@tihiitb.org)
## Links
- [GitHub Repository 💻](https://github.com/BharatGen-IITB-TIH/BhashaBench-Krishi)
- [Paper 📄](https://arxiv.org/abs/2510.25409)
Provider: maas
Created: 2025-08-09