IFEval-Hi
Source: ModelScope community (updated 2025-12-04, indexed 2025-11-03)
Download link: https://modelscope.cn/datasets/nv-community/IFEval-Hi
## Dataset Description:
The IFEval-Hi (Hindi IFEval) evaluation dataset contains 848 Hindi-language prompts for evaluating the instruction-following ability of large language models (LLMs). The dataset is constructed in a similar manner to the English version of IFEval, and the responses can be verified by heuristics; hence the prompts are "verifiable instructions". The prompts were written natively by specialists who are well-versed in Hindi and can capture the local nuances of the Hindi language and Indian culture.
This dataset is ready for commercial/non-commercial use. The evaluation steps are described [here](https://huggingface.co/datasets/nvidia/IFEval-Hi/blob/main/EVAL.md).
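
To make "verifiable by heuristics" concrete, the following is a minimal sketch of how a response might be checked against simple, programmatically testable instructions. The function names and the three instruction types shown (minimum word count, required keyword, required closing phrase) are hypothetical illustrations in the style of the English IFEval; they are not taken from this dataset's evaluation code, which is described in EVAL.md.

```python
# Hypothetical heuristic checkers; not the official IFEval-Hi evaluation code.

def check_min_words(response: str, min_words: int) -> bool:
    """True if the response has at least `min_words` whitespace-separated tokens."""
    return len(response.split()) >= min_words


def check_keyword_present(response: str, keyword: str) -> bool:
    """True if the required keyword appears anywhere in the response."""
    return keyword in response


def check_ends_with(response: str, phrase: str) -> bool:
    """True if the response, ignoring trailing whitespace, ends with `phrase`."""
    return response.strip().endswith(phrase)


def follows_all(response: str, checks) -> bool:
    """True only if every instruction check passes for this response."""
    return all(check(response) for check in checks)


# Example model response (made up for illustration).
response = "भारत एक विशाल देश है। धन्यवाद"

# Each prompt would carry its own set of instructions; these are assumed.
checks = [
    lambda r: check_min_words(r, 5),
    lambda r: check_keyword_present(r, "भारत"),
    lambda r: check_ends_with(r, "धन्यवाद"),
]

print(follows_all(response, checks))
```

Because each check is a pure function of the response text, scoring needs no human judge or model-based grader, which is the property that makes such instructions "verifiable".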
## Dataset Owner:
NVIDIA Corporation
## Dataset Creation Date:
April 2025
## License/Terms of Use:
CC-BY 4.0
## Intended Usage:
Evaluate the instruction-following ability of the LLM in the Hindi language.
## Dataset Characterization
Data Collection Method<br>
* Human <br>
Labeling Method<br>
* Not Applicable <br>
## Dataset Format
Text
## Dataset Quantification
688KB of query prompts, comprising 848 individual samples.
## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
## Citing
If you find our work helpful, please consider citing our paper:
```
@article{kamath2025benchmarking,
title={Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis},
author={Kamath, Anusha and Singla, Kanishk and Paul, Rakesh and Joshi, Raviraj and Vaidya, Utkarsh and Chauhan, Sanjay Singh and Wartikar, Niranjan},
journal={arXiv preprint arXiv:2508.19831},
year={2025}
}
```
Provider: maas
Created: 2025-10-09



