13ari/Sanskriti
Source: Hugging Face. Updated: 2025-10-13. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/13ari/Sanskriti
Description:
The SANSKRITI benchmark is described by its authors as the largest dataset created to evaluate how well language models (LMs) comprehend and reason about the rich cultural diversity of India. It addresses the need for culturally aware benchmarks, since the global effectiveness of LMs depends on their understanding of local socio-cultural contexts. The dataset comprises 21,853 curated question-answer pairs spanning all 28 states and 8 union territories of India. It covers sixteen key attributes of Indian culture, such as rituals and ceremonies, history, tourism, cuisine, dance and music, costume, language, art, festivals, religion, medicine, transport, sports, nightlife, and personalities. The questions fall into four types, Country Prediction, Association Based, General Awareness, and State Prediction, with a balanced distribution across these categories. The dataset was used to evaluate leading Large Language Models (LLMs), Small Language Models (SLMs), and Indic Language Models (ILMs) in a zero-shot, multiple-choice question (MCQ) format, with accuracy as the sole metric.
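The zero-shot MCQ protocol with accuracy as the only metric can be sketched roughly as follows. This is an illustrative sketch, not the authors' evaluation code: the field names `question`, `options`, and `answer`, the prompt wording, the lettered-option layout, and the `predict` callable are all assumptions made for the example, and the toy items are placeholders rather than real dataset rows.

```python
from string import ascii_uppercase
from typing import Callable, Mapping, Sequence


def format_prompt(question: str, options: Sequence[str]) -> str:
    """Render one item as a zero-shot prompt (no solved examples included)."""
    lines = [question]
    lines += [f"{ascii_uppercase[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def accuracy(items: Sequence[Mapping], predict: Callable[[str], str]) -> float:
    """Fraction of items whose predicted letter equals the gold letter."""
    if not items:
        return 0.0
    correct = 0
    for item in items:
        prompt = format_prompt(item["question"], item["options"])
        reply = predict(prompt).strip().upper()
        # Only the first character of the reply is compared (an assumption).
        if reply[:1] == item["answer"]:
            correct += 1
    return correct / len(items)


# Placeholder items with hypothetical field names, not real SANSKRITI rows.
toy_items = [
    {"question": "Placeholder question 1", "options": ["w", "x", "y", "z"], "answer": "A"},
    {"question": "Placeholder question 2", "options": ["w", "x", "y", "z"], "answer": "B"},
]


def always_a(prompt: str) -> str:
    """Stand-in for a model call; always answers 'A'."""
    return "A"


score = accuracy(toy_items, always_a)
```

In practice `predict` would wrap a call to the model under test, and the items would come from the dataset itself, whose actual column names should be checked before use.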
Provided by:
13ari



