Universal Dependencies Probing Tasks
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/AIRI-Institute/Probing_framework
Resource description:
This dataset is a large multilingual collection of probing tasks: 1,927 tasks covering 104 languages. The tasks are designed to evaluate the multilingual transformer models mBERT and XLM-R. The collection includes tasks derived from Universal Dependencies, with a focus on morphosyntactic features. In total it spans 104 languages and 80 morphosyntactic features, and its goal is to probe for linguistic features across many languages.
Provider:
Universal Dependencies



