FrancophonIA/KRoQ

Source: Hugging Face. Updated: 2025-03-30. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/FrancophonIA/KRoQ
Description:
The Konstanz Resource of Questions (KRoQ) is a dependency-parsed, parallel multilingual corpus of information-seeking and non-information-seeking questions. The corpus is constructed using a linguistically motivated rule-based system that utilizes linguistic cues from one language to help classify and annotate questions across other languages. The current corpus includes German, French, Spanish, and Koine Greek. A two-step scoring mechanism based on identified linguistically motivated heuristics classifies each question as either information seeking or non-information seeking. The dataset is released to serve as a basis for further work in the area of question classification and can be used as training and testing data for machine-learning algorithms, as corpus data for theoretical linguistic research, or as a resource for further rule-based approaches to question identification.
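
The two-step idea described above (first score a question with heuristics, then turn the scores into a label) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from KRoQ itself: the cue words (`denn`, `etwa`, `acaso`), their weights, the `Heuristic` class, the language codes, and the tie-breaking rule are all hypothetical. The only point carried over from the description is that cues found in one language version of an aligned question can contribute to the label shared by all versions.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# One aligned question: language code -> text of that language version.
Question = Dict[str, str]


def has_token(lang: str, token: str) -> Callable[[Question], bool]:
    """Build a cue that fires when `token` occurs as a whole word in the `lang` version."""
    def cue(q: Question) -> bool:
        return token in re.findall(r"\w+", q.get(lang, "").lower())
    return cue


@dataclass(frozen=True)
class Heuristic:
    name: str
    label: str      # "ISQ" (information-seeking) or "NISQ" (non-information-seeking)
    weight: float
    fires: Callable[[Question], bool]


# Hypothetical cues and weights, chosen only to make the sketch runnable.
HEURISTICS: List[Heuristic] = [
    Heuristic("de_denn", "ISQ", 1.0, has_token("de", "denn")),
    Heuristic("de_etwa", "NISQ", 1.0, has_token("de", "etwa")),
    Heuristic("es_acaso", "NISQ", 1.0, has_token("es", "acaso")),
]


def score(q: Question, heuristics: List[Heuristic] = HEURISTICS) -> Dict[str, float]:
    """Step 1: sum the weights of all heuristics that fire, per label."""
    totals = {"ISQ": 0.0, "NISQ": 0.0}
    for h in heuristics:
        if h.fires(q):
            totals[h.label] += h.weight
    return totals


def classify(q: Question, heuristics: List[Heuristic] = HEURISTICS) -> str:
    """Step 2: pick the label with the higher total (ties default to ISQ, an assumption)."""
    totals = score(q, heuristics)
    return "NISQ" if totals["NISQ"] > totals["ISQ"] else "ISQ"


if __name__ == "__main__":
    aligned = {
        "de": "Bin ich etwa der Hüter meines Bruders?",
        "es": "¿Acaso soy yo el guarda de mi hermano?",
    }
    print(score(aligned), classify(aligned))
```

In this toy setup the German and Spanish cues both vote for the same label, so a version of the question in a language with no overt cue would inherit that label through the alignment.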
Provider:
FrancophonIA