KNOWREF-60K
Source: arXiv · Updated: 2020-11-10 · Indexed: 2024-06-21
Download link:
https://github.com/aemami1/KnowRef60k
Description:
KNOWREF-60K was created by researchers at Mila and McGill University. It contains 64,301 difficult pronoun disambiguation problems drawn from sentences that occur naturally in Reddit comments. The dataset was built through multi-stage text filtering followed by manual annotation, steps intended to support data quality and diversity. KNOWREF-60K is intended to address the overlap issues found in existing datasets and to provide an evaluation setting closer to real-world complexity for testing and improving model performance on commonsense reasoning tasks.
Provided by:
Mila / McGill University
Created:
2020-11-10



