billli/QuRe

Hugging Face | Updated 2024-07-03 | Indexed 2024-07-06
Download link:
https://hf-mirror.com/datasets/billli/QuRe
Resource overview:
Generalized quantifiers (e.g., few, most) indicate the proportion of cases in which a predicate holds. QuRe is a quantifier reasoning dataset introduced in the paper "Pragmatic Reasoning Unlocks Quantifier Semantics for Foundation Models". It contains real-world sentences from Wikipedia together with human annotations of generalized quantifiers from English speakers.

Each sample provides the original sentence, the mentioned percentage, the position of the percentage in the sentence, the generated mathematical expression, the annotated quantified sentence, the position of the quantifier in the sentence, the difficulty of deciphering the percentage scope, the related Wikipedia entity, and the topics of the sentence.
Provided by:
billli
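Original information summary

Dataset overview

  • Name: QuRe
  • License: Apache 2.0
  • Language: English
  • Tags:
    • Natural language processing
    • Generalized quantifiers
    • Quantifier reasoning
  • Size: n<1K

Dataset introduction

QuRe is a dataset for quantifier reasoning, introduced in the paper "Pragmatic Reasoning Unlocks Quantifier Semantics for Foundation Models". It contains real-world sentences from Wikipedia together with human annotations of generalized quantifiers from English speakers.

Data sample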

```json
{
  "orig_sentence": "In order for a steel to be considered stainless it must have a Chromium content of at least 10.5%.",
  "percentage": "10.50%",
  "percentage_index": 0,
  "math_expr": ">=0.105",
  "quant_sent": "In order for a steel to be considered stainless it must have some Chromium content.",
  "quantifier": "some",
  "quantifier_position": 12,
  "specificity": "unable",
  "wiki_entity": "List of blade materials",
  "topics": "metallurgy; steel; composition"
}
```

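  • orig_sentence: the original sentence as it appears in Wikipedia.
  • percentage: the percentage mentioned in the original sentence.
  • percentage_index: the index of the percentage within the original sentence.
  • math_expr: the generated mathematical expression for the percentage.
  • quant_sent: the annotated quantified sentence.
  • quantifier_position: the position of the quantifier in the sentence.
  • specificity: the difficulty of deciphering the percentage scope of the quantifier once the quantifier is removed from the sentence.
  • wiki_entity: the Wikipedia entity that contains the original sentence.
  • topics: the topics of the sentence.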
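
The percentage and math_expr fields are stored as strings, so they need parsing before any numeric comparison. The following is a minimal sketch, assuming that math_expr always has the form of a comparator followed by a decimal fraction, as in the sample above (">=0.105"). The card does not document the full expression grammar, so the comparator set and the helper names (parse_percentage, parse_math_expr, satisfies) are hypothetical.

```python
import operator
import re

# Assumed comparator set: only ">=" is confirmed by the sample record.
_COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# Two-character comparators are listed first so ">=" is not read as ">".
_EXPR_RE = re.compile(r"^\s*(>=|<=|==|>|<|=)\s*([0-9]*\.?[0-9]+)\s*$")


def parse_percentage(text):
    """Convert a percentage string such as '10.50%' to a fraction."""
    return float(text.strip().rstrip("%")) / 100.0


def parse_math_expr(expr):
    """Split an expression such as '>=0.105' into (comparator, value)."""
    match = _EXPR_RE.match(expr)
    if match is None:
        raise ValueError(f"unsupported math_expr: {expr!r}")
    return match.group(1), float(match.group(2))


def satisfies(expr, fraction):
    """Return True if `fraction` meets the constraint described by `expr`."""
    symbol, threshold = parse_math_expr(expr)
    return _COMPARATORS[symbol](fraction, threshold)


# Values copied from the sample record above.
print(parse_math_expr(">=0.105"))
print(satisfies(">=0.105", parse_percentage("10.50%")))
```

Expressions that do not match the assumed pattern raise ValueError instead of being silently skipped, which makes it easy to see whether the real data uses other forms.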

Loading the dataset

```python
from datasets import load_dataset

ds = load_dataset("billli/QuRe")
```
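
Once loaded, each split can be iterated as a sequence of dict-like records, so simple summaries need only the standard library. The sketch below counts how often each annotated quantifier occurs. The function name quantifier_counts is hypothetical, and the stand-in list holds a single record abridged from the data sample so that the example runs offline. The split names of the dataset are not given in this card, so none is assumed here.

```python
from collections import Counter


def quantifier_counts(records):
    """Count how often each annotated quantifier occurs in the records."""
    return Counter(record["quantifier"] for record in records)


# Stand-in for one split of the loaded dataset: a single record abridged
# from the data sample. With the real data, pass a split of `ds` instead.
sample_records = [
    {"quantifier": "some", "specificity": "unable"},
]

counts = quantifier_counts(sample_records)
print(counts.most_common())
```

The same function accepts any iterable of records that expose a "quantifier" key, so it can be applied to a split unchanged.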

Reference

@inproceedings{li-etal-2023-pragmatic,
    title = "Pragmatic Reasoning Unlocks Quantifier Semantics for Foundation Models",
    author = "Li, Yiyuan and Menon, Rakesh and Ghosh, Sayan and Srivastava, Shashank",
    editor = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.38",
    pages = "573--591",
}
