Aunsiels/Quasimodo
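Dataset Overview: Quasimodo
Dataset Description
Dataset Summary
Quasimodo is a commonsense knowledge base constructed automatically from question-answering forums and query logs.
Supported Tasks and Leaderboards
Suitable for tasks that require external knowledge, such as question answering.
Languages
English
Dataset Structure
Data Instances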
```python
{
    "subject": "elephant",
    "predicate": "has_body_part",
    "object": "trunk",
    "modality": "TBC[so long trunks] x#x2 // TBC[long trunks] x#x9 // TBC[big trunks] x#x6 // TBC[long trunk] x#x1 // TBC[such big trunks] x#x1 0 0.9999667967035647 elephants have trunks x#x34 x#xGoogle Autocomplete, Bing Autocomplete, Yahoo Questions, Answers.com Questions, Reddit Questions // a elephants have trunks x#x2 x#xGoogle Autocomplete // a elephant have a trunk x#x2 x#xGoogle Autocomplete // elephants have so long trunks x#x2 x#xGoogle Autocomplete // elephants have long trunks x#x8 x#xGoogle Autocomplete, Yahoo Questions, Answers.com Questions // elephants have big trunks x#x6 x#xGoogle Autocomplete, Answers.com Questions, Reddit Questions // elephants have trunk x#x3 x#xGoogle Autocomplete, Yahoo Questions // elephant have long trunks x#x1 x#xGoogle Autocomplete // elephant has a trunk x#x1 x#xGoogle Autocomplete // elephants have a trunk x#x2 x#xAnswers.com Questions // an elephant has a long trunk x#x1 x#xAnswers.com Questions // elephant have trunks x#x1 x#xAnswers.com Questions // elephants have such big trunks x#x1 x#xReddit Questions",
    "score": 0.9999667967668732,
    "local_sigma": 1.0
}
```
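The `TBC[...]` entries in the `modality` string can be pulled out with a small parser. The following is a minimal sketch, not part of the dataset's tooling: the entry shape (`TBC[<refined object>] x#x<count>`, separated by ` // `) is an assumption inferred from the single instance above, and the helper name `parse_tbc_refinements` is hypothetical.

```python
import re
from typing import List, Tuple

# Assumed entry shape, inferred from the example instance only:
#   "TBC[<refined object>] x#x<count>", entries separated by " // ".
_TBC_ENTRY = re.compile(r"TBC\[(?P<text>[^\]]+)\]\s*x#x(?P<count>\d+)")


def parse_tbc_refinements(modality: str) -> List[Tuple[str, int]]:
    """Return (refined_object, count) pairs found in a modality string.

    Anything that does not match the assumed TBC pattern is ignored.
    """
    return [
        (match.group("text"), int(match.group("count")))
        for match in _TBC_ENTRY.finditer(modality)
    ]


# Shortened sample taken from the start of the instance's modality value.
sample = "TBC[so long trunks] x#x2 // TBC[long trunks] x#x9 // TBC[big trunks] x#x6"
refinements = parse_tbc_refinements(sample)
```

Here `refinements` holds one pair per `TBC[...]` entry, so the counts can be used, for example, to pick the most frequently observed refinement of the object.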
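Data Fields
- subject: the subject of the triple
- predicate: the predicate of the triple
- object: the object of the triple
- modality: the modalities associated with the triple, with their counts. TBC means that the object can be further refined into the listed objects
- is_negative: 1 if the statement is negated
- score: the salience score assigned by a supervised scoring model
- local_sigma: the strict conditional probability of observing the (predicate, object) pair for the specific subject. In other words, it measures how unique the statement is. For example, local_sigma(lawyers, defend, serial_killers) = 1 and local_sigma(lawyers, make, money) = 0.01, even though both statements have a similar score of 0.99.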
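Because `score` and `local_sigma` capture different things, a consumer may want to filter on both. The sketch below illustrates this with the two lawyer statements from the `local_sigma` description. The `Triple` class, the `distinctive` helper, and the threshold values are hypothetical choices for illustration, not something the dataset prescribes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Triple:
    """Hypothetical container for a subset of the fields listed above."""
    subject: str
    predicate: str
    object: str
    score: float
    local_sigma: float


def distinctive(
    triples: Iterable[Triple],
    min_score: float = 0.9,   # assumed threshold, chosen for illustration
    min_sigma: float = 0.5,   # assumed threshold, chosen for illustration
) -> List[Triple]:
    """Keep statements that are both high-scoring and unique to their subject."""
    return [
        t for t in triples
        if t.score >= min_score and t.local_sigma >= min_sigma
    ]


# The two statements used as examples in the local_sigma description.
examples = [
    Triple("lawyers", "defend", "serial_killers", score=0.99, local_sigma=1.0),
    Triple("lawyers", "make", "money", score=0.99, local_sigma=0.01),
]
kept = distinctive(examples)
```

With these thresholds only the `defend serial_killers` statement is kept: both statements have the same score, but `make money` has a low `local_sigma`, so it is not treated as distinctive for lawyers.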
Dataset Creation
See the original paper for details.
Additional Information
Licensing Information
CC-BY 2.0
Citation Information
Romero et al., Commonsense Properties from Query Logs and Question Answering Forums, CIKM, 2019



