Paraphrase choice based on user traits
Source: DataCite Commons. Updated 2025-06-01; indexed 2024-07-25.
Download link:
https://figshare.com/articles/dataset/Paraphrase_choice_based_on_user_traits/1613525/2
Description:
PPDB paraphrase pairs and clusters with their associated usage scores across three user traits:
a. Gender: male or female
b. Age: <25 or >30
c. Occupational class: low or high
Contents:
frequencies.tar.gz - contains the raw frequency statistics for all phrases and each trait
pairs.tar.gz - contains files with pairwise usage scores for each trait
clusters.tar.gz - contains files with cluster usage scores for each trait
In the pairs and clusters files, negative values mark phrases that are more associated with females, the lower occupational class, and users over 30 years old.
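The sign convention above can be sketched as a small lookup. This is only an illustration, not code shipped with the dataset: the trait keys (`gender`, `age`, `occupation`) and the function name `associated_group` are hypothetical, and the mapping of positive values to the opposite group of each trait is an assumption inferred from the description, which states only the negative side.

```python
from typing import Optional

# Groups tied to NEGATIVE scores, as stated in the dataset description.
# The dictionary keys are hypothetical labels chosen for this sketch.
NEGATIVE_GROUP = {
    "gender": "female",
    "age": "over 30",
    "occupation": "low occupational class",
}

# ASSUMPTION: positive scores point to the other group of each trait.
# The description does not state this explicitly.
POSITIVE_GROUP = {
    "gender": "male",
    "age": "under 25",
    "occupation": "high occupational class",
}


def associated_group(trait: str, score: float) -> Optional[str]:
    """Return the group a usage score leans toward, or None for a zero score."""
    if trait not in NEGATIVE_GROUP:
        raise ValueError(f"unknown trait: {trait!r}")
    if score < 0:
        return NEGATIVE_GROUP[trait]
    if score > 0:
        return POSITIVE_GROUP[trait]
    return None
```

For example, `associated_group("gender", -0.4)` returns `"female"` under this convention. The actual column layout of the files in pairs.tar.gz and clusters.tar.gz is not described here, so reading the scores from disk is left out.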
If you use this dataset, please cite our work:
@inproceedings{paraphrase16aaai,
author = {Preo\c{t}iuc-Pietro, Daniel and Xu, Wei and Ungar, Lyle},
title = {{Discovering user attribute stylistic differences via paraphrasing}},
booktitle = {{Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence}},
series = {AAAI},
year = {2016}
}
Provider:
figshare
Created:
2016-01-20



