Datasets of Expert Recommendation in Community Question Answering

Source: DataCite Commons. Updated: 2025-04-27. Indexed: 2025-04-16.
Download link:
https://www.scidb.cn/detail?dataSetId=41c073025c7541b699bfaa132b771892
Description:
The Stack Exchange public dataset contains real interaction data from various question-answering communities and is commonly used in CQA expert recommendation. This study selects data from the Academia sub-community, an academic communication community whose main users are university teachers and graduate students. Data generated by users who provided at least one best answer from February 14, 2012 to December 31, 2019 are selected as the training dataset. The testing dataset consists of 1,920 questions from January 1, 2020 to September 25, 2022 that were answered by at least two users from the training dataset. Because the question an answer belongs to plays an important role in identifying the answer's topic, question information is added as contextual extension text for the answer when training the BERT-LLDA model. The question information includes the question title, question body, and question tags. This study combines the questioner ID and respondent ID in the Q&A data into a user pair and stores the pairs in a CSV file. To account for user quality in PageRank, this study proposes a user quality weight based on the proportion of votes and best answers.
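As a rough illustration of the user-pair and quality-weight ideas described above, the following Python sketch builds (questioner, respondent) pairs, writes them to CSV, and runs a quality-weighted PageRank. The description does not give the exact weight formula or the file schema, so the field names `questioner_id` and `respondent_id`, the mixing parameter `alpha`, the `quality_weight` formula, the edge direction (questioner to respondent), and the choice to inject the weight through the teleport vector are all assumptions made for illustration, not the study's method.

```python
import csv
import io
from collections import defaultdict


def build_user_pairs(records):
    """One (questioner_id, respondent_id) pair per answer; self-answers are skipped."""
    return [(r["questioner_id"], r["respondent_id"])
            for r in records
            if r["questioner_id"] != r["respondent_id"]]


def write_pairs_csv(pairs, handle):
    """Write the pairs to an open text handle with a header row (assumed column names)."""
    writer = csv.writer(handle)
    writer.writerow(["questioner_id", "respondent_id"])
    writer.writerows(pairs)


def quality_weight(votes, best, total_votes, total_best, alpha=0.5):
    """Hypothetical weight: a mix of the user's share of votes and of best answers."""
    vote_share = votes / total_votes if total_votes else 0.0
    best_share = best / total_best if total_best else 0.0
    return alpha * vote_share + (1 - alpha) * best_share


def weighted_pagerank(pairs, weights, damping=0.85, iterations=50):
    """PageRank whose teleport distribution is proportional to the quality weights."""
    nodes = sorted({user for pair in pairs for user in pair})
    out_links = defaultdict(list)
    for asker, answerer in pairs:
        out_links[asker].append(answerer)  # assumed direction: asker -> answerer

    total = sum(weights.get(n, 0.0) for n in nodes)
    if total > 0:
        teleport = {n: weights.get(n, 0.0) / total for n in nodes}
    else:
        teleport = {n: 1.0 / len(nodes) for n in nodes}  # fall back to uniform

    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is redistributed via teleport.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping + damping * dangling) * teleport[n] for n in nodes}
        for n in nodes:
            targets = out_links[n]
            for t in targets:
                new_rank[t] += damping * rank[n] / len(targets)
        rank = new_rank
    return rank


# Toy usage with made-up records; real data would come from the dataset files.
records = [
    {"questioner_id": 1, "respondent_id": 2},
    {"questioner_id": 1, "respondent_id": 3},
    {"questioner_id": 3, "respondent_id": 2},
]
pairs = build_user_pairs(records)
buffer = io.StringIO()
write_pairs_csv(pairs, buffer)
stats = {1: (0, 0), 2: (30, 3), 3: (10, 1)}  # user -> (votes, best answers)
weights = {u: quality_weight(v, b, 40, 4) for u, (v, b) in stats.items()}
ranks = weighted_pagerank(pairs, weights)
```

Putting the weight in the teleport vector keeps the scores a proper probability distribution; scaling edge weights by respondent quality would be an equally plausible reading of the description.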
Provider:
Science Data Bank
Created:
2023-11-10