
habedi/stack-exchange-dataset

Source: Hugging Face. Updated 2023-11-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/habedi/stack-exchange-dataset
Description:

---
license: cc
task_categories:
- text-classification
- question-answering
language:
- en
size_categories:
- 10K<n<100K
pretty_name: Stack Exchange -- Question Dataset
---

This dataset consists of three CSV files: 'cs.csv', 'ds.csv', and 'p.csv'. Each CSV file contains the questions asked on one Stack Exchange (SE) question-answering community, from the creation of that community until May 2021.

- 'cs.csv' --> [Computer Science SE](https://cs.stackexchange.com/)
- 'ds.csv' --> [Data Science SE](https://datascience.stackexchange.com/)
- 'p.csv' --> [Political Science SE](https://politics.stackexchange.com/)

Each CSV file has the following columns:

- `id`: the question id
- `title`: the title of the question
- `body`: the body or text of the question
- `tags`: the list of tags assigned to the question
- `label`: a label indicating whether the question is resolved or not (0: not resolved; 1: resolved)

The dataset was used in the following studies:

- [A deep learning-based approach for identifying unresolved questions on Stack Exchange Q&A communities through graph-based communication modelling](https://doi.org/10.1007/s41060-023-00454-0)
- [Survival analysis for user disengagement prediction: question-and-answering communities’ case](https://doi.org/10.1007/s13278-022-00914-8)
Provider:
habedi
Summary of original information

Dataset overview

Basic information

  • License: cc
  • Task categories:
    • Text classification
    • Question answering
  • Language: English
  • Dataset size: 10K<n<100K
  • Dataset name: Stack Exchange -- Question Dataset

Data files

  • File list:
    • cs.csv
    • ds.csv
    • p.csv
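
When the three files are combined, it helps to record which community each row came from. The following is a minimal sketch, assuming the CSV files have already been downloaded into one local directory and are UTF-8 encoded; the `COMMUNITIES` mapping and the `read_all` helper are hypothetical names used for illustration and are not part of the dataset.

```python
import csv
from pathlib import Path

# File name -> community, as listed in the dataset description.
COMMUNITIES = {
    "cs.csv": "Computer Science SE",
    "ds.csv": "Data Science SE",
    "p.csv": "Political Science SE",
}


def read_all(data_dir):
    """Read every known file found in data_dir and tag each row with its community.

    Assumes the files sit directly in data_dir and are UTF-8 encoded;
    files that are not present are skipped.
    """
    rows = []
    for filename, community in COMMUNITIES.items():
        path = Path(data_dir) / filename
        if not path.exists():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["community"] = community  # extra key added by this sketch
                rows.append(row)
    return rows
```

Each returned row is a dictionary keyed by the CSV header plus the added `community` key, so rows from different files can be mixed without losing their origin.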

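Data content

Data structure

  • Columns:
    • id: the question id
    • title: the title of the question
    • body: the body or text of the question
    • tags: the list of tags assigned to the question
    • label: whether the question is resolved (0: not resolved; 1: resolved)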
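
The column layout above can be checked before any modelling. The sketch below is one possible way to do so, using only the Python standard library; it assumes that `label` is stored as the text `0` or `1`, which the description implies but does not state. The helper names `parse_questions` and `resolved_share` are hypothetical, and the sample rows are made up purely to show the shape of the data.

```python
import csv
import io

EXPECTED_COLUMNS = ["id", "title", "body", "tags", "label"]


def parse_questions(handle):
    """Parse rows from an open CSV file and convert `label` to an int (0 or 1)."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        row["label"] = int(row["label"])  # assumption: stored as "0" or "1"
        rows.append(row)
    return rows


def resolved_share(rows):
    """Fraction of questions with label 1 (resolved); 0.0 for an empty input."""
    if not rows:
        return 0.0
    return sum(row["label"] for row in rows) / len(rows)


# Tiny made-up sample, only to show the expected shape of the data.
sample = io.StringIO(
    "id,title,body,tags,label\n"
    "1,First title,First body,tag-a,1\n"
    "2,Second title,Second body,tag-b,0\n"
)
questions = parse_questions(sample)
print(resolved_share(questions))  # 0.5
```

The `tags` column is left as a raw string here, because the description does not say how the tag list is serialized inside the CSV.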

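Research applications

  • A deep learning-based approach for identifying unresolved questions on Stack Exchange Q&A communities through graph-based communication modelling (https://doi.org/10.1007/s41060-023-00454-0)
  • Survival analysis for user disengagement prediction: question-and-answering communities’ case (https://doi.org/10.1007/s13278-022-00914-8)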
