
Python’s Evolution on Stack Overflow: An Empirical Analysis of Topic Trends

Source: Figshare. Updated: 2025-04-01. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Raw_Data/28703654/2
Resource description:
This is the experiment data for the thesis "Python's Evolution on Stack Overflow: An Empirical Analysis of Topic Trends". The dataset is derived from the open-source "Stack Exchange Data Dump", which shares the latest post information monthly. The data downloaded for this study was published on September 5, 2023.

Raw Data
This is the raw data from the "Stack Exchange Data Dump". The only change is that it has been separated by year. Original data source: https://archive.org/details/stackexchange

Training Data
Each set contains two files, Training Set and Rater. The Training Set file holds the content of the training set, i.e., the file on which the model is directly trained; its Grade column is the result of the machine's prediction. The Rater file contains the manual scoring data, with the following sections:
1. The rating data for each rater, in columns named Rater-<rater number> (e.g. Rater-1)
2. The machine-predicted rating, Predicted-Grade
3. The average of the manual ratings (after removing the highest and lowest)

Predicted Data (xlsx)
These files hold the results for each year after the heat calculation. Some column names are the original column names from the xml file; others were added to better predict the heat value:
BodyLength: The length of the post content.
CreateDate: The number of days from the 'CreationDate' column to September 30, 2023 (i.e., the last day of the month in which the dataset was published). Counting to that date keeps any post's value from being converted to zero.
ActiveDate: The number of days from the 'LastActivityDate' column to September 30, 2023, computed the same way and for the same reason.
Score-1: The total 'Score' divided by 'CreateDate' (i.e., the score earned per day).
Predicted_Grade: The grade predicted for each post.
GradeScore: The value of each grade.
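
The derived columns and the rater average described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's actual code: the function names (`days_until_reference`, `derive_features`, `trimmed_mean`) are hypothetical, BodyLength is assumed to be the character count of the raw 'Body' field, timestamps are assumed to be in the ISO format used by the dump's xml files, and "removing the highest and lowest" is assumed to mean dropping exactly one rating from each end.

```python
from datetime import date, datetime

# Reference date given in the description: the last day of the month
# in which the data dump was published.
REFERENCE_DATE = date(2023, 9, 30)


def days_until_reference(timestamp):
    """Number of days from an ISO timestamp to REFERENCE_DATE."""
    return (REFERENCE_DATE - datetime.fromisoformat(timestamp).date()).days


def derive_features(post):
    """Compute the added columns for one post (hypothetical helper).

    `post` is a dict holding the original xml attributes
    'Body', 'Score', 'CreationDate' and 'LastActivityDate'.
    """
    create_days = days_until_reference(post["CreationDate"])
    return {
        # Assumption: length is measured in characters of the raw Body.
        "BodyLength": len(post["Body"]),
        "CreateDate": create_days,
        "ActiveDate": days_until_reference(post["LastActivityDate"]),
        # Score earned per day; create_days is positive for any post
        # created before the reference date.
        "Score-1": post["Score"] / create_days,
    }


def trimmed_mean(ratings):
    """Average of manual ratings after dropping one highest and one lowest."""
    if len(ratings) < 3:
        raise ValueError("need at least three ratings to trim both ends")
    kept = sorted(ratings)[1:-1]
    return sum(kept) / len(kept)
```

Because the dump was published on September 5, 2023, every post predates the reference date, so the division in Score-1 never hits zero in this sketch.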
Provided by:
Fengqi, Hu
Created:
2025-04-01