Python’s Evolution on Stack Overflow: An Empirical Analysis of Topic Trends
Source: Figshare. Updated: 2025-04-01. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Raw_Data/28703654/2
Description:
This is the experimental data for the thesis "Python's Evolution on Stack Overflow: An Empirical Analysis of Topic Trends". The dataset comes from the open "Stack Exchange Data Dump", which shares the latest post information monthly. The data downloaded for this study was published on September 5, 2023.

Raw Data
This is the raw data of the "Stack Exchange Data Dump", separated by year and otherwise unchanged. The original data source is: https://archive.org/details/stackexchange

Training Data
Each set contains two files, Training Set and Rater. Training Set holds the content of the training set, i.e., the file on which the model is directly trained; its Grade column is the result of the machine's prediction. The Rater file holds the manual scoring data and contains the following sections:
1. The rating data for each rater, in a column named Rater-<rater number> (e.g. Rater-1)
2. The machine-predicted rating, Predicted-Grade
3. The average of the manual ratings (after removing the highest and lowest)

Predicted Data (xlsx)
These files contain the results for each year after the heat calculation. Some column names are the original column names from the xml file; others were added to better predict the heat value:
BodyLength: The length of the post content.
CreateDate: The number of days from the 'CreationDate' column to September 30, 2023 (i.e., the last day of the month in which the dataset was published). This end date was chosen so that no post's date is converted to zero.
ActiveDate: The number of days from the 'LastActivityDate' column to September 30, 2023 (i.e., the last day of the month in which the dataset was published). This end date was chosen so that no post's date is converted to zero.
Score-1: The total 'Score' divided by 'CreateDate' (i.e., the score earned per day).
Predicted_Grade: The grade predicted for each post.
GradeScore: The value of each grade.
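The derived columns and the rater average described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's actual pipeline: the function names (`days_until_reference`, `derive_features`, `trimmed_mean`) are hypothetical, BodyLength is assumed to be a character count of the raw post body, timestamps are assumed to start with an ISO `YYYY-MM-DD` date, and "removing the highest and lowest" is assumed to mean dropping exactly one of each.

```python
from datetime import date

# Reference date stated in the description: the last day of the month
# in which the data dump was published.
REFERENCE_DATE = date(2023, 9, 30)


def days_until_reference(timestamp: str) -> int:
    """Number of days from a timestamp to the reference date.

    Assumption: the timestamp begins with an ISO date (YYYY-MM-DD);
    any time-of-day part is ignored.
    """
    return (REFERENCE_DATE - date.fromisoformat(timestamp[:10])).days


def derive_features(body: str, creation_date: str,
                    last_activity_date: str, score: int) -> dict:
    """Build the added columns for one post (hypothetical helper)."""
    create_days = days_until_reference(creation_date)
    active_days = days_until_reference(last_activity_date)
    return {
        "BodyLength": len(body),  # assumption: length in characters
        "CreateDate": create_days,
        "ActiveDate": active_days,
        # Score per day; create_days is positive for any post created
        # before the reference date, so the division is well defined.
        "Score-1": score / create_days,
    }


def trimmed_mean(ratings: list) -> float:
    """Average of manual ratings after dropping one highest and one lowest.

    Assumption: exactly one value is removed from each end.
    """
    if len(ratings) < 3:
        raise ValueError("need at least three ratings to trim both ends")
    kept = sorted(ratings)[1:-1]
    return sum(kept) / len(kept)


example = derive_features(
    body="<p>hello</p>",
    creation_date="2023-09-05T10:00:00.000",
    last_activity_date="2023-09-20T08:30:00.000",
    score=50,
)
```

For the example post, created on the dump's publication date, CreateDate is 25 days, which shows why the end-of-month reference date keeps the divisor in Score-1 away from zero.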
Provided by:
Fengqi, Hu
Created:
2025-04-01



