Novel Libraries in Stack Overflow Posts
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14186438
Resource description:
# Summary
We present datasets detailing the appearance of novel libraries and library pairs in Stack Overflow posts in 12 languages between 2008 and 2023.
# Disclaimer
Pairs of libraries are displayed in the canonical format `lib_a|lib_b`, where lib_a precedes lib_b in alphabetical order.
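
As an illustration of the canonical pair format, here is a minimal sketch of how such a key could be built. The helper name `pair_key` is hypothetical, and plain case-sensitive string comparison is an assumption; the project's own code may normalize names differently.

```python
def pair_key(lib_x: str, lib_y: str) -> str:
    """Join two library names with '|' so that the alphabetically first one leads."""
    # Assumption: ordinary string ordering; case handling in the dataset may differ.
    lib_a, lib_b = sorted((lib_x, lib_y))
    return f"{lib_a}|{lib_b}"


print(pair_key("sinatra", "mongo_mapper"))
print(pair_key("redis", "rubygems"))
```

Because the key is order-independent, the same pair always maps to the same dictionary entry regardless of the order in which the imports appear in a post.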
Some of the examples are truncated for better readability.
GitHub source of the project: https://github.com/MeszarosGabor/SO_Post_Analyzer
# Descriptions
## ``<language>``/all_``<language>``_so_posts.jsonl
JSONL file that contains the raw extracted Stack Overflow fields. Each line is a JSON object with a single key, the post_id, whose value is a list of the following fields:
- post_type: 1 for question and 2 for answer
- accepted_answer_id
- date_posted
- score
- view_count
- code_snippets
- post_length
- poster_id
- last_activity
- tags
- number of comments
- number of answers
- parent id
Example:
```
{"72": ["1", "", "2008-08-01T13:38:27.133", "48", "2148", "I want to format my existing comments as 'RDoc comments' so they can be viewed using ri.\n\nWhat are some recommended resources for starting out using RDoc?\n", "25", "2016-12-30T06:56:18.310", "", "1", "2", ""]}
```
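
A single line of this file can be read with the standard `json` module. The sketch below parses a shortened copy of the example record and names only the leading list positions, which match the example; treating the remaining positions by index is left out on purpose, since the example is truncated.

```python
import json

# One JSONL line in the shape shown above; the post text is shortened here.
line = (
    '{"72": ["1", "", "2008-08-01T13:38:27.133", "48", "2148", '
    '"I want to format my existing comments ...", "25", '
    '"2016-12-30T06:56:18.310", "", "1", "2", ""]}'
)

record = json.loads(line)

# Each object holds exactly one post_id -> list-of-fields entry.
post_id, fields = next(iter(record.items()))

# Assumption: only the first five positions are interpreted here.
post_type, accepted_answer_id, date_posted, score, view_count = fields[:5]
is_question = post_type == "1"  # 1 = question, 2 = answer

print(post_id, date_posted, int(score), int(view_count), is_question)
```

All values are stored as strings, so numeric fields such as the score need an explicit conversion before any arithmetic.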
## ``<language>``/``<language>``_all_libs_dates.json
JSON file that lists the dates (with multiplicity, one for every post) when an individual library was mentioned in a post.
Example:
```
'FileUtils': ['2011-06-09',
'2011-07-01',
'2011-11-20',
'2011-11-20',
...
'2013-09-04',
'2020-05-08',
'2021-02-25']
```
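
Since every mention contributes one date, the list can be aggregated directly. The following sketch counts mentions per year; the sample holds only the dates visible in the example above (the elided ones are not guessed), and the helper name `mentions_per_year` is hypothetical.

```python
from collections import Counter

# Only the dates visible in the example; the elided entries are omitted.
libs_dates = {
    "FileUtils": [
        "2011-06-09", "2011-07-01", "2011-11-20", "2011-11-20",
        "2013-09-04", "2020-05-08", "2021-02-25",
    ],
}


def mentions_per_year(dates):
    """Count mentions per calendar year from ISO-formatted date strings."""
    return Counter(d[:4] for d in dates)


per_year = mentions_per_year(libs_dates["FileUtils"])
print(sorted(per_year.items()))
```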
## ``<language>``/``<language>``_all_pairs_dates.json
JSON file that lists the dates (with multiplicity, one for every post) when a pair of libraries was mentioned in a post.
Example:
```
'mongo_mapper|sinatra': ['2010-09-12',
'2011-12-30',
'2012-02-23',
'2012-09-04'],
```
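
A date list of this kind is enough to recover both a co-occurrence count and a first co-appearance date. The sketch below does so for the example pair; it assumes library names do not themselves contain `|`, and it is not necessarily how the count and first-date files were produced. The function name `summarize_pairs` is hypothetical.

```python
pairs_dates = {
    "mongo_mapper|sinatra": ["2010-09-12", "2011-12-30", "2012-02-23", "2012-09-04"],
}


def summarize_pairs(pairs_dates):
    """Derive the two library names, the mention count and the earliest date per pair."""
    summary = {}
    for key, dates in pairs_dates.items():
        # Assumption: '|' appears only as the separator between the two names.
        lib_a, lib_b = key.split("|")
        summary[key] = {
            "libs": (lib_a, lib_b),
            "count": len(dates),
            # ISO dates sort correctly as plain strings.
            "first_date": min(dates),
        }
    return summary


summary = summarize_pairs(pairs_dates)
print(summary["mongo_mapper|sinatra"])
```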
## ``<language>``/``<language>``_libs_count.json
JSON file that lists the occurrence count of the individual libraries.
Example:
```
{
'cairo': 4,
'pango': 2,
'radix': 1,
}
```
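
One simple use of these counts is ranking libraries by frequency. A minimal sketch using the three entries from the example:

```python
from collections import Counter

libs_count = {"cairo": 4, "pango": 2, "radix": 1}

counts = Counter(libs_count)
top_two = counts.most_common(2)        # most frequent libraries first
total_mentions = sum(counts.values())  # total occurrences across all libraries

print(top_two, total_mentions)
```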
## ``<language>``/``<language>``_pairs_count.json
JSON file that lists the co-occurrence count of the pairs of libraries.
Example:
```
'mongo_mapper|sinatra': 4,
'fileutils|getoptlong': 1,
'redis|rubygems': 24,
```
## ``<language>``/``<language>``_libs_first_dates.json
JSON file that lists the dates of the first appearances of individual libraries alongside the post id and poster id.
Example:
```
{
'cairo': {'id': '6242589', 'poster_id': '784674', 'date': '2011-06-05'},
}
```
## ``<language>``/``<language>``_pairs_first_dates.json
JSON file that lists the dates of the first co-appearances of pairs of libraries alongside the post id and poster id.
Example:
```
'rubygems|server': {'id': '3748309',
'poster_id': '262808',
'date': '2010-09-20'}
```
## ``/``_``_code_count_list.json
JSON file that contains a single list giving the number of libraries per post, in chronological order, restricted to posts that contain *at least one* library import.
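
A flat list of per-post library counts lends itself to simple distribution statistics. The sketch below uses hypothetical values, since no example of this file is given; every entry is at least 1 because posts without imports are excluded.

```python
from collections import Counter
from statistics import mean

# Hypothetical values for illustration only, not taken from the dataset.
code_count_list = [1, 2, 1, 3, 1, 2]

distribution = Counter(code_count_list)  # how many posts import k libraries
average = mean(code_count_list)          # mean number of libraries per post

print(sorted(distribution.items()), round(average, 2))
```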
## ``<language>``/``<language>``_daily_post_stats.json
JSON file that counts the number of posts on a given day, listed chronologically, containing dates *with at least one post*. Dictionary of key=date value=count(int) pairs.
Example:
```
{...
'2011-09-03': 6,
'2011-09-04': 3,
'2011-09-05': 10,
'2011-09-06': 5,
'2011-09-07': 15,
...}
```
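
Because days without posts are absent from this file, a continuous time series needs the gaps filled in. The sketch below shows one way to do that; the sample reuses the example values with one day removed to create a gap (a hypothetical modification), and `fill_missing_days` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Example values from above, with 2011-09-05 removed to create a gap (hypothetical).
daily = {"2011-09-03": 6, "2011-09-04": 3, "2011-09-06": 5, "2011-09-07": 15}


def fill_missing_days(daily):
    """Return a dict covering every day from the first to the last date, 0 where absent."""
    days = sorted(date.fromisoformat(d) for d in daily)
    filled = {}
    current = days[0]
    while current <= days[-1]:
        key = current.isoformat()
        filled[key] = daily.get(key, 0)
        current += timedelta(days=1)
    return filled


filled = fill_missing_days(daily)
print(filled)
```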
## ``/``_``_post_stats.json
JSON file that lists the individual post metadata (sorted by post date).
Fields:
- post id
- post type
- list of imports
- post date
- poster id
- score
Example:
```
{'id': '1892176',
'post_type': '1',
'imports': ['mechanize', 'rubygems'],
'date': '2009-12-12T03:31:43.823',
'poster_id': '124685',
'score': '5'},
```
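
The values in these records are strings, so a small conversion step is useful before analysis. A minimal sketch on the example record, with a hypothetical helper `normalize_post`; keeping `poster_id` as a string is a design choice here, not something the dataset prescribes.

```python
from datetime import datetime

raw_post = {
    "id": "1892176",
    "post_type": "1",
    "imports": ["mechanize", "rubygems"],
    "date": "2009-12-12T03:31:43.823",
    "poster_id": "124685",
    "score": "5",
}


def normalize_post(post):
    """Convert string fields of one post record into typed values."""
    return {
        "id": int(post["id"]),
        "is_question": post["post_type"] == "1",  # 1 = question, 2 = answer
        "imports": list(post["imports"]),
        "date": datetime.fromisoformat(post["date"]),
        "poster_id": post["poster_id"],
        "score": int(post["score"]),
    }


post = normalize_post(raw_post)
print(post["date"].year, post["score"], post["is_question"])
```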
## ``<language>``/``<language>``_time_based_new.jsonl
JSONL file that contains JSON objects (in chronological order) detailing post metadata.
Fields:
- post id
- post date
- poster id (user id)
- post type
- list of imports
- list of novel libraries in post
- list of novel pairs in post
Example:
```
{'post_id': '3543',
'post_date': '2008-08-06T15:24:00.787',
'user_id': '399',
'post_type': '2',
'imports': ['metric_fetcher', 'rake'],
'new_libs': ['metric_fetcher', 'rake'],
'new_pairs': ['metric_fetcher|rake']}
```
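
The novelty fields can be understood as the result of a single chronological pass that remembers which libraries and pairs have already been seen. The following is a sketch of that idea under stated assumptions, not necessarily the authors' implementation: the function name `annotate_novelty` is hypothetical, the first post is the example above, and the second post is invented for illustration.

```python
from itertools import combinations


def annotate_novelty(posts):
    """Yield posts (already in chronological order) with new_libs and new_pairs added."""
    seen_libs, seen_pairs = set(), set()
    for post in posts:
        libs = sorted(set(post["imports"]))
        # combinations() over a sorted list yields pairs in alphabetical order.
        pairs = [f"{a}|{b}" for a, b in combinations(libs, 2)]
        new_libs = [lib for lib in libs if lib not in seen_libs]
        new_pairs = [pair for pair in pairs if pair not in seen_pairs]
        seen_libs.update(libs)
        seen_pairs.update(pairs)
        yield {**post, "new_libs": new_libs, "new_pairs": new_pairs}


posts = [
    {"post_id": "3543", "imports": ["metric_fetcher", "rake"]},
    # Hypothetical later post, not taken from the dataset.
    {"post_id": "9999", "imports": ["rubygems", "rake"]},
]

annotated = list(annotate_novelty(posts))
for entry in annotated:
    print(entry["post_id"], entry["new_libs"], entry["new_pairs"])
```

In the second post `rake` is no longer novel, but its combination with `rubygems` is, which shows that a novel pair does not require both libraries to be new.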
## ``<language>``/``<language>``_user_to_posts.json
JSON file that lists the post ids corresponding to a given user id. Keyed by user id; each value is a list of post ids.
Example:
```
'303675': ['2941479'],
'348325': ['2945141', '2956990', '2968924', '3832703'],
'325477': ['2945228'],
'27196': ['2949100', '3177217'],
```
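
This mapping supports per-user activity counts and, when inverted, a lookup from post id to user id. A minimal sketch on the example entries; it assumes each post id belongs to exactly one user.

```python
user_to_posts = {
    "303675": ["2941479"],
    "348325": ["2945141", "2956990", "2968924", "3832703"],
    "325477": ["2945228"],
    "27196": ["2949100", "3177217"],
}

# Number of posts per user.
posts_per_user = {user: len(posts) for user, posts in user_to_posts.items()}

# Inverted index: post id -> user id (assumes one author per post).
post_to_user = {
    post_id: user for user, posts in user_to_posts.items() for post_id in posts
}

most_active = max(posts_per_user, key=posts_per_user.get)
print(most_active, posts_per_user[most_active])
```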
Created: 2024-11-20



