
Novel Libraries in Stack Overflow Posts

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link: https://zenodo.org/record/14186438
Description:
# Summary

We present datasets detailing the appearance of novel libraries and library pairs in Stack Overflow posts in 12 languages between 2008 and 2023.

# Disclaimer

Pairs of libraries are displayed in the canonical format `lib_a|lib_b`, where lib_a precedes lib_b in alphabetical order. Some of the examples are truncated for better readability.

GitHub source of the project: https://github.com/MeszarosGabor/SO_Post_Analyzer

# Descriptions

## ``/all_``_so_posts.jsonl

JSONL file that contains the raw extracted Stack Overflow fields. Each JSON object maps a post id (the key) to the following values:

- post_type: 1 for question and 2 for answer
- accepted_answer_id
- date_posted
- score
- view_count
- code_snippets
- post_length
- poster_id
- last_activity
- tags
- number of comments
- number of answers
- parent id

Example:

```
{"72": ["1", "", "2008-08-01T13:38:27.133", "48", "2148", "I want to format my existing comments as 'RDoc comments' so they can be viewed using ri.\n\nWhat are some recommended resources for starting out using RDoc?\n", "25", "2016-12-30T06:56:18.310", "", "1", "2", ""]}
```

## ``/``_all_libs_dates.json

JSON file that lists the dates on which an individual library was mentioned in a post. Dates are listed with multiplicity, one entry for every post.

Example:

```
'FileUtils': ['2011-06-09', '2011-07-01', '2011-11-20', '2011-11-20', ... '2013-09-04', '2020-05-08', '2021-02-25']
```

## ``/``_all_pairs_dates.json

JSON file that lists the dates on which a pair of libraries was mentioned in a post. Dates are listed with multiplicity, one entry for every post.

Example:

```
'mongo_mapper|sinatra': ['2010-09-12', '2011-12-30', '2012-02-23', '2012-09-04'],
```

## ``/``_libs_count.json

JSON file that lists the occurrence count of the individual libraries.

Example:

```
{ 'cairo': 4, 'pango': 2, 'radix': 1, }
```

## ``/``_pairs_count.json

JSON file that lists the co-occurrence count of the pairs of libraries.

Example:

```
'mongo_mapper|sinatra': 4, 'fileutils|getoptlong': 1, 'redis|rubygems': 24,
```

## ``/``_libs_first_dates.json

JSON file that lists the date of the first appearance of each individual library, alongside the post id and poster id.

Example:

```
{ 'cairo': {'id': '6242589', 'poster_id': '784674', 'date': '2011-06-05'}, }
```

## ``/``_pairs_first_dates.json

JSON file that lists the date of the first co-appearance of each pair of libraries, alongside the post id and poster id.

Example:

```
'rubygems|server': {'id': '3748309', 'poster_id': '262808', 'date': '2010-09-20'
```

## ``/``_``_code_count_list.json

JSON file that contains a single list of library counts, one per post, in chronological order. Only posts that contain *at least one* library import are included.

## ``/``_daily_post_stats.json

JSON file that counts the number of posts on a given day, listed chronologically. Only dates *with at least one post* are included. The file is a dictionary of key=date, value=count (int) pairs.

Example:

```
{... '2011-09-03': 6, '2011-09-04': 3, '2011-09-05': 10, '2011-09-06': 5, '2011-09-07': 15, ...}
```

## ``/``_``_post_stats.json

JSON file that lists the individual post metadata (sorted by post date).

Fields:

- post id
- post type
- list of imports
- post date
- poster id
- score

Example:

```
{'id': '1892176', 'post_type': '1', 'imports': ['mechanize', 'rubygems'], 'date': '2009-12-12T03:31:43.823', 'poster_id': '124685', 'score': '5'},
```

## ``/``_time_based_new.jsonl

JSONL file that contains JSON objects (in chronological order) detailing post metadata.

Fields:

- post id
- post date
- poster id (user id)
- post type
- list of imports
- list of novel libraries in post
- list of novel pairs in post

Example:

```
{'post_id': '3543', 'post_date': '2008-08-06T15:24:00.787', 'user_id': '399', 'post_type': '2', 'imports': ['metric_fetcher', 'rake'], 'new_libs': ['metric_fetcher', 'rake'], 'new_pairs': ['metric_fetcher|rake']}
```

## ``/``_user_to_posts.json

JSON file that lists the post ids corresponding to a given user id. Keys are user ids; values are lists of post ids.

Example:

```
'303675': ['2941479'], '348325': ['2945141', '2956990', '2968924', '3832703'], '325477': ['2945228'], '27196': ['2949100', '3177217'],
```
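
To make the `new_libs` and `new_pairs` fields more concrete, the following is a minimal Python sketch of how novelty could be derived from a chronological stream of posts. It is an illustration under stated assumptions, not the project's actual implementation: the helper names `pair_key` and `annotate_novelty` are hypothetical, "alphabetical ordering" is approximated with Python's default string sort, and the second sample post is invented for the example (the first reuses the values from the `time_based_new.jsonl` example above).

```python
from itertools import combinations


def pair_key(lib_a: str, lib_b: str) -> str:
    """Join two library names in sorted order, separated by '|'.

    Assumption: Python's default string ordering stands in for the
    dataset's "alphabetical ordering".
    """
    first, second = sorted((lib_a, lib_b))
    return f"{first}|{second}"


def annotate_novelty(posts):
    """Yield each post with 'new_libs' and 'new_pairs' added.

    `posts` must already be in chronological order, and each post must be
    a dict with an 'imports' list. A library or pair counts as novel the
    first time it is seen in the stream.
    """
    seen_libs = set()
    seen_pairs = set()
    for post in posts:
        imports = sorted(set(post["imports"]))
        pairs = [pair_key(a, b) for a, b in combinations(imports, 2)]
        new_libs = [lib for lib in imports if lib not in seen_libs]
        new_pairs = [pair for pair in pairs if pair not in seen_pairs]
        seen_libs.update(imports)
        seen_pairs.update(pairs)
        yield {**post, "new_libs": new_libs, "new_pairs": new_pairs}


posts = [
    # Values taken from the time_based_new.jsonl example.
    {"post_id": "3543", "post_date": "2008-08-06T15:24:00.787",
     "imports": ["metric_fetcher", "rake"]},
    # Hypothetical follow-up post, invented for illustration only.
    {"post_id": "hypothetical-1", "post_date": "2008-08-07T00:00:00.000",
     "imports": ["rubygems", "rake"]},
]

for annotated in annotate_novelty(posts):
    print(annotated["post_id"], annotated["new_libs"], annotated["new_pairs"])
```

In this sketch the second post contributes only `rubygems` as a novel library, because `rake` was already seen, while the pair `rake|rubygems` is still novel. Keeping two `set` objects makes each membership check constant time, which matters when streaming through many posts.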
Created: 2024-11-20