
Dataset relating to the study "Open government data: usage trends and metadata quality"

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4054742
Description:
Open Government Data (OGD) has the potential to support social and economic progress. However, this potential can be frustrated if the data remains unused. Although the literature suggests that the metadata quality of OGD datasets is one of the main factors affecting their use, to the best of our knowledge no quantitative study has provided evidence of this relationship. Considering about 400,000 datasets from 28 national, municipal, and international OGD portals, we have programmatically analyzed their usage, their metadata quality, and the relationship between the two. Our analysis has highlighted three main findings. First, regardless of their size, the software platform adopted, and their administrative and territorial coverage, most OGD datasets are underutilized. Second, OGD portals pay varying attention to the quality of their datasets' metadata. Third, we did not find clear evidence that dataset usage is positively correlated with better metadata publishing practices. Finally, we have considered other factors, such as dataset category and some demographic characteristics of the OGD portals, and analyzed their relationship with dataset usage, obtaining partially affirmative answers.

The dataset consists of three zipped CSV files containing the collected usage data, full metadata, and computed quality values for about 400,000 datasets belonging to the 8 national, 4 international, and 16 US municipal OGD portals considered in the study. Data collection took place between 2019-12-19 and 2019-12-23.
________________________________________________
Portal                  #Datasets   Platform
________________________________________________
US                        261,514   CKAN
France                     39,412   Other
Colombia                    9,795   Socrata
IE                          9,598   CKAN
Slovenia                    4,892   CKAN
Poland                      1,032   Other
Latvia                        336   CKAN
Puerto Rico                   178   Socrata

New York, NY                2,771   Socrata
Baltimore, MD               2,617   Socrata
Austin, TX                  2,353   Socrata
Chicago, IL                 1,368   Socrata
San Francisco, CA           1,001   Socrata
Dallas, TX                  1,001   Socrata
Los Angeles, CA               943   Socrata
Seattle, WA                   718   Socrata
Providence, RI                288   Socrata
Honolulu, HI                  244   Socrata
New Orleans, LA               215   Socrata
Buffalo, NY                   213   Socrata
Nashville, TN                 172   Socrata
Boston, MA                    170   CKAN
Albuquerque, NM                60   CKAN
Albany, NY                     50   Socrata

HDX                        17,325   CKAN
EUODP                      14,058   CKAN
NASA                        9,664   Socrata
World Bank Finances         2,177   Socrata
________________________________________________

The three datasets share the same table structure:

Table Fields
portalid: portal identifier
id: dataset identifier
engine: identifier of the supporting portal platform: 1 (CKAN), 2 (Socrata)
admindomain: 1 (National), 2 (US), 3 (International)
downloaddate: date of data collection
views: number of total views for the dataset
downloads: number of total downloads for the dataset
overallq: overall quality value computed by applying the methodology presented by Neumaier et al. in [1]
qvalues: JSON object containing the quality values computed for the 17 metrics presented by Neumaier et al. [1]
assessdate: date of quality assessment
metadata: the dataset's overall metadata, downloaded via API from the portal according to the supporting platform schema

[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1-2:29. doi:10.1145/2964909
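As an illustration of how the fields above might be used, the following sketch reads rows in the described structure, decodes the qvalues JSON object, computes the share of datasets with zero views, and computes a rank correlation between views and overallq. It makes several assumptions: the inline sample data is invented, the metric key "m1" is hypothetical, the CSV dialect of the real files has not been verified, and Spearman's coefficient is chosen here for illustration only (the description does not state which correlation measure the study used).

```python
import csv
import io
import json

# Invented sample rows using the field names from the description.
# The metric key "m1" is hypothetical; the real qvalues hold 17 metrics.
SAMPLE = '''portalid,id,views,downloads,overallq,qvalues
1,a,10,2,0.50,"{""m1"": 0.4}"
1,b,0,0,0.70,"{""m1"": 0.9}"
2,c,25,5,0.60,"{""m1"": 0.5}"
2,d,3,1,0.40,"{""m1"": 0.2}"
'''


def load_rows(text):
    """Parse CSV text and convert the numeric and JSON fields."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "portalid": raw["portalid"],
            "id": raw["id"],
            "views": int(raw["views"]),
            "downloads": int(raw["downloads"]),
            "overallq": float(raw["overallq"]),
            "qvalues": json.loads(raw["qvalues"]),
        })
    return rows


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


rows = load_rows(SAMPLE)
unused_share = sum(1 for r in rows if r["views"] == 0) / len(rows)
rho = spearman([r["views"] for r in rows], [r["overallq"] for r in rows])
print(f"share of datasets with zero views: {unused_share:.2f}")
print(f"Spearman correlation (views vs overallq): {rho:.2f}")
```

To apply the same steps to the real files, the sample string would be replaced by the text of an unzipped CSV; only the standard library is used, so no additional packages are needed.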
Created:
2021-10-08