Appearance of Sociologická encyklopedie and its printed original works on Czech Wikipedia
Download link:
https://zenodo.org/record/3951305
Description:
This dataset contains data about references to the Encyclopedia of Sociology (Sociologická encyklopedie, https://encyklopedie.soc.cas.cz/, available in Czech only) from articles on Czech Wikipedia in the period from 2020-03-12 to 2020-08-31. The data was harvested periodically for further analysis.
This dataset was produced as part of the research project leading to the following master's thesis:
Rožek, Š. A Verified Knowledge Source and Wikipedia. [Master’s Thesis] Charles University, Faculty of Arts, Institute of Information Studies and Librarianship : Prague CZ, 2020.
A further description of the thesis (in Czech) is available at: https://is.cuni.cz/studium/dipl_st/index.php?do=main&doo=detail&did=211426
This dataset consists of a single ZIP archive that contains a separate folder for each day of the measurement period. Each folder contains files recording backlinks from Czech Wikipedia to the Sociologická encyklopedie, occurrences of its Czech title on Czech Wikipedia, and occurrences of the titles of the printed original works that the online encyclopedia incorporates. Specifically, each folder contains JSON and CSV files covering the following measurements:
Backlinks from Czech Wikipedia to the Sociologická encyklopedie (the ext_usage.{json|csv} and ext_usage_http.{json|csv} files)
Occurrences of the text string "Sociologická encyklopedie" on Czech Wikipedia
Occurrences of the text string "Velký sociologický slovník" on Czech Wikipedia
Occurrences of the text string "Malý sociologický slovník" on Czech Wikipedia
Occurrences of the text string "Slovník českých sociologů" on Czech Wikipedia
Occurrences of the text string "Slovník sociologického zázemí české sociologie" on Czech Wikipedia
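As a rough illustration of how measurements like these can be requested from the MediaWiki API, the following Python sketch builds the query URLs only and sends nothing over the network. It assumes that the backlinks correspond to the API's list=exturlusage module and the text string occurrences to list=search; this is inferred from the file names and is not confirmed by the description. The helper names, the limits, and the choice of Python (the original script is written in R) are illustrative assumptions.

```python
from urllib.parse import urlencode

# Czech Wikipedia API endpoint.
API_ENDPOINT = "https://cs.wikipedia.org/w/api.php"


def backlinks_params(domain, protocol="https", limit=500):
    """Parameters for list=exturlusage: pages that link to an external domain.

    Assumption: this module matches the ext_usage files; not confirmed.
    """
    return {
        "action": "query",
        "list": "exturlusage",
        "euquery": domain,
        "euprotocol": protocol,
        "eulimit": limit,
        "format": "json",
    }


def phrase_search_params(phrase, limit=50):
    """Parameters for list=search: full-text search for an exact phrase."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": f'"{phrase}"',  # quotes request an exact phrase match
        "srprop": "snippet",
        "srlimit": limit,
        "format": "json",
    }


def build_url(params):
    """Combine the endpoint with URL-encoded query parameters."""
    return API_ENDPOINT + "?" + urlencode(params)


print(build_url(backlinks_params("encyklopedie.soc.cas.cz")))
print(build_url(phrase_search_params("Sociologická encyklopedie")))
```

Calling backlinks_params with protocol="http" would give the plain-HTTP variant of the backlink query, which may be what distinguishes the ext_usage_http files from the ext_usage files.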
The JSON files contain the raw responses of the MediaWiki API. The CSV files contain the same information presented in tabular form.
The backlinks CSV files have the following columns:
(blank): record number
page_id: the ID of the page where the link was found
title: the title of the page
url: the URL of the page
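The relationship between a raw JSON response and the backlinks CSV layout can be sketched as follows. This is a minimal Python illustration, assuming the JSON follows the usual shape of a MediaWiki list=exturlusage response (a query.exturlusage array whose items carry pageid, title and url) and that record numbers start at 1. The sample response is made up for illustration and is not taken from the dataset; the function names are hypothetical.

```python
import csv
import io
import json


def exturlusage_to_rows(response):
    """Turn one exturlusage API response into (page_id, title, url) tuples."""
    hits = response.get("query", {}).get("exturlusage", [])
    return [(hit["pageid"], hit["title"], hit["url"]) for hit in hits]


def rows_to_csv(rows):
    """Serialize rows with a blank-named leading record-number column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["", "page_id", "title", "url"])
    # Assumption: record numbers start at 1.
    for number, row in enumerate(rows, start=1):
        writer.writerow([number, *row])
    return buffer.getvalue()


# Made-up sample response; not real data from the archive.
sample = json.loads("""
{"query": {"exturlusage": [
  {"pageid": 101, "ns": 0, "title": "Example A",
   "url": "https://encyklopedie.soc.cas.cz/w/A"},
  {"pageid": 202, "ns": 0, "title": "Example B",
   "url": "https://encyklopedie.soc.cas.cz/w/B"}
]}}
""")

print(rows_to_csv(exturlusage_to_rows(sample)))
```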
The text string appearance CSV files have the following columns:
(blank): record number
article_titles: the title of the Wikipedia article where the search term appeared
article_id: the ID of the Wikipedia article where the search term appeared
article_text_mention: a snippet of the article text where the search term appeared
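A text string occurrence CSV with these columns could be read along the following lines. This is a small Python sketch that collects the distinct articles mentioning a search term; the sample content is invented for illustration, and the function name is hypothetical. It relies only on the column names listed above, not on their order.

```python
import csv
import io


def distinct_articles(csv_text):
    """Map each article_id to its title, collapsing repeated mentions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["article_id"]: row["article_titles"] for row in reader}


# Made-up sample content; not real data from the archive.
sample_csv = (
    ",article_titles,article_id,article_text_mention\n"
    '1,Example A,101,"... Sociologická encyklopedie ..."\n'
    '2,Example B,202,"... Sociologická encyklopedie, Praha ..."\n'
    '3,Example A,101,"... another mention ..."\n'
)

articles = distinct_articles(sample_csv)
print(len(articles), "distinct articles")
```

Applying the same function to the file from each daily folder would give a per-day count of articles, which is one way to track how the occurrences change over the measurement period.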
The files were collected by the script sgenc_wiki_ref.R, published in the GitHub repository at https://github.com/rozek1app2/sgenc-wiki-ref. The script ran on a virtual server operated in the Metacentrum National grid infrastructure, which is managed by the CESNET e-Infrastructure.
Created:
2020-08-31



