
Replication Data for: "Images of the arXiv: reconfiguring large scientific image datasets"

Source: DataONE. Updated 2021-03-03; indexed 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:03d951692a468ae07c51a55a445cb3cec54443586288f89e7e03f7ce6f690f88
Description:
Code and data for working with the article and image metadata of papers stored in the arXiv repository. At the time the related publication was written, the study covered ~1.5 million articles, ~10 million images, and 2.1 TB of data downloaded from arXiv. This upload contains instructions and code to download the bulk source data, extract it into a folder hierarchy, build an SQLite database, and then use that data to run queries, sample images, and generate plots. The full SQLite database is included and contains article metadata, image metadata, and figure captions. Also included are data statistics and image credits for the images that appear in the related publication.
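The query step described above can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The table names (`metadata`, `images`), the column names, and the sample rows are hypothetical and chosen only for illustration; the schema of the published database may differ, so inspect it before adapting the query.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with a HYPOTHETICAL schema.

    The real database's table and column names are not stated in this
    listing; 'metadata' and 'images' are assumptions for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE metadata (
            identifier TEXT PRIMARY KEY,  -- assumed arXiv-style identifier
            title      TEXT,
            category   TEXT
        );
        CREATE TABLE images (
            id         INTEGER PRIMARY KEY,
            identifier TEXT,              -- article the image belongs to
            filename   TEXT,
            caption    TEXT
        );
        """
    )
    # Made-up sample rows, not taken from the dataset.
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [
            ("0001.0001", "Paper A", "astro-ph"),
            ("0001.0002", "Paper B", "cs.CV"),
            ("0001.0003", "Paper C", "cs.CV"),
        ],
    )
    conn.executemany(
        "INSERT INTO images (identifier, filename, caption) VALUES (?, ?, ?)",
        [
            ("0001.0001", "fig1.png", "Sky map"),
            ("0001.0002", "fig1.png", "Architecture"),
            ("0001.0002", "fig2.png", "Results"),
            ("0001.0003", "fig1.jpg", "Samples"),
        ],
    )
    conn.commit()
    return conn


def images_per_category(conn):
    """Count images per article category, largest count first."""
    return conn.execute(
        """
        SELECT m.category, COUNT(i.id) AS n_images
        FROM metadata AS m
        LEFT JOIN images AS i ON i.identifier = m.identifier
        GROUP BY m.category
        ORDER BY n_images DESC, m.category
        """
    ).fetchall()


if __name__ == "__main__":
    for category, n_images in images_per_category(build_demo_db()):
        print(category, n_images)
```

To run a query like this against the provided database, replace `":memory:"` with the path to the downloaded SQLite file and adjust the table and column names to the actual schema.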
Created:
2023-11-19