PUMA pipeline output
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/4545741
Description:
Output of the PUMA (PUblications Metadata Augmentation) software pipeline which takes a list of journal articles and augments it with metadata from external sources. This augmented metadata is then processed to generate data files and an explorable/searchable set of HTML pages.
The PUMA pipeline is available at: https://github.com/OllyButters/puma and is described at: https://doi.org/10.12688/f1000research.25484.1
These attached files are the result of running the pipeline on the list of publications described at: https://doi.org/10.12688/wellcomeopenres.14986.1 on 2021-01-15. Rerunning the pipeline on this list may result in slightly different outputs due to the changing content of the external metadata sources.
Screenshots of the output HTML pages:
PUMA_home_2021-01-15.png - Summary of all publications.
PUMA_2011_2021-01-15.png - All publications from 2011.
PUMA_map_2021-01-15.png - Choropleth map of first author's country.
PUMA_asthma_2021-01-15.png - All publications with an asthma MeSH term.
PUMA_metrics_2021-01-15.png - Simple metrics.
PUMA_word_cloud_2021-01-15.png - Word cloud of abstract text.
PUMA_coverage_2021-01-15.png - Table showing completeness of metadata.
Generated data files:
authors.csv - Frequency of authors.
first_authors.csv - Frequency of first authors.
first_authors_inst.csv - Frequency of first authors' institutes.
journals.csv - Frequency of journals published in.
abstract_lemmatized.csv - Frequency of lemmatized abstract words.
abstract_lemmatized_by_year.csv - Frequency of lemmatized abstract words broken down by year.
title_lemmatized.csv - Frequency of lemmatized title words.
title_lemmatized_by_year.csv - Frequency of lemmatized title words broken down by year.
keywords_lemmatized.csv - Frequency of lemmatized keywords.
keywords_lemmatized_by_year.csv - Frequency of lemmatized keywords broken down by year.
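
The frequency files listed above could be explored with a few lines of Python. The sketch below is illustrative only: it assumes each row has the form `label,count` with an optional header row, which this record does not confirm, so check the actual files before relying on it. The helper names `read_frequency_rows` and `top_n` and the sample data are hypothetical.

```python
import csv
import io
from typing import Iterable, List, Tuple


def read_frequency_rows(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse rows assumed to look like `label,count`; skip anything else."""
    rows = []
    for record in csv.reader(lines):
        if len(record) < 2:
            continue  # blank or malformed line
        label, raw_count = record[0].strip(), record[-1].strip()
        try:
            count = int(raw_count)
        except ValueError:
            continue  # non-numeric count, most likely a header row
        rows.append((label, count))
    return rows


def top_n(rows: List[Tuple[str, int]], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent entries, ties broken alphabetically."""
    return sorted(rows, key=lambda row: (-row[1], row[0]))[:n]


# Made-up sample standing in for a file such as journals.csv.
sample = io.StringIO("journal,count\nJournal A,12\nJournal B,30\nJournal C,7\n")
top = top_n(read_frequency_rows(sample), n=2)
print(top)
```

For a downloaded file, the same helper can be given an open file handle, for example `open("journals.csv", newline="", encoding="utf-8")`, in place of the in-memory sample.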
Created:
2024-07-19



