
Curation and ISA representation of a SARS-Cov2/Covid-19 Proteomics Dataset - PXD017710 - ISA representation

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3742218
Description:
Curation and ISA representation of a SARS-Cov2/Covid-19 proteomics dataset deposited in the PRIDE database under accession number PXD017710.

ISA-Tab annotation for the publication "SARS-CoV-2 infected host cell proteomics reveal potential therapy targets".

Github repository: https://github.com/ISA-tools/PXD017710

This is part of an effort to (re-)annotate: https://dx.doi.org/10.21203/rs.3.rs-17218/v1

Additional work done as part of:

* https://github.com/virtual-biohackathons/covid-19-bh20
* https://github.com/virtual-biohackathons/covid-19-bh20/wiki/FairData

Proteomics data:

Available from PRIDE at https://www.ebi.ac.uk/pride/archive/projects/PXD017710 and [MassIVE/CCMS Maestro+MSstats reanalysis of MSV000085096 / PXD017710]

ISA-Tab representation:

Rationale: demonstrate the suitability of the ISA format for representing an MS-based protein profiling experiment with more granularity and detail, thus providing a better representation of the experimental design.

The formatting and re-annotation are based on information extracted from:

- the original publication
- the supplementary tables available from the publisher's site
- the 'filtered-results.csv' helper file as supplied to @sneumann during the HUPO-PSI meeting in March 2020

Viewing the ISA-Tab formatted and re-annotated PXD017710 with ISATab-Viewer:

To view the ISA-Tab formatted and re-annotated PXD017710 locally, do the following:

```bash
python -m http.server 8000
```

Then point your browser to `http://0.0.0.0:8000/isaviewer-demo.html`

Curation tasks performed:

* initial structure of the study design in ISA format
* linkage of Proteome and Translatome data (supplementary material) to ISA assay tables (via Derived Data File)
* processing the Proteome and Translatome data (supplementary material) with the Python pandas library to generate the following files:
    - proteome_intensities_long_table_ggplot2.txt
    - proteome_diffanal_ratio_pvalue_long_table_ggplot2.txt
    - translatome_intensities_long_table_ggplot2.txt
    - translatome_diffanal_ratio_pvalue_long_table_ggplot2

    The files are `long table` files corresponding to a `melt` on the Excel file originally generated by the users and can be readily loaded into the R ggplot2 library for graphical representation.
    The statistically relevant elements have been annotated with the STATO ontology and the tables comply with a Frictionless.io Data Package.
    The Jupyter notebook for the transformation is available.
* conversion of raw data to mzML format, detailed in https://github.com/ISA-tools/PXD017710:

    install docker:

    ```bash
    brew update
    brew install docker
    ```

    sign in to docker:

    ```bash
    docker start
    docker login
    ```

    pull the docker container for ProteoWizard:

    ```bash
    docker pull chambm/pwiz-skyline-i-agree-to-the-vendor-licenses
    ```

    :warning: be sure to sign up and log in to https://hub.docker.com/ in order to be able to reach https://hub.docker.com/r/chambm/pwiz-skyline-i-agree-to-the-vendor-licenses

    run the pwiz tool from the container over the raw data:

    ```bash
    docker run -it --rm -e WINEDEBUG=-all -v /Users/Downloads/PXD017710/raw/:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/*.raw --mzML
    ```
* ontology markup for:
    * declaration of independent variables as ISA Study Factors: {biological agent, dose, time point, replicate} -> OBI
    * Taxonomic information (host cells and virus) -> NCBITaxonomy
    * Cell line: CaCo-2 cells -> Cell Line Ontology
    * Disease: Colon Cancer -> Human Phenotype Ontology
    * MS-specific aspects (TMT reagent, instrument ...) -> PSI-MS
    * Statistical tests -> STATO

Unresolved curatorial issues:

1. ambiguities related to the Tandem Mass Tag labelling protocol:
    - the publication mentions TMT11 (see Figure 2 in https://www.researchsquare.com/article/rs-17218/v1)
    - the information available from PRIDE mentions TMT6 (https://www.ebi.ac.uk/pride/archive/projects/PXD017710)

    This may require another round of annotation on the TMT agents and fractions in the ISA a_assay representation.
2. SARS-Cov2 isolate: no clear NCBI Taxonomy anchoring and unclear origin -> the markup is made to the parent class (as of 06.04.2020)

Release and packaging as a BDBag:

The tgz file associated with this upload has been produced using https://github.com/fair-research/bdbag. It contains several manifest files detailing metadata and data files, providing md5 and sha256 checksums.
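The md5 and sha256 manifests shipped in the bag make it possible to check the extracted files without any bag-specific tooling. The following is a minimal sketch, assuming the tgz has been extracted and that the manifest follows the usual BagIt layout (a `manifest-sha256.txt` file with one `<hex digest> <relative path>` pair per line). The function names `file_digest` and `verify_manifest` are hypothetical and not part of bdbag.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(bag_dir, algorithm="sha256"):
    """Compare every manifest entry with the digest of the file on disk.

    Assumes a manifest named manifest-<algorithm>.txt in bag_dir.
    Returns the relative paths that are missing or whose digest differs;
    an empty list means every listed file matches.
    """
    bag_dir = Path(bag_dir)
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    mismatches = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.strip()
        target = bag_dir / rel_path
        if not target.is_file() or file_digest(target, algorithm) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

Calling `verify_manifest("path/to/extracted_bag")` (a placeholder path) returns an empty list when all files match; passing `algorithm="md5"` checks the md5 manifest instead. The sketch ignores percent-encoded path characters, which a full BagIt validator would handle.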
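The wide-to-long `melt` step mentioned under the curation tasks can be illustrated with pandas. This is only a sketch of the general technique: the column names (`protein`, `ctrl_2h`, `virus_2h`) and the values are made up for illustration and do not come from the supplementary tables; the transformation actually applied is the one in the repository's Jupyter notebook.

```python
import pandas as pd

# Hypothetical wide table: one row per protein, one column per sample.
wide = pd.DataFrame(
    {
        "protein": ["P1", "P2"],
        "ctrl_2h": [10.0, 20.0],
        "virus_2h": [11.0, 25.0],
    }
)

# melt keeps the identifier column and stacks the sample columns into
# (sample, intensity) pairs, giving one row per protein/sample combination.
long_table = wide.melt(id_vars="protein", var_name="sample", value_name="intensity")

# Serialise as tab-separated text, a layout that R can read into a data frame
# and pass to ggplot2 with one aesthetic per column.
tsv_text = long_table.to_csv(sep="\t", index=False)
```

Each row of `long_table` now holds a single observation, which is the shape ggplot2 expects when mapping `sample` and `intensity` to plot aesthetics.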
Created:
2020-04-07