
Additional file 1 of An analysis of proteogenomics and how and when transcriptome-informed reduction of protein databases can enhance eukaryotic proteomics

DataCite Commons. Updated 2022-06-21. Indexed 2024-07-29.
Download link:
https://springernature.figshare.com/articles/dataset/Additional_file_1_of_An_analysis_of_proteogenomics_and_how_and_when_transcriptome-informed_reduction_of_protein_databases_can_enhance_eukaryotic_proteomics/20106526
Resource description:
Additional file 1:

Table S1. Detailed sample description. For each sample, the table reports the depth of the proteome and transcriptome datasets, the technology used to generate them, data availability in public repositories, and the reference (PMID: PubMed ID).

Table S2. Comparison of PSM scores obtained for the same spectrum in the full database search and in the reduced database search, using the Mascot search engine. The table shows the number of reallocated spectra whose score in the reduced database search is equal to, lower than, or higher than that in the full database search. The score from the reduced database search is never observed to be higher than the score from the full database search, in particular not for reallocations to targets in the reduced database.

Table S3. Comparison of PSM scores obtained for the same spectrum in the full database search and in the reduced database search, using the MS-GF+ search engine. The table shows the number of reallocated spectra whose score in the reduced database search is equal to, lower than, or higher than that in the full database search. The score of reallocations to targets in the reduced database search is never higher than in the full database search.

Table S4. Score cutoffs obtained by target-decoy competition (TDC) for FDR control at 1% for the full (reference Ensembl database) and reduced (transcriptome-informed reduced database) database searches. Database searches were performed using the Mascot search engine.

Table S5. Score cutoffs obtained by target-decoy competition (TDC) for FDR control at 1% for the full (reference Ensembl database) and reduced (transcriptome-informed reduced database) database searches. Database searches were performed using the MS-GF+ search engine.

Table S6. Reallocations that can generate an additional identification in the reduced database search.

Table S7. Additional peptide identifications and corresponding protein identifications.

Table S8. Number of spectra or number of spectra identifying additional peptides exclusively identified in the reduced database search due to: (i) a lower score cutoff at 1% FDR control in the reduced database search compared to the full database search; (ii) pure reallocation. The former are additional identifications from PSMs that pass only the cutoff from the reduced database search and would not be accepted under the full database cutoff. They include cases of identical PSMs in both searches (“no reallocation”) and cases of reallocation from a decoy (“decoy_target”), a target (“target_target*”), or no match (“no match_target”) in the full database search to target matches in the reduced database. Additional identifications from pure reallocation, by contrast, are those originating exclusively from reallocation, which would also pass the full database cutoff (i.e., independent of the lower score cutoff effect).

Table S9. Number of valid targets and decoys from the full or reduced database obtained at 1% FDR using the cutoffs estimated by TDC on the respective database search results (first and last rows). The second row instead simulates the number of valid targets and decoys that would be obtained from the reduced database if the estimated cutoff were the same as for the full database. The associated nominal FDR level is reported, calculated as (d+1)/t, where d and t are the numbers of valid decoys and valid targets.

Table S10. Match in the reduced database search for spectra matching valid targets or valid decoys in the full database.

Table S11. Score cutoffs obtained by TDC or by the BH procedure for FDR control for the full or reduced database searches at various FDR levels (0.5%, 1% and 5%).

Table S12. Protein-to-gene ratio in multi-protein CCs.

Table S13. Description of the pipeline for transcriptome generation and analysis.

Table S14. Description of the pipeline for proteome generation and analysis.
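The nominal FDR formula quoted for Table S9, (d+1)/t, and the idea of a TDC score cutoff can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the representation of PSMs as (score, is_decoy) pairs, the rule that a PSM is valid when its score is at least the cutoff, and the assumption that higher scores are better are all illustrative choices made here.

```python
from typing import List, Optional, Tuple

# Hypothetical PSM representation: (score, is_decoy). Higher score = better match (assumption).
PSM = Tuple[float, bool]


def nominal_fdr(n_decoys: int, n_targets: int) -> float:
    """Nominal FDR computed as (d + 1) / t, as in the Table S9 description."""
    if n_targets == 0:
        return float("inf")
    return (n_decoys + 1) / n_targets


def count_valid(psms: List[PSM], cutoff: float) -> Tuple[int, int]:
    """Return (valid targets, valid decoys) with score >= cutoff (assumed validity rule)."""
    targets = sum(1 for score, is_decoy in psms if score >= cutoff and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= cutoff and is_decoy)
    return targets, decoys


def tdc_cutoff(psms: List[PSM], fdr_level: float) -> Optional[float]:
    """Lowest observed score whose nominal FDR is <= fdr_level, or None if there is none."""
    best = None
    for cutoff in sorted({score for score, _ in psms}, reverse=True):
        targets, decoys = count_valid(psms, cutoff)
        if nominal_fdr(decoys, targets) <= fdr_level:
            best = cutoff  # keep lowering the cutoff while the FDR bound still holds
    return best


# Toy data only; a 50% level is used because six PSMs cannot reach 1%.
toy_psms = [(90.0, False), (80.0, False), (70.0, False),
            (60.0, True), (50.0, False), (40.0, True)]
cutoff = tdc_cutoff(toy_psms, fdr_level=0.5)
print(cutoff, count_valid(toy_psms, cutoff))
```

The second row of Table S9 corresponds, in this sketch, to calling count_valid on the reduced database PSMs with the cutoff estimated from the full database search, then passing the resulting counts to nominal_fdr.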
Provider:
figshare
Created:
2022-06-21