
Dataset_1

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://data.mendeley.com/datasets/df8w8dct3b
Description:
Dataset_1 provides seven FASTA files, each a protein database. The composite database, "All_Databases_5950827_sequences.fasta", contains protein sequences retrieved from public databases related to cephalopod salivary glands, together with proteins identified from our original data. It comprises 5,950,827 protein sequences in total and is composed of six smaller databases, labeled A to F, each drawing on a different source:

Database_A_19087_sequences.fasta: protein database from proteogenomic analyses of the O. vulgaris salivary apparatus, built by Fingerhut et al. (2018).
Database_B_16990_sequences.fasta: antimicrobial peptides from a non-redundant database collected by Aguilera-Mendoza et al. (2015).
Database_C_2427_sequences.fasta: proteins identified with Proteome Discoverer by searching our 12 LTQ raw files against the UniProt database restricted to the Metazoa taxonomic selection (2018_07 release).
Database_D_84778_sequences.fasta: proteins identified by TransDecoder from de novo transcriptome assemblies of the posterior salivary glands of 16 cephalopods.
Database_E_5106635_sequences.fasta: proteins identified by a six-frame translation tool from the same de novo transcriptome assemblies of the posterior salivary glands of 16 cephalopods.
Database_F_720910_sequences.fasta: proteins obtained by six-frame translation of the transcripts profiled in the O. vulgaris transcriptome but not included by the authors in Database_A_19087_sequences.fasta.
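Each file name embeds its sequence count, so a downloaded copy can be sanity-checked against the description. The following is a minimal sketch, not part of the dataset: it assumes the files sit uncompressed in one local directory under the names listed above, and the helper names (count_fasta_records, expected_count, check_directory) are hypothetical. It counts FASTA header lines (those starting with ">") and compares them with the number in each file name.

```python
import re
from pathlib import Path

# File names as given in the dataset description.
COMPONENTS = [
    "Database_A_19087_sequences.fasta",
    "Database_B_16990_sequences.fasta",
    "Database_C_2427_sequences.fasta",
    "Database_D_84778_sequences.fasta",
    "Database_E_5106635_sequences.fasta",
    "Database_F_720910_sequences.fasta",
]
COMPOSITE = "All_Databases_5950827_sequences.fasta"


def count_fasta_records(path):
    """Count FASTA records by streaming the file and counting '>' header lines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def expected_count(filename):
    """Return the count embedded in a name like 'Database_A_19087_sequences.fasta'."""
    match = re.search(r"_(\d+)_sequences\.fasta$", filename)
    return int(match.group(1)) if match else None


def check_directory(directory):
    """Map each file name to (expected, actual); actual is None if the file is missing.

    Assumption: all files are uncompressed and located directly in `directory`.
    """
    report = {}
    for name in COMPONENTS + [COMPOSITE]:
        path = Path(directory) / name
        actual = count_fasta_records(path) if path.is_file() else None
        report[name] = (expected_count(name), actual)
    return report
```

Calling check_directory on the download folder returns one (expected, actual) pair per file; any mismatch suggests a truncated or altered download. The six component counts embedded in the file names sum to 5,950,827, matching the composite file.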
Created:
2021-03-01