
Data for: Ensemble Learning: Predicting Human Pathogenicity of Hematophagous Arthropod Vector-Borne Viruses

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/hvk9f2by6k
Description:
Overview: This dataset supports a study that uses ensemble learning to predict the human pathogenicity of viruses carried by blood-feeding arthropods (mosquitoes, ticks, etc.). It integrates large-scale epidemiological data with genomic functional annotations to assess zoonotic spillover risks.

Dataset Components:

Epidemiological Characteristics Dataset: Covers 294 viruses and 37 distinct features, grouped into three categories.
Virus Properties: Baltimore classification and taxonomy.
Vector Host Features: Family and genus of vectors (e.g., Culicidae, Ixodidae).
Non-vector Host Diversity: Distribution across 15 groups, emphasizing the impact of the orders Perissodactyla and Carnivora on pathogenicity.

Viral Sequence Pathogenic Function Dataset: Includes functional annotations for 71,623 viral sequences. Using SeqScreen, 10 key Functional Signatures of Concern (FunSoCs) were identified, such as the following.
Viral Adhesion: Found in 62% of sequences; crucial for host cell entry.
Host Xenophagy and Viral Counter Signaling: Key features for immune evasion.
Viral Invasion: Associated with non-pathogenic traits in this specific context.

Technical Application: The data were used to develop and validate XGBoost-based models.
Regression Model: Achieved an R² of 90.6%, correlating host diversity with pathogenicity.
Classification Model: Achieved an F1 score of 96.79% for identifying pathogenic potential at the sequence level.
External Validation: Includes predictions for 228 sequences, highlighting potential risks from Palma and Zaliv Terpeniya viruses.

Research Value: This resource enables strain-level prediction of pathogenicity within metagenomic data, providing a framework for early warning systems for emerging zoonotic threats.
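The description reports an F1 score for the classification model and an R² for the regression model, both built on features such as FunSoC presence. The following is a minimal sketch, in plain Python, of how per-sequence FunSoC annotations could be encoded as binary features and how those two metrics are computed. The feature names in `FUNSOCS`, the helper functions, and the toy labels are all illustrative assumptions; they are not the dataset's actual column names, values, or the authors' code, and the XGBoost training step itself is not shown.

```python
from typing import List, Set

# Hypothetical feature names derived from the FunSoCs named in the
# description; the dataset's real column labels may differ.
FUNSOCS: List[str] = [
    "viral_adhesion",
    "host_xenophagy",
    "viral_counter_signaling",
    "viral_invasion",
]


def encode_funsocs(annotations: Set[str]) -> List[int]:
    """Return a 0/1 vector marking which FunSoCs a sequence carries."""
    return [1 if name in annotations else 0 for name in FUNSOCS]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def r_squared(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (not taken from the dataset).
features = encode_funsocs({"viral_adhesion", "viral_invasion"})
toy_f1 = f1_score([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
toy_r2 = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 2.0])
```

In practice, vectors like `features` would form the rows of the matrix passed to a classifier, and the metrics would be computed on held-out predictions rather than on toy lists.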
Created:
2026-01-26