Linked Avian Influenza Epidemiological and Genomic Data for Epidemic Intelligence (2012-2021)
Source: Recherche Data Gouv France. Updated: 2024-01-01. Indexed: 2026-04-09.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/JNA7N9
Resource description:
We release a new dataset of Avian Influenza epidemiological events (2012 to 2021), in which the epidemiological events in EMPRES-i [1] are enriched with genome sequence data of Avian Influenza cases, publicly provided by the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) [2]. Events in EMPRES-i and BV-BRC are linked by an automatic procedure described in [3], so the associations in this dataset are "putative" rather than confirmed.

This dataset contributes to the available resources in the field of Avian Influenza surveillance and epidemic intelligence. It can be useful to epidemiologists and computer scientists studying Avian Influenza (AI) transmission dynamics. The dataset is produced by our publicly available source code on GitHub: https://github.com/arinik9/AIAGIS

The dataset is composed of the following files:

- raw_intput_files.zip: Raw input files from EMPRES-i and BV-BRC. The collected data is structured by nature, but it needs to be preprocessed and normalized for high-quality data linkage.
  - BVBRC_genome_events.csv: Raw genome events in BV-BRC.
  - BVBRC_genome_sequences.csv: Raw genome sequences in BV-BRC.
  - EMPRES-i_events.csv: Raw EMPRES-i events.
- result_files.zip:
  - doc_events_empres-i_strategy=1-to-1.csv: EMPRES-i dataset enriched with genetic information using the 1-to-1 linking strategy (Section 5.2.1 in [3]).
  - doc_events_empres-i_strategy=1-to-many.csv: EMPRES-i dataset enriched with genetic information using the 1-to-many linking strategy (Section 5.2.2 in [3]).
- data_files_for_missing_genetic_information.zip: Average isolate similarity scores for handling missing genetic information in EMPRES-i. These scores are meant to be used when computing the isolate similarity between a pair of EMPRES-i events of which at least one has missing genetic information (Section 6 in [3]).
  - genome_sim_summary_by_country_and_genome_name.csv: Used when only one of the two events has isolate information (1st strategy in [3]).
  - genome_sim_summary_by_genome_name.csv: Used when only one of the two events has isolate information (2nd strategy in [3]).
  - genome_sim_summary_by_country.csv: Used when neither of the two events has isolate information (3rd strategy in [3]).
  - genome_sim_summary.csv: Used when neither of the two events has isolate information (4th strategy in [3]).

References:

[1] FAO. 2021. EMPRES Global Animal Disease Information System (EMPRES-i). Accessed on 23 October 2024, http://empres-i.fao.org/empres-i, licence CC-BY-4.0.

[2] Olson RD, Assaf R, Brettin T, Conrad N, Cucinell C, Davis JJ, Dempsey DM, Dickerman A, Dietrich EM, Kenyon RW, Kuscuoglu M, Lefkowitz EJ, Lu J, Machi D, Macken C, Mao C, Niewiadomska A, Nguyen M, Olsen GJ, Overbeek JC, Parrello B, Parrello V, Porter JS, Pusch GD, Shukla M, Singh I, Stewart L, Tan G, Thomas C, VanOeffelen M, Vonstein V, Wallace ZS, Warren AS, Wattam AR, Xia F, Yoo H, Zhang Y, Zmasek CM, Scheuermann RH, Stevens RL. Introducing the Bacterial and Viral Bioinformatics Resource Center (BV-BRC): a resource combining PATRIC, IRD and ViPR. Nucleic Acids Res. 2022 Nov 9:gkac1003. doi: 10.1093/nar/gkac1003

[3] Nejat Arınık, Roberto Interdonato, Mathieu Roche, Maguelonne Teisseire (2024). An Improved Avian Influenza Surveillance Dataset with Genome Sequences (submitted).
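As an illustration of how the fallback files for missing genetic information relate to a pair of EMPRES-i events, the following minimal Python sketch returns the summary files that the description associates with each case. The function name `candidate_summary_files` and its boolean arguments are hypothetical and are not part of the AIAGIS code. The description does not say how to choose between the two files within each case, so the sketch returns both and leaves that choice to the strategies in [3].

```python
from typing import List

# File names as listed in the dataset description. How to pick one file
# within each pair is defined by the strategies in [3] and is NOT modelled here.
ONE_EVENT_HAS_ISOLATE = [
    "genome_sim_summary_by_country_and_genome_name.csv",  # 1st strategy
    "genome_sim_summary_by_genome_name.csv",              # 2nd strategy
]
NO_EVENT_HAS_ISOLATE = [
    "genome_sim_summary_by_country.csv",  # 3rd strategy
    "genome_sim_summary.csv",             # 4th strategy
]


def candidate_summary_files(has_isolate_a: bool, has_isolate_b: bool) -> List[str]:
    """Return the fallback files the description lists for a pair of events.

    Hypothetical helper: each argument says whether the corresponding
    EMPRES-i event carries isolate (genetic) information.
    """
    if has_isolate_a and has_isolate_b:
        # Both isolates are known, so no fallback file is needed.
        return []
    if has_isolate_a or has_isolate_b:
        # Exactly one event has isolate information.
        return list(ONE_EVENT_HAS_ISOLATE)
    # Neither event has isolate information.
    return list(NO_EVENT_HAS_ISOLATE)
```

For example, `candidate_summary_files(True, False)` yields the two files tied to the 1st and 2nd strategies, while `candidate_summary_files(False, False)` yields those tied to the 3rd and 4th.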
Created:
2024-01-01



