Scripts used for data analyzing in the manuscript: “Unraveling genetic load dynamics during biological invasion: insights from two invasive insect species”
Source: Recherche Data Gouv France. Updated: 2024-01-01. Indexed: 2026-04-09.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/ESQFDB
Description:
These scripts were used for data analysis in the manuscript “Unraveling genetic load dynamics during biological invasion: insights from two invasive insect species”.

Authors: Eric Lombaert, Aurélie Blin, Barbara Porro, Thomas Guillemaud, Julio S. Bernal, Gary Chang, Natalia Kirichenko, Thomas W. Sappington, Stefan Toepfer and Emeline Deleury.

The scripts were run on pool-seq data. Raw reads have been deposited in the Sequence Read Archive, National Center for Biotechnology Information, under BioProject PRJNA1079689: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1079689

The scripts perform (i) trimming and mapping of raw reads, (ii) SNP calling across multiple populations, (iii) annotation of SNPs, (iv) polarisation of alleles, and (v) comparison of genetic load between populations.

The scripts were successfully executed on a Debian 9.13 system with Linux kernel version 4.9.0-7-amd64. The processor was an Intel(R) Xeon(R) CPU E7540 at 2.00GHz, with 24 physical cores and 48 threads across 4 sockets. The machine had 128 GB of RAM and a 2 TB hard drive for storage. Running all the scripts on this configuration takes one to several weeks.

The main software packages used by the scripts are: FastQC v0.11.5 (Andrews, 2010), Trimmomatic v0.35 (Bolger et al., 2014), bwa-mem v0.7.15 (Li, 2013), SAMtools v1.15.1 (Li et al., 2009), freebayes v1.3.6 (Garrison & Marth, 2012), bcftools v1.13 (Danecek et al., 2021), SnpEff v5.0 (Cingolani et al., 2012), est-sfs v2.03 (Keightley & Jackson, 2018) and R v4.2.2 (R Core Team, 2021).

All the provided scripts have been executed for Diabrotica virgifera virgifera. The number at the beginning of each script indicates the order in which the scripts are meant to be run.
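To make the order of steps (i) to (iii) concrete, here is a minimal Python sketch that only assembles the command lines for trimming, mapping, SNP calling and annotation, without running anything. It is not taken from the deposited scripts. The file names, the Trimmomatic settings (SLIDINGWINDOW:4:20, MINLEN:36), the freebayes pooled mode and the SnpEff database name my_genome_db are all assumptions chosen for illustration; the authors' actual parameters are in their numbered scripts. Polarisation with est-sfs and the genetic load comparison in R are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    name: str                     # label used for logging
    argv: List[str]               # command and its arguments
    stdout: Optional[str] = None  # file capturing standard output, if any


def build_steps(pools: List[str], reference: str, threads: int = 4,
                snpeff_db: str = "my_genome_db") -> List[Step]:
    """Return the ordered commands for a set of pool-seq samples.

    Assumes paired-end reads named <pool>_R1.fastq.gz / <pool>_R2.fastq.gz
    (hypothetical naming) and a reference already indexed with `bwa index`.
    """
    steps: List[Step] = []
    bams: List[str] = []
    for pool in pools:
        r1, r2 = f"{pool}_R1.fastq.gz", f"{pool}_R2.fastq.gz"
        p1, p2 = f"{pool}_R1.paired.fq.gz", f"{pool}_R2.paired.fq.gz"
        u1, u2 = f"{pool}_R1.unpaired.fq.gz", f"{pool}_R2.unpaired.fq.gz"
        # (i) quality trimming; the trimming settings are assumed values
        steps.append(Step(f"trim:{pool}", [
            "java", "-jar", "trimmomatic-0.35.jar", "PE",
            "-threads", str(threads), r1, r2, p1, u1, p2, u2,
            "SLIDINGWINDOW:4:20", "MINLEN:36"]))
        # (i) mapping; bwa mem writes SAM to standard output
        steps.append(Step(f"map:{pool}", [
            "bwa", "mem", "-t", str(threads), reference, p1, p2],
            stdout=f"{pool}.sam"))
        bam = f"{pool}.sorted.bam"
        steps.append(Step(f"sort:{pool}", [
            "samtools", "sort", "-@", str(threads), "-o", bam, f"{pool}.sam"]))
        steps.append(Step(f"index:{pool}", ["samtools", "index", bam]))
        bams.append(bam)
    # (ii) joint SNP calling over all pools; the pooled mode is an assumption
    steps.append(Step("call", [
        "freebayes", "-f", reference, "--pooled-continuous", *bams],
        stdout="pools.vcf"))
    # (iii) annotation; snpeff_db is a placeholder database name
    steps.append(Step("annotate", [
        "java", "-jar", "snpEff.jar", snpeff_db, "pools.vcf"],
        stdout="pools.ann.vcf"))
    return steps


if __name__ == "__main__":
    # Dry run: print each command instead of executing it
    for step in build_steps(["popA", "popB"], "reference.fa"):
        redirect = f" > {step.stdout}" if step.stdout else ""
        print(f"[{step.name}] {' '.join(step.argv)}{redirect}")
```

Building the commands as data before executing them keeps the step order explicit and easy to inspect, which matters for a pipeline reported to run for one to several weeks.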
Created:
2024-01-01



