Scripts for pairwise nucleotide identity graphs

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/Scripts_for_pairwise_nucleotide_identity_graphs/28013771
Description:
Design of the nemabiome assay and selection of target genes. Publicly available rDNA regions were assessed for their suitability to detect nematode clades containing key parasitic genera known to infect canines, humans and other animals. A curated database of relevant parasitic gastrointestinal nematode (GIN) rDNA regions was downloaded from NCBI's GenBank (29.01.24) and aligned using MAFFT (Katoh & Standley, 2013) to assess the interspecific nucleotide diversity of different rDNA regions for the parasitic nematode clades I, III, IV and V (Smythe et al., 2019). To compare nucleotide identity between rDNA loci of GIN clades, sequence alignments were first separated into 18S and ITS1-to-ITS2 regions for each clade and then dereplicated so that each GIN species was represented by only one sequence. For each locus, gaps in alignments were removed using trimAl (Capella-Gutiérrez et al., 2009) with the parameters -resoverlap 0.5 and -seqoverlap 50. Next, a pairwise nucleotide distance matrix was built using a custom Python script, available in this folder under the name 'distance'. Finally, for each distance matrix, pairwise and median nucleotide identities were displayed as violin and jitter plots in RStudio (R Core Team, 2021) with the packages ggplot2 (Wickham, 2011) and dplyr (Yarberry, 2021). See 'Nemabiome_rDNA_combined_script' within this folder for how this was achieved.
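The 'distance' script itself is not reproduced on this page, so the following is only a minimal sketch of how pairwise and median nucleotide identities could be derived from an aligned, trimmed set of sequences. It assumes identity is the percentage of matching bases over columns where neither sequence has a gap, which may differ from the definition used in the dataset. The function names (pairwise_identity, identity_table) and the toy alignment are hypothetical and exist only for illustration.

```python
from itertools import combinations
from statistics import median

# Characters treated as alignment gaps (an assumption of this sketch).
GAP_CHARS = set("-.")


def pairwise_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences.

    Columns where either sequence has a gap are skipped; this is one
    common convention, not necessarily the one used by the dataset.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a == b:
            matches += 1
    # No comparable columns: identity is undefined.
    return 100.0 * matches / compared if compared else float("nan")


def identity_table(alignment):
    """Map each unordered pair of sequence names to its percent identity."""
    return {
        (name_a, name_b): pairwise_identity(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


# Toy alignment with one sequence per species (made-up data).
alignment = {
    "species_A": "ACGTACGT-A",
    "species_B": "ACGTTCGTAA",
    "species_C": "AC-TACGAAA",
}

table = identity_table(alignment)
median_identity = median(table.values())

for (name_a, name_b), identity in table.items():
    print(f"{name_a}\t{name_b}\t{identity:.2f}")
print(f"median\t{median_identity:.2f}")
```

A long-format table like the one printed here (one row per species pair) is a convenient input for violin and jitter plots, since each row becomes one plotted point.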
Created:
2024-12-12