Concatenated analyses_partitioned

Name: Concatenated analyses_partitioned
Creator: figshare
Published: 2024-03-31 00:27:28
License: No description available

DataCite Commons | Updated: 2024-03-31 | Indexed: 2024-08-18

Download link:

https://figshare.com/articles/dataset/Concatenated_analyses_partitioned/24585897

Description:

The demultiplexed FASTQ data were cleaned and trimmed of adapters using Illumiprocessor v.2.0 (Faircloth, 2013), which is based on the package Trimmomatic (Bolger et al., 2014). Data processing was done through a series of scripts available in the PHYLUCE package v.1.7.1 (Faircloth, 2015). Trimmed reads were assembled into contigs using a wrapper script (phyluce_assembly_assemblo_trinity.py) and the program TRINITY (version trinityrnaseq_r20140717) (Grabherr et al., 2011).

We used the PHYLUCE pipeline to identify and extract contigs containing UCE loci. Species-specific contig assemblies were aligned to a FASTA file of all enrichment baits using phyluce_assembly_match_contigs_to_probes.py (min_coverage=50, min_identity=80). A list of UCE loci shared across all taxa was generated using phyluce_assembly_get_match_counts.py. This list was then used to create FASTA files for each UCE locus using phyluce_get_fastas_from_match_counts.py. All sequence data in these FASTA files were aligned using MAFFT (Katoh and Standley, 2013) through phyluce_seqcap_align.py (min. length = 100, no trim) and trimmed using a wrapper script (get_gblocks_trimmed_alignment_from_untrimmed.py) for Gblocks (Castresana, 2000) with the following settings: b1=0.5, b2=0.5, b3=12, b4=7.

After trimming, we created multiple subsets by filtering UCE loci at different levels of taxon occupancy (70%, 80% and 90% taxon completeness) using phyluce_get_only_loci_with_min_taxa.py, and generated summary statistics across all subsets using get_align_summary_data.py. The individual UCE alignments in each subset were then concatenated into a single NEXUS alignment file with the phyluce_align_format_nexus_files_for_raxml.py script for subsequent phylogenetic analyses. SPRUCEUP v2020.2.19 (Borowiec, 2019) was used to remove poorly aligned sequences or sequence fragments. The matrices were trimmed at the following cut-off values: 95%, 97%, 98% and 99%. All analyses in this study are based on the 97% and 98% cut-off values, because the 95% cut-off was too stringent and the 99% cut-off did not trim outlier sequences sufficiently.
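
The taxon-occupancy filter and the partitioned concatenation described above can be illustrated with a minimal Python sketch. This is not the PHYLUCE implementation: the function names, the in-memory data layout (locus name mapped to a taxon-to-sequence dictionary), the round-up rule for the minimum taxon count, and the use of "?" to pad missing taxa are all assumptions made for illustration.

```python
def filter_by_occupancy(alignments, n_taxa, min_percent):
    """Keep loci whose alignment contains at least min_percent of all taxa.

    alignments: dict of locus name -> dict of taxon -> aligned sequence.
    min_percent is an integer (e.g. 70) to avoid floating-point rounding.
    Rounding up is an assumption; the real pipeline may round differently.
    """
    min_taxa = (min_percent * n_taxa + 99) // 100  # integer ceiling
    return {
        locus: seqs
        for locus, seqs in alignments.items()
        if len(seqs) >= min_taxa
    }


def concatenate(alignments, taxa):
    """Concatenate per-locus alignments into one supermatrix.

    Returns (matrix, partitions): matrix maps taxon -> concatenated sequence,
    partitions is a list of (locus, start, end) with 1-based inclusive
    coordinates, one entry per locus. Taxa absent from a locus are padded
    with '?' (an assumed missing-data symbol).
    """
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for locus in sorted(alignments):
        seqs = alignments[locus]
        length = len(next(iter(seqs.values())))
        if any(len(s) != length for s in seqs.values()):
            raise ValueError(f"sequences in {locus} differ in length")
        for taxon in taxa:
            pieces[taxon].append(seqs.get(taxon, "?" * length))
        partitions.append((locus, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy data (hypothetical): three loci, four taxa.
taxa = ["A", "B", "C", "D"]
toy = {
    "uce-1": {"A": "ACGT", "B": "ACGA", "C": "AC-T", "D": "ACGT"},
    "uce-2": {"A": "GG", "B": "GC", "C": "GG"},
    "uce-3": {"A": "TTT", "B": "TTA"},
}

kept = filter_by_occupancy(toy, n_taxa=len(taxa), min_percent=70)
matrix, partitions = concatenate(kept, taxa)
for locus, start, end in partitions:
    print(f"{locus} = {start}-{end}")
```

With four taxa and a 70% threshold, a locus must be present in at least three taxa, so only uce-1 and uce-2 survive here. The partition list records where each locus sits in the concatenated matrix, which is the information a partitioned analysis needs.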

Provider:

figshare

Created:

2023-11-27
