
Low-coverage whole genome raw sequence reads from two ostariophysan lineages. Ostariophysans

NIAID Data Ecosystem, indexed 2026-03-10
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA493643
Resource description:
We purchased specimens of Apteronotus albifrons and Corydoras paleatus from commercial wholesalers in the Los Angeles, CA area, collected tissues following protocols approved by the University of California Los Angeles Institutional Animal Care and Use Committee (Approval 2008-176-21), and extracted DNA from each tissue using a commercial kit following the manufacturer's instructions (DNeasy, Qiagen N.V.). After extraction, we quantified 2 µL of DNA using a Qubit fluorometer (Invitrogen, Inc.) following the manufacturer's protocol, and we visualized 50-100 ng of each extract by electrophoresis through 1.5% (w/v) agarose gel in TBE or TAE. Following this quality check, we prepared 100 µL (~10 ng/µL) aliquots of extracted DNA, and we sheared each sample to 300-600 bp in length using 5-10 cycles of sonication (High; 30 s on; 90 s off) on a BioRuptor (Diagenode, Inc.). We prepared single-indexed sequencing libraries from 0.5-1.0 µg of sheared DNA extract using a commercial library preparation kit (Kapa Biosystems, Inc.) and a set of custom-indexed sequencing adapters (Faircloth and Glenn, 2012). Following library preparation, we size-selected the Corydoras library to span a range of 200-300 bp using agarose-gel-based size selection. We did not size-select the Apteronotus library. We amplified both libraries using 6-10 cycles of PCR, and we purified library amplifications using SPRI beads (Rohland and Reich, 2012). Following purification, we checked the insert size distribution of each library using a BioAnalyzer (Agilent, Inc.), and we quantified libraries using a commercial qPCR quantification kit (Kapa Biosystems, Inc.). We ran each library on a separate lane of Illumina, paired-end, 100 bp sequencing (PE100) by combining each library into a pool of unrelated (and differently indexed) samples, and we sequenced each library pool using an Illumina HiSeq 2500 at the UCLA Neuroscience Genomics Core (UNGC).
We demultiplexed the sequencing data using bcl2fastq 1.8.4, allowing one-base-pair mismatches between the expected and observed indexes (the index sequences we used were robust to ≤ 3 insertion, deletion, or substitution errors). We validated the species identification of each sample by aligning FASTQ reads to a related mtDNA genome using bwa mem v0.7.17 (Li, 2013), reducing the resulting BAM file to aligning reads using samtools v0.1.18 (Li et al., 2009), and converting the BAM file of aligned reads back to paired FASTQ reads using bedtools v2.17.0 (Quinlan and Hall, 2010). We assembled the resulting FASTQ data using spades v3.10.1 (Nurk et al., 2013) with read correction, a k-mer length of 55, and the `--careful` assembly option. From the contig that resulted (which was either equal to or slightly shorter than the general mtDNA sequence length for vertebrates), we used a program within the phyluce package (Faircloth, 2015) to extract the portions of each contig that were similar to COI sequences from Apteronotus (NCBI GenBank AB054132.1:5453-7012) and Corydoras (NCBI GenBank JN988809.1). We then matched these extracted COI sequences against the Species Level Barcode Records in the BOLD Systems Database (http://www.boldsystems.org; search performed August 2018). For Apteronotus albifrons, the top hit (100% sequence identity) validated the species identification, and for Corydoras sp., the top publicly available hit (99.85% sequence identity) was Corydoras paleatus (NCBI GenBank JX111734.1; Rosso et al., 2012).
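The one-mismatch tolerance used during demultiplexing can be illustrated with a minimal Python sketch. This is not the bcl2fastq implementation, and the two 8-nt index sequences and the sample labels below are hypothetical placeholders, not the adapters used for these libraries. The sketch assigns an observed index read to a sample only when exactly one expected index lies within the allowed number of substitutions.

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(
    observed: str,
    expected: Dict[str, str],
    max_mismatches: int = 1,
) -> Optional[str]:
    """Return the sample whose index matches `observed` within
    `max_mismatches` substitutions, or None when no index (or more
    than one index) qualifies."""
    hits = [
        name
        for name, index in expected.items()
        if len(index) == len(observed)
        and hamming(observed, index) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical index sequences, for illustration only.
EXPECTED = {
    "apteronotus": "ACGTACGT",
    "corydoras": "TGCATGCA",
}

exact = assign_sample("ACGTACGT", EXPECTED)      # perfect match
one_off = assign_sample("ACGTACGA", EXPECTED)    # one substitution
rejected = assign_sample("ACGTACAA", EXPECTED)   # two substitutions
```

Because the index set described above tolerates up to three errors, any two expected indexes differ at several positions, so a single mismatch cannot make an observed index equally close to two samples.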
Created:
2018-09-27