.bam alignment files of Illumina and ONT sequencing of pREF plasmid

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.k0p2ngffn


Description:

The expression of genes encompasses their transcription into mRNA followed by translation into protein. In recent years, next-generation sequencing and mass spectrometry methods have profiled DNA, RNA and protein abundance in cells. However, there are currently no reference standards that are compatible across these genomic, transcriptomic and proteomic methods, and provide an integrated measure of gene expression. Here, we use synthetic biology principles to engineer a multi-omics control, termed pREF, that can act as a universal molecular standard for next-generation sequencing and mass spectrometry methods. The pREF sequence encodes 21 synthetic genes that can be in vitro transcribed into spike-in mRNA controls, and in vitro translated to generate matched protein controls. The synthetic genes provide qualitative controls that can measure sensitivity and quantitative accuracy of DNA, RNA and peptide detection. We demonstrate the use of pREF in metagenome DNA sequencing and RNA sequencing experiments and evaluate the quantification of proteins using mass spectrometry. Unlike previous spike-in controls, pREF can be independently propagated and the synthetic mRNA and protein controls can be sustainably prepared by recipient laboratories using common molecular biology techniques. Together, this provides the first universal synthetic standard able to integrate genomic, transcriptomic and proteomic methods.

Methods

Illumina DNA sequencing pREF. We first sequenced neat preparations of pREF. Four replicate libraries were prepared using the KAPA HyperPlus PCR-based kit (Illumina) according to the manufacturer’s instructions. Prepared libraries were quantified on a Qubit (Invitrogen) and verified on the Agilent 2100 Bioanalyzer with the Agilent High Sensitivity DNA Kit (Agilent Technologies). The libraries were then sequenced on a NovaSeq (Illumina). The sequencing was performed at the Kinghorn Centre for Clinical Genomics, Darlinghurst, New South Wales.
ONT DNA sequencing pREF. pREF was linearised using restriction enzymes, and four replicate libraries were prepared for nanopore sequencing with the LSK108 kit (1D ligation) according to the manufacturer’s instructions. The resulting libraries were sequenced on a PromethION instrument at the Kinghorn Centre for Clinical Genomics, Darlinghurst, New South Wales. Base-calling was performed with the ONT Albacore Sequencing Pipeline Software (version 1.2.6).

pREF alignment and k-mer analysis. The four replicate Illumina short-read DNA libraries were aligned to reference sequences containing the pREF plasmid using BWA-MEM2, while the two Illumina RNA libraries (transcribed with SP6 and T7 polymerase) were aligned using bowtie2 (v2.4.0). Long-read DNA and RNA libraries generated by Oxford Nanopore sequencing were aligned to the pREF reference sequence using minimap2 (v2.17-r941) with the parameters 'minimap2 -ax map-ont', a preset optimized for Oxford Nanopore libraries. Alignment files were sorted and indexed using samtools (v1.9), and pysamstats was used to retrieve the coverage and specific error types, such as mismatches, insertions and deletions, at every reference sequence position.
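
The long-read steps described above could be chained roughly as in the following minimal sketch, which only assembles the command lines and does not run them. The file names (`pREF.fa`, `ont_rep1.fastq`), the output naming, and the choice of the pysamstats `variation` statistic are assumptions made for illustration; the methods give only the `minimap2 -ax map-ont` preset and the tool names, not the exact invocations used.

```python
import shlex


def ont_alignment_commands(reference, reads, prefix):
    """Build (but do not run) the commands for one ONT library.

    reference, reads and prefix are hypothetical placeholder names.
    """
    bam = f"{prefix}.sorted.bam"
    # minimap2 with the ONT preset named in the methods; SAM goes to stdout.
    align = ["minimap2", "-ax", "map-ont", reference, reads]
    # samtools sort reads the SAM stream from stdin ("-") and writes sorted BAM.
    sort = ["samtools", "sort", "-o", bam, "-"]
    index = ["samtools", "index", bam]
    # Assumption: the "variation" statistic, which reports per-position
    # coverage, mismatches, insertions and deletions against the reference.
    stats = ["pysamstats", "--type", "variation", "--fasta", reference, bam]
    return {
        "align_sort": f"{shlex.join(align)} | {shlex.join(sort)}",
        "index": shlex.join(index),
        "stats": f"{shlex.join(stats)} > {shlex.quote(prefix + '.variation.tsv')}",
    }


if __name__ == "__main__":
    commands = ont_alignment_commands("pREF.fa", "ont_rep1.fastq", "ont_rep1")
    for step, command in commands.items():
        print(f"{step}: {command}")
```

The printed strings can be pasted into a shell where minimap2, samtools and pysamstats are installed. Returning the commands rather than executing them makes it easy to inspect each step before anything is run on real data.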

Created:

2024-02-05