Recombinant Read Extraction Pipeline Test Input File

Name: Recombinant Read Extraction Pipeline Test Input File
Creator: figshare
Published: 2024-12-05 11:44:48
License: No description available

Source: DataCite Commons. Updated: 2024-12-05. Indexed: 2025-04-19.

Download link:

https://figshare.com/articles/dataset/Recombinant_Read_Extraction_Pipeline_Test_Input_File/27967968/1

Resource description:

Recombinant Read Extraction Pipeline with Test Input Data

Description: This dataset showcases the Recombinant Read Extraction Pipeline, previously developed by us (https://doi.org/10.6084/m9.figshare.26582380), designed for the detection of recombination events in sequencing data. The pipeline enables the alignment of sequence reads to a reference genome, generation of SNP strings, identification of haplotypes, extraction of recombinant sequences, and comprehensive result compilation into an Excel summary for seamless analysis.

Included in this dataset:

- config.json: Configuration file with default settings.
- pipeline_test_reads.fa: A test FASTA file containing simulated recombination and allele replacement events, specifically:
  - Two recombination events, each covered by 15 reads, transitioning between Solanum lycopersicum cv. Moneyberg and Moneymaker haplotypes.
  - One recombination event covered by 20 reads, involving a switch at the extremity of the amplicon analysed from Moneymaker to Moneyberg haplotype.
  - One allele replacement event covered by 20 reads, featuring recombination from Moneymaker to Moneyberg and back to Moneymaker.
  - Wild-type Solanum lycopersicum cv. Moneyberg and Moneymaker sequences.
- final_output.xlsx: Example output summarizing read names, sequences, and read counts.

Usage Instructions:

1. Install Dependencies: Follow the installation guidelines to set up required software and Python libraries (please refer to https://doi.org/10.6084/m9.figshare.26582380).
2. Configure Pipeline: Customize parameters in config.json as needed.
3. Run Pipeline: Execute the pipeline using the provided script to process the test input file.
4. Review Outputs: Examine final_output.xlsx to verify the detection and summarization of recombinant events.

The dataset pipeline_test_reads.fa serves as a control dataset designed to verify the functionality of the Recombinant Read Extraction Pipeline previously described (https://doi.org/10.6084/m9.figshare.26582380). This dataset contains artificially generated "reads" and does not include any genuine DNA sequencing data.

Keywords: Genomic Data Processing, Recombinant Detection, Haplotype Analysis, Bioinformatics Pipeline, SNP Analysis
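
To make the event types in the test file concrete, the sketch below shows one way a per-read SNP string could be sorted into wild-type, recombinant, and allele replacement classes by counting haplotype switches. It is not the pipeline's own code. The encoding is an assumption made for illustration ('A' for a Moneyberg allele, 'B' for a Moneymaker allele, '-' for a position with no call), and the function names count_switches and classify_read are hypothetical.

```python
from itertools import groupby


def count_switches(snp_string: str) -> int:
    """Count haplotype switches along a SNP string.

    Assumed encoding (illustrative only, not taken from the pipeline):
    'A' = Moneyberg allele, 'B' = Moneymaker allele, '-' = no call.
    """
    # Ignore positions without a call, then collapse runs of identical calls.
    calls = [c for c in snp_string if c in "AB"]
    runs = [allele for allele, _ in groupby(calls)]
    # N runs of alternating alleles imply N - 1 switches.
    return max(len(runs) - 1, 0)


def classify_read(snp_string: str) -> str:
    """Label a read by its number of haplotype switches."""
    switches = count_switches(snp_string)
    if switches == 0:
        return "wild-type"           # one parental haplotype throughout
    if switches == 1:
        return "recombinant"         # a single switch between haplotypes
    if switches == 2:
        return "allele replacement"  # switch away and back again
    return "complex"                 # more switches than the simulated events


if __name__ == "__main__":
    for snps in ["AAAA", "BBBA", "BBABB"]:
        print(snps, classify_read(snps))
```

Under these assumptions, a read such as "BBBA" would correspond to the Moneymaker to Moneyberg switch near the end of the amplicon, and "BBABB" to the Moneymaker, Moneyberg, Moneymaker allele replacement pattern.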

Provider:

figshare

Created:

2024-12-05
