Recombinant Read Extraction Pipeline Test Input File
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Recombinant_Read_Extraction_Pipeline_Test_Input_File/27967968
Resource description:
Recombinant Read Extraction Pipeline with Test Input Data
Description:
This dataset showcases the Recombinant Read Extraction Pipeline, previously developed by us (https://doi.org/10.6084/m9.figshare.26582380), designed for the detection of recombination events in sequencing data. The pipeline enables the alignment of sequence reads to a reference genome, generation of SNP strings, identification of haplotypes, extraction of recombinant sequences, and comprehensive result compilation into an Excel summary for seamless analysis.
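The SNP-string and haplotype steps described above can be illustrated with a minimal sketch. This is not the pipeline's own code: the SNP positions and alleles in `SNPS`, the function names, and the labels "B"/"M" are hypothetical, and reads are assumed to be already aligned to the amplicon without gaps.

```python
from itertools import groupby

# Hypothetical SNP table (assumption, not taken from the dataset):
# 0-based amplicon position -> (Moneyberg allele, Moneymaker allele).
SNPS = {3: ("A", "G"), 7: ("C", "T"), 12: ("G", "A"), 18: ("T", "C")}

def snp_string(aligned_read, snps=SNPS):
    """Encode each SNP position as 'B' (Moneyberg), 'M' (Moneymaker) or 'N' (neither/uncovered)."""
    calls = []
    for pos in sorted(snps):
        berg, maker = snps[pos]
        base = aligned_read[pos].upper() if pos < len(aligned_read) else ""
        calls.append("B" if base == berg else "M" if base == maker else "N")
    return "".join(calls)

def classify(snp_str):
    """Label a read by the number of haplotype switches along its SNP string."""
    informative = [c for c in snp_str if c != "N"]
    if not informative:
        return "uninformative"
    # Collapse runs of identical calls: 'BBMM' -> ['B', 'M'] -> one switch.
    switches = len([key for key, _ in groupby(informative)]) - 1
    if switches == 0:
        return "parental"
    if switches == 1:
        return "recombinant"
    if switches == 2:
        return "allele_replacement"  # e.g. Moneymaker -> Moneyberg -> Moneymaker
    return "complex"
```

Under these assumptions, a read whose SNP string is `BBMM` would be labelled a single recombination event, while `MBBM` would be labelled an allele replacement, matching the two event types simulated in the test file.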
Included in this dataset:
- config.json: Configuration file with default settings.
- pipeline_test_reads.fa: A test FASTA file containing simulated recombination and allele replacement events, specifically:
  - Two recombination events, each covered by 15 reads, transitioning between Solanum lycopersicum cv. Moneyberg and Moneymaker haplotypes.
  - One recombination event covered by 20 reads, involving a switch from the Moneymaker to the Moneyberg haplotype at the extremity of the analysed amplicon.
  - One allele replacement event covered by 20 reads, featuring recombination from Moneymaker to Moneyberg and back to Moneymaker.
  - Wild-type Solanum lycopersicum cv. Moneyberg and Moneymaker sequences.
- final_output.xlsx: Example output summarizing read names, sequences, and read counts.

Usage Instructions:
1. Install Dependencies: Follow the installation guidelines to set up the required software and Python libraries (please refer to https://doi.org/10.6084/m9.figshare.26582380).
2. Configure Pipeline: Customize parameters in config.json as needed.
3. Run Pipeline: Execute the pipeline using the provided script to process the test input file.
4. Review Outputs: Examine final_output.xlsx to verify the detection and summarization of recombinant events.

The file pipeline_test_reads.fa serves as a control dataset designed to verify the functionality of the Recombinant Read Extraction Pipeline previously described (https://doi.org/10.6084/m9.figshare.26582380). It contains artificially generated "reads" and does not include any genuine DNA sequencing data.
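As a quick sanity check before or after running the pipeline, the per-sequence read counts in pipeline_test_reads.fa can be tallied independently and compared with the coverage stated above (15, 15, 20 and 20 reads per event). The following is a minimal standard-library sketch, not part of the pipeline; the function names are hypothetical, and it assumes that reads supporting the same event share an identical sequence, which the dataset description does not confirm.

```python
from collections import defaultdict

def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].strip()
            # Keep only the first whitespace-delimited token as the read name.
            name, chunks = (header.split()[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def count_by_sequence(records):
    """Group reads by identical sequence; return (sequence, count, names), most frequent first."""
    groups = defaultdict(list)
    for name, seq in records:
        groups[seq.upper()].append(name)
    rows = [(seq, len(names), names) for seq, names in groups.items()]
    return sorted(rows, key=lambda row: -row[1])

# Example usage (assumes the file is in the working directory):
# with open("pipeline_test_reads.fa") as handle:
#     for seq, count, names in count_by_sequence(read_fasta(handle)):
#         print(count, names[0], seq[:40])
```

If the counts differ from those listed for the simulated events, that may simply mean the reads for one event are not sequence-identical; in that case the pipeline's own final_output.xlsx remains the reference.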
Keywords: Genomic Data Processing, Recombinant Detection, Haplotype Analysis, Bioinformatics Pipeline, SNP Analysis
Created:
2024-12-05



