Recombinant Read Extraction Pipeline Test Input File
Source: DataCite Commons | Updated: 2024-12-05 | Indexed: 2025-04-19
Download link:
https://figshare.com/articles/dataset/Recombinant_Read_Extraction_Pipeline_Test_Input_File/27967968/1
Resource description:
Recombinant Read Extraction Pipeline with Test Input Data

Description:
This dataset showcases the Recombinant Read Extraction Pipeline, which we developed previously (https://doi.org/10.6084/m9.figshare.26582380) to detect recombination events in sequencing data. The pipeline aligns sequence reads to a reference genome, generates SNP strings, identifies haplotypes, extracts recombinant sequences, and compiles the results into an Excel summary for analysis.

Included in this dataset:
- config.json: Configuration file with default settings.
- pipeline_test_reads.fa: A test FASTA file containing simulated recombination and allele replacement events, specifically:
  - Two recombination events, each covered by 15 reads, transitioning between Solanum lycopersicum cv. Moneyberg and Moneymaker haplotypes.
  - One recombination event covered by 20 reads, involving a switch from the Moneymaker to the Moneyberg haplotype at the extremity of the analysed amplicon.
  - One allele replacement event covered by 20 reads, featuring recombination from Moneymaker to Moneyberg and back to Moneymaker.
  - Wild-type Solanum lycopersicum cv. Moneyberg and Moneymaker sequences.
- final_output.xlsx: Example output summarizing read names, sequences, and read counts.

Usage Instructions:
1. Install Dependencies: Follow the installation guidelines to set up the required software and Python libraries (please refer to https://doi.org/10.6084/m9.figshare.26582380).
2. Configure Pipeline: Customize parameters in config.json as needed.
3. Run Pipeline: Execute the pipeline using the provided script to process the test input file.
4. Review Outputs: Examine final_output.xlsx to verify the detection and summarization of recombinant events.

The file pipeline_test_reads.fa serves as a control dataset for verifying the functionality of the Recombinant Read Extraction Pipeline described previously (https://doi.org/10.6084/m9.figshare.26582380). It contains artificially generated "reads" and does not include any genuine DNA sequencing data.

Keywords: Genomic Data Processing, Recombinant Detection, Haplotype Analysis, Bioinformatics Pipeline, SNP Analysis
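To make the three kinds of test reads concrete, the following is a minimal sketch of how a per-read SNP string might be sorted into parental, recombinant, and allele replacement classes. It is not the pipeline's own code. The encoding is an assumption made for illustration: each character is one SNP position, "A" marks a Moneyberg allele and "B" a Moneymaker allele, and the function names `haplotype_blocks` and `classify_read` are hypothetical. The actual SNP string format and classification rules are defined in the pipeline release referenced above.

```python
from itertools import groupby


def haplotype_blocks(snp_string):
    """Collapse a SNP string into its consecutive haplotype blocks.

    Example: "BBAABB" becomes ["B", "A", "B"].
    """
    return [allele for allele, _ in groupby(snp_string)]


def classify_read(snp_string):
    """Classify a read by the number of haplotype switches along it.

    Assumed encoding (hypothetical): "A" = Moneyberg, "B" = Moneymaker.
    """
    if not snp_string:
        raise ValueError("empty SNP string")
    blocks = haplotype_blocks(snp_string)
    switches = len(blocks) - 1
    if switches == 0:
        # Every SNP matches a single parent: a wild-type read.
        return "parental"
    if switches == 1:
        # One switch between haplotypes: a single recombination event.
        return "recombination"
    if switches == 2 and blocks[0] == blocks[-1]:
        # Switch to the other haplotype and back: allele replacement.
        return "allele replacement"
    return "complex"


if __name__ == "__main__":
    for snps in ("AAAAAAAA", "AAAABBBB", "BBBBBBBA", "BBAAAABB"):
        print(snps, classify_read(snps))
```

Under this assumed encoding, "BBBBBBBA" corresponds to a switch at the extremity of the amplicon from Moneymaker to Moneyberg, and "BBAAAABB" to the Moneymaker to Moneyberg to Moneymaker allele replacement described for the test file.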
Provider:
figshare
Created:
2024-12-05



