SDFA: A standardized decomposition format based framework for efficient and robust analyses of structural variants in population genomic studies

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/13293671

Resource description:

We initially collected data for 10 individuals from NCBI: the HG002 family pedigree (HG002 [son], HG003 [father], HG004 [mother]), the HG005 family pedigree (HG005 [son], HG006 [father], HG007 [mother]), and the NA12878, HG00096, HG00512 and CHM13 subjects. We then built pipelines from PacBio (CLR: Continuous Long Read; CCS: Circular Consensus Sequencing) and Nanopore (ONT) data, 5 aligners and 10 callers, with most parameters set to their default values. After excluding 6 invalid pipelines (pbmm2-NanoVar, lra-Picky, lra-delly, lra-NanoVar, lra-NanoSV, lra-pbsv), we obtained 1100 VCF files.
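The pipeline counts in the description can be checked with a short calculation. The sketch below is only illustrative: it uses the counts and the six invalid aligner-caller pairs named above, and it assumes (the description does not say so) that every valid pipeline contributed the same number of VCF files. The function name `valid_pipeline_count` is hypothetical.

```python
# Counts taken from the description above.
N_ALIGNERS = 5
N_CALLERS = 10
TOTAL_VCF_FILES = 1100

# The six (aligner, caller) pairs the description lists as invalid.
INVALID_PIPELINES = {
    ("pbmm2", "NanoVar"),
    ("lra", "Picky"),
    ("lra", "delly"),
    ("lra", "NanoVar"),
    ("lra", "NanoSV"),
    ("lra", "pbsv"),
}


def valid_pipeline_count(n_aligners, n_callers, invalid):
    """Number of aligner-caller pairs left after dropping the invalid ones."""
    return n_aligners * n_callers - len(invalid)


valid = valid_pipeline_count(N_ALIGNERS, N_CALLERS, INVALID_PIPELINES)

# Assumption (not stated in the source): each valid pipeline produced
# the same number of VCF files.
files_per_pipeline, remainder = divmod(TOTAL_VCF_FILES, valid)

print(valid, files_per_pipeline, remainder)  # 44 25 0
```

Under that assumption, the 44 valid pipelines divide the 1100 files evenly, at 25 files per pipeline. How those 25 inputs break down by individual and sequencing platform is not given in the description.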

Created:

2024-08-11
