Data and scripts for "Longitudinal structural variant phylogenies in metastatic prostate cancer" (Liu et al., 2026)
Source: Mendeley Data. Indexed 2026-04-18.
Download link:
https://data.mendeley.com/datasets/2nhhdjx225
Description:
This study evaluates whether structural variants (SVs) capture key evolutionary and resistance signals in metastatic castration-resistant prostate cancer (mCRPC). We analyzed paired longitudinal tumor whole-genome sequencing (WGS), collected before and during bipolar androgen therapy (pre-BAT and on-BAT), with matched normals from COMBAT subjects. SVCFit estimated the structural-variant cellular fraction (SVCF), clustered SVs by prevalence, and inferred phylogenies and clone proportions. The results show branched evolution and therapy-associated clonal shifts: highly rearranged lineages contracted on BAT while SV-defined clones expanded.
This repository provides source data and workflows for "Longitudinal structural variant phylogenies define tumor evolution under therapeutic selection pressure in metastatic prostate cancer" (Liu et al., 2026).
1. Clinical Cohort Data (COMBAT)
-- annotsv: SV annotations via AnnotSV.
-- survivor: Per-subject consensus SV calls (Manta + GRIDSS via SURVIVOR) and SVCFit-annotated VCFs.
-- facet: CNV segments from FACETS.
-- svcfit/cluster_v3 (per-pair clustering/, COMBAT/SVCFit_output, tree/; log/): SVCF/CCF beds, DP-GMM clusters, trees, bootstraps.
-- circos: Circos panels for representative subjects.
-- samp_pur.csv, pair.bed, reference resources (cytoBand, ENCODE cCRE, GeneHancer, mCRPC genes, sv_gene_analyze).
- Raw sequencing and germline calls available from the corresponding author on request; not publicly shared due to consent restrictions.
2. Synthetic Benchmarking (VISOR_benchmark)
-- ground_truth/sv_beds: Per-clone BEDs of spiked SV breakpoints, types, zygosity.
-- input_data/beds, input_data/snp_vcfs: Haplotype/SV BEDs and germline het-SNP VCFs.
-- replicates (rep1–rep30, each exp1–exp5): SVTyper VCFs, GATK4 SNPs, FACETS CNVs, SVclone/ccube .RData.
-- depth: Read-depth probes at SV breakpoints across purities.
-- output, figures; scripts/genome_setup, scripts/pipeline, scripts/helper.
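The replicate layout listed above (rep1–rep30, each with exp1–exp5) can be walked programmatically, for example to check that a download unpacked completely. The following is a minimal Python sketch, not part of the repository. It assumes the directories are nested as `replicates/repN/expM`; that nesting, the root path, and the helper names `replicate_dirs` and `missing_dirs` are illustrative assumptions and may need adjusting to the actual archive.

```python
from pathlib import Path
from typing import Iterator, List


def replicate_dirs(root: Path, n_reps: int = 30, n_exps: int = 5) -> Iterator[Path]:
    """Yield every expected rep/exp directory under `root`.

    ASSUMPTION: directories are nested as <root>/repN/expM. Change the
    path template if the unpacked archive is organized differently.
    """
    for rep in range(1, n_reps + 1):
        for exp in range(1, n_exps + 1):
            yield root / f"rep{rep}" / f"exp{exp}"


def missing_dirs(root: Path) -> List[Path]:
    """Return the expected directories that do not exist on disk."""
    return [d for d in replicate_dirs(root) if not d.is_dir()]


if __name__ == "__main__":
    # Hypothetical location of the unpacked download.
    root = Path("VISOR_benchmark") / "replicates"
    expected = list(replicate_dirs(root))
    print(f"{len(expected)} expected directories")
    print(f"{len(missing_dirs(root))} missing under {root}")
```

The defaults encode the 30 × 5 grid stated in the listing; passing other values for `n_reps` and `n_exps` lets the same helper cover a partial download.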
3. Longitudinal Phylogeny Benchmark (Phylogeny_benchmark)
-- input_data: hack, norm_short (normal/o_normal × clone × h1/h2), reference, resource.
-- bootstrap/S1 (c50p10–c50p80; b1–b4, exp1–exp5 with clustering/t1/t2/tree): SVCFit estimates, paired-CCF clustering, trees, clone proportions.
-- output/boots0_4; scripts/a2_coverage_correlation.
4. In Silico Validation (Prostate_mixture)
-- ground_truth: Truth SV VCFs from source tumors.
-- bootstrap (rep1–rep30; svtyp, svclone, facet, SNP, prostate_vcfs): Manta SVs, FACETS CNVs, GATK4 SNPs, SVclone outputs for 3-, 4-, 5-clone mixtures.
-- output; scripts/helper.
5. Figure_script: R Markdown notebooks (COMBAT, VISOR_benchmark, Phylogeny_benchmark, Prostate_mixture, Read_depth) reproducing manuscript figures. Per-section scripts/ provide SLURM pipelines for alignment, SV calling, CNV/SNP, SVCFit/SVclone.
Created:
2026-05-13



