METHYLATION SITE ANALYSIS USING methylKit
Source: Figshare | Updated: 2025-05-21 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/METHYLATION_SITE_ANALYSIS_USING_methylKit/29120102
Description:
SUMMARY
This script processes bisulfite sequencing data to identify significantly methylated CpG sites across multiple developmental timepoints and conditions in Nasonia vitripennis. It performs:
- Read filtering
- Merging methylation data
- Binomial testing per sample
- FDR correction
- Site selection
- Methylation profile visualization

ORIGIN
Script developed using `methylKit` for differential methylation analysis on whole-genome bisulfite data from the Erin Diapause project.

KEY STEPS
1. Read methylation data (`*.cov` files) using `methRead()`
2. Filter CpGs by minimum and maximum coverage thresholds
3. Merge all samples into a common methylation object using `unite()`
4. Conduct one-tailed binomial tests (p > background) per CpG site for each sample
5. Apply Benjamini-Hochberg FDR correction per sample
6. Filter for significant sites (FDR < ...)
7. Collate significant sites across all samples
8. Extract a methylBase object from the merged sites
9. Export CpG-level data to `Total_Methylated_Bases.txt`
10. Generate correlation, clustering, and PCA plots across all retained samples

INPUT FILES
- `*.CpG_report.merged_CpG_evidence.cov`: Methylation files from Bismark (one per sample)

SAMPLES
- 40 samples representing combinations of treatment (Control/Diapause) and timepoint (6d to 30d), with replicates
- Sample names: D6C1, D6D1, D12C1, ..., D6C4
- Treatment vector: `treat_conditions` defines relative order

OUTPUT FILES
- `step1meth.RData`, `step2meth.RData`, `step3meth.RData`, `finalmeth.RData`: R environments at major steps
- `Total_Methylated_Bases.txt`: Final table of significant methylated sites across all samples
- `CpG_Correlation.pdf`: Pairwise methylation correlation matrix
- `CpG_Cluster.pdf`: Hierarchical clustering dendrogram
- `CpG_PCA.pdf`: PCA plot of CpG methylation patterns

SOFTWARE REQUIREMENTS
- R package: `methylKit` (v1.22.1+ recommended)
- Additional packages: `stats`, `utils`, `grDevices` (base R)

NOTES
- Filtering is performed with `mincov=10` and a high-coverage threshold at the 99.9th percentile
- Binomial tests are one-tailed (test for greater-than-background methylation)
- Each sample uses a slightly adjusted p-value threshold (e.g., 0.004–0.005)
- Data are not destranded or normalized at this stage

LIMITATIONS
- Statistical tests are applied per sample; no groupwise differential analysis is performed here
- Only CpGs passing FDR < ...
- Manual sample-by-sample processing of 40 samples; automation is possible for scalability

CONTACT
Eamonn Mallon ebm3@le.ac.uk
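
EXAMPLE (ILLUSTRATIVE SKETCH)
The per-sample testing described in steps 4 to 6 under KEY STEPS can be sketched in base R as below. This is a minimal illustration, not the deposited script. The toy data frame `sites`, the value of `background_rate`, and the value of `fdr_cutoff` are all assumptions made for the example; the description above does not give the background rate or the FDR cutoff. The `coverage` and `numCs` column names follow those returned by methylKit's `getData()` for a single sample.

```r
# Toy per-site counts for one sample (hypothetical values).
sites <- data.frame(
  chr      = c("chr1", "chr1", "chr2", "chr2"),
  start    = c(101L, 250L, 77L, 900L),
  coverage = c(20L, 15L, 30L, 12L),  # total reads covering the CpG
  numCs    = c(18L, 0L, 3L, 1L)      # reads supporting methylation
)

background_rate <- 0.005  # ASSUMPTION: placeholder background methylation rate
fdr_cutoff      <- 0.05   # ASSUMPTION: placeholder cutoff, not taken from the source

# One-tailed binomial test per site: is the methylated fraction
# greater than the background rate?
sites$pvalue <- mapply(
  function(x, n) {
    binom.test(x, n, p = background_rate, alternative = "greater")$p.value
  },
  sites$numCs, sites$coverage
)

# Benjamini-Hochberg correction applied within this one sample.
sites$fdr <- p.adjust(sites$pvalue, method = "BH")

# Keep only the sites that pass the FDR cutoff.
significant <- sites[sites$fdr < fdr_cutoff, ]
print(significant[, c("chr", "start", "coverage", "numCs", "fdr")])
```

In a full workflow, the same test would presumably be repeated for each of the 40 samples, and the significant sites would then be collated across samples (step 7). Wrapping the test in a function and applying it over a list of samples is one way to address the manual sample-by-sample processing noted under LIMITATIONS.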
Created:
2025-05-21



