
METHYLATION SITE ANALYSIS USING methylKit

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/METHYLATION_SITE_ANALYSIS_USING_methylKit/29120102
Description:
SUMMARY
This script processes bisulfite sequencing data to identify significantly methylated CpG sites across multiple developmental timepoints and conditions in Nasonia vitripennis. It performs:
- Read filtering
- Merging methylation data
- Binomial testing per sample
- FDR correction
- Site selection
- Methylation profile visualization

ORIGIN
Script developed using `methylKit` for differential methylation analysis on whole-genome bisulfite data from the Erin Diapause project.

KEY STEPS
1. Read methylation data (`*.cov` files) using `methRead()`
2. Filter CpGs by minimum and maximum coverage thresholds
3. Merge all samples into a common methylation object using `unite()`
4. Conduct one-tailed binomial tests (p > background) per CpG site for each sample
5. Apply Benjamini-Hochberg FDR correction per sample
6. Filter for significant sites (FDR < 0.05)
7. Collate significant sites across all samples
8. Extract a methylBase object from the merged sites
9. Export CpG-level data to `Total_Methylated_Bases.txt`
10. Generate correlation, clustering, and PCA plots across all retained samples

INPUT FILES
- `*.CpG_report.merged_CpG_evidence.cov`: Methylation files from Bismark (one per sample)

SAMPLES
- 40 samples representing combinations of treatment (Control/Diapause) and timepoint (6d to 30d), with replicates
- Sample names: D6C1, D6D1, D12C1, ..., D6C4
- Treatment vector: `treat_conditions` defines relative order

OUTPUT FILES
- `step1meth.RData`, `step2meth.RData`, `step3meth.RData`, `finalmeth.RData`: R environments saved at major steps
- `Total_Methylated_Bases.txt`: Final table of significant methylated sites across all samples
- `CpG_Correlation.pdf`: Pairwise methylation correlation matrix
- `CpG_Cluster.pdf`: Hierarchical clustering dendrogram
- `CpG_PCA.pdf`: PCA plot of CpG methylation patterns

SOFTWARE REQUIREMENTS
- R package: `methylKit` (v1.22.1+ recommended)
- Additional packages: `stats`, `utils`, `grDevices` (base R)

NOTES
- Filtering is performed with `mincov=10` and a high-coverage threshold at the 99.9th percentile
- Binomial tests are one-tailed (test for greater-than-background methylation)
- Each sample uses a slightly adjusted p-value threshold (e.g., 0.004–0.005)
- Data are not destranded or normalized at this stage

LIMITATIONS
- Statistical tests are applied per sample; no groupwise differential analysis is performed here
- Only CpGs passing FDR < 0.05 are retained
- All 40 samples are processed manually, one sample at a time; automation is possible for scalability

CONTACT
Eamonn Mallon ebm3@le.ac.uk
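EXAMPLE (ILLUSTRATIVE)
Steps 4 to 6 of KEY STEPS (per-sample one-tailed binomial test, Benjamini-Hochberg correction, FDR < 0.05 filter) can be sketched in base R as below. This is a minimal sketch, not the deposited script: the toy table `cpg`, its column names, and the `background` value are assumptions made for illustration. In particular, reading the per-sample 0.004–0.005 value from NOTES as the binomial background probability is an assumption; the description does not say how that value is used or derived.

```r
# Toy counts for ONE sample: total reads (coverage) and methylated reads
# (numCs) per CpG. All values are invented for illustration.
cpg <- data.frame(
  chr      = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  start    = c(101L, 250L, 990L, 45L, 300L),
  coverage = c(20L, 15L, 30L, 12L, 25L),
  numCs    = c(0L, 1L, 18L, 0L, 10L)
)

# Assumed background methylation probability for this sample (hypothetical).
background <- 0.005

# One-tailed binomial test per site: P(X >= numCs) when
# X ~ Binomial(coverage, background). pbinom(q, lower.tail = FALSE) returns
# P(X > q), so q = numCs - 1 gives the "greater or equal" tail.
cpg$pval <- pbinom(cpg$numCs - 1L, size = cpg$coverage, prob = background,
                   lower.tail = FALSE)

# Benjamini-Hochberg correction across the sites of this sample.
cpg$fdr <- p.adjust(cpg$pval, method = "BH")

# Keep sites called methylated at FDR < 0.05.
significant <- cpg[cpg$fdr < 0.05, ]
print(significant)
```

With this toy table, only the two sites carrying many methylated reads pass the filter; the site with a single methylated read out of 15 does not. Repeating the same steps for each sample and taking the union of the retained (chr, start) pairs would correspond to step 7.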
Created:
2025-05-21