METHYLATION SITE ANALYSIS USING methylKit

Source: DataCite Commons. Updated 2025-06-01; indexed 2025-09-08.
Download link:
https://figshare.com/articles/dataset/METHYLATION_SITE_ANALYSIS_USING_methylKit/29120102/1
Description:
SUMMARY
This script processes bisulfite sequencing data to identify significantly methylated CpG sites across multiple developmental timepoints and conditions in Nasonia vitripennis. It performs:
- Read filtering
- Merging methylation data
- Binomial testing per sample
- FDR correction
- Site selection
- Methylation profile visualization

ORIGIN
Script developed using `methylKit` for differential methylation analysis on whole-genome bisulfite data from the Erin Diapause project.

KEY STEPS
1. Read methylation data (`*.cov` files) using `methRead()`
2. Filter CpGs by minimum and maximum coverage thresholds
3. Merge all samples into a common methylation object using `unite()`
4. Conduct one-tailed binomial tests (alternative: p > background) per CpG site for each sample
5. Apply Benjamini-Hochberg FDR correction per sample
6. Filter for significant sites (FDR < 0.05)
7. Collate significant sites across all samples
8. Extract a methylBase object from the merged sites
9. Export CpG-level data to `Total_Methylated_Bases.txt`
10. Generate correlation, clustering, and PCA plots across all retained samples

INPUT FILES
- `*.CpG_report.merged_CpG_evidence.cov`: Methylation files from Bismark (one per sample)

SAMPLES
- 40 samples representing combinations of treatment (Control/Diapause) and timepoint (6d to 30d), with replicates
- Sample names: D6C1, D6D1, D12C1, ..., D6C4
- Treatment vector: `treat_conditions` defines relative order

OUTPUT FILES
- `step1meth.RData`, `step2meth.RData`, `step3meth.RData`, `finalmeth.RData`: R environments at major steps
- `Total_Methylated_Bases.txt`: Final table of significant methylated sites across all samples
- `CpG_Correlation.pdf`: Pairwise methylation correlation matrix
- `CpG_Cluster.pdf`: Hierarchical clustering dendrogram
- `CpG_PCA.pdf`: PCA plot of CpG methylation patterns

SOFTWARE REQUIREMENTS
- R package: `methylKit` (v1.22.1+ recommended)
- Additional packages: `stats`, `utils`, `grDevices` (base R)

NOTES
- Filtering is performed with `mincov=10` and a high-coverage threshold at the 99.9th percentile
- Binomial tests are one-tailed (test for greater-than-background methylation)
- Each sample uses a slightly adjusted p-value threshold (e.g., 0.004–0.005)
- Data are not destranded or normalized at this stage

LIMITATIONS
- Statistical tests are applied per sample; no groupwise differential analysis is performed here
- Only CpGs passing FDR < 0.05 are retained
- The 40 samples are processed manually, one by one; automation is possible for scalability

CONTACT
Eamonn Mallon ebm3@le.ac.uk
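
EXAMPLE (ILLUSTRATIVE)
The per-sample testing described in KEY STEPS 4 to 7 can be sketched in base R as below. This is a minimal sketch, not the deposited script: the read counts are invented toy data, the background rate of 0.005 is an assumed value (the description does not state the one used), the helper name `call_methylated` is hypothetical, and taking the union of per-sample hits is only one possible reading of "collate".

```r
# Toy data (invented): methylated read count (numCs) and total coverage
# for five CpG sites in two samples named after the listing's examples.
counts <- list(
  D6C1 = data.frame(site = paste0("cpg", 1:5),
                    numCs = c(0, 1, 9, 14, 1),
                    coverage = c(20, 25, 12, 15, 30),
                    stringsAsFactors = FALSE),
  D6D1 = data.frame(site = paste0("cpg", 1:5),
                    numCs = c(0, 11, 1, 13, 0),
                    coverage = c(18, 14, 22, 16, 27),
                    stringsAsFactors = FALSE)
)

# Assumed background methylation rate; not taken from the source.
background <- 0.005

# Hypothetical helper: one-tailed binomial test per site, then
# Benjamini-Hochberg correction within the sample, then an FDR cut-off.
call_methylated <- function(df, p0, alpha = 0.05) {
  pvals <- mapply(function(x, n) {
    binom.test(x, n, p = p0, alternative = "greater")$p.value
  }, df$numCs, df$coverage)
  fdr <- p.adjust(pvals, method = "BH")
  df$site[fdr < alpha]
}

# Test each sample independently, then collate (here: union of sites).
sig_per_sample <- lapply(counts, call_methylated, p0 = background)
sig_sites <- sort(unique(unlist(sig_per_sample, use.names = FALSE)))
print(sig_sites)
```

Because the correction is applied within each sample, a site only needs to pass in one sample to be kept under this union reading; an intersection or a "significant in at least k samples" rule would be equally easy to substitute at the final line.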
Provider:
figshare
Created:
2025-05-21