DNA methylation reference datasets for multiple cell line gDNA samples and trained Autogluon model for MethCali
Source: Zenodo. Updated: 2025-09-26. Indexed: 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.17197026
Description:
Because the reference datasets are too large to open in text editors or Microsoft Excel, we provide the annotated reference datasets in a binary format.
The D5/D6/F7/M8/BC/BL.parquet.zst files contain the reference datasets of the corresponding cell line gDNA samples, while HF.parquet.zst contains only the curated cytosines. These files can be loaded into memory with mainstream Python data analysis libraries such as pyarrow, polars, pandas, pyspark, and narwhals, or with the corresponding libraries in other programming languages, for example the R packages arrow, polars, and nanoparquet.
The "beta_pyro" column contains the beta values of cytosines calibrated by pyrosequencing, while the "beta" column contains the original beta values derived from our NGS reference datasets.
models.tar.zst contains two model directories that can be used with MethCali to calibrate the beta values in NGS bedgraph files.
For more information about the datasets and models, see https://github.com/YuanfengZhang/dna_methylation_smk and https://github.com/YuanfengZhang/MethCali.
Provider:
Zenodo
Created:
2025-09-26



