
HG38_PD_PHannotations.xlsx

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/HG38_PD_PHannotations_xlsx/27931557
Resource description:
This submission contains gene annotations for an Illumina microarray (HorvathMammalMethylChip40) for the human genome (HG38) and the genomes of two bat species, Phyllostomus discolor (HLphyDis3) and P. hastatus (TTU_PhHast_1.1). The array design is available from the Gene Expression Omnibus (GEO) at NCBI as platform GPL28271. The alignment was done using the QuasR package (Gaidatzis et al., 2015), assuming bisulfite conversion treatment of the genomic DNA. For each species' genome sequence, QuasR creates an in-silico bisulfite-treated version of the genome. The set of nucleotide sequences of the designed probes, which includes degenerate base positions due to the bisulfite conversion, was expanded into a larger set of nucleotide sequences representing every possible combination of degenerate bases. We then ran QuasR (a wrapper for Bowtie2) with parameters -k 2 --strata --best -v 3 and bisulfite = "undir" to align the enlarged set of probe sequences to each prepared genome. From these files, we collected only alignments where the entire length of the probe perfectly matched the genome sequence (i.e., CIGAR string 50M and flag XM=0). Following the alignment, the CpGs were annotated based on the distance to the closest transcriptional start site (TSS) using the ChIPseeker package (Yu et al., 2015). A GFF file was created using these positions, sorted by scaffold and position, and compared to the location of each probe in BAM format. We report only probes whose variants mapped to a single unique locus in a particular genome. The genomic location of each CpG is categorized as intergenic, 3' UTR, 5' UTR, promoter region (minus 10 kb to plus 1,000 bp from the nearest TSS), exon, or intron. The file includes the location of androgen-sensitive sites as described by Sugrue et al. (2021).

Gaidatzis, D., Lerch, A., Hahne, F., and Stadler, M.B. (2015). QuasR: quantification and annotation of short reads in R. Bioinformatics 31, 1130-1132.
Sugrue, V. J., Zoller, J. A., Narayan, P., Lu, A. T., Ortega-Recalde, O. J., Grant, M. J., . . . Horvath, S. (2021). Castration delays epigenetic aging and feminizes DNA methylation at androgen-regulated loci. eLife, 10. doi:10.7554/eLife.64932
Yu, G., Wang, L.G., and He, Q.Y. (2015). ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382-2383.
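
Two steps from the description, expanding degenerate probe sequences and keeping only probes with a single perfectly matched locus, can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which ran in R with QuasR. It assumes that degenerate positions are written as standard IUPAC ambiguity codes and that alignments have already been parsed into records. The function names and the record keys (probe, chrom, pos, cigar, xm) are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes. That the probe sequences use
# exactly this encoding is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every concrete sequence encoded by a probe with degenerate bases."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


def unique_perfect_probes(alignments, probe_length=50):
    """Map each probe to its locus if its perfect alignments hit exactly one locus.

    `alignments` is an iterable of dicts with hypothetical keys:
    'probe' (ID of the original probe, shared by all its expanded variants),
    'chrom', 'pos', 'cigar', and 'xm' (value of the XM tag).
    """
    loci = {}
    for aln in alignments:
        # Keep only full-length matches, as in the description (50M and XM=0).
        if aln["cigar"] == f"{probe_length}M" and aln["xm"] == 0:
            loci.setdefault(aln["probe"], set()).add((aln["chrom"], aln["pos"]))
    # Drop probes whose variants map to more than one locus.
    return {probe: next(iter(hits)) for probe, hits in loci.items() if len(hits) == 1}
```

For example, `expand_degenerate("AYG")` returns `["ACG", "ATG"]`, and a probe whose expanded variants all align to the same position is kept, while one with perfect hits at two different positions is dropped.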
Created:
2024-12-12