
RegEl Database: text-mined regulatory elements from the literature and their associations to genes and disease

NIAID Data Ecosystem, indexed 2026-03-14
Download link:
https://zenodo.org/record/6418367
Resource description:
@article{garda2022regel,
  title={RegEl corpus: identifying DNA regulatory elements in the scientific literature},
  author={Garda, Samuele and Lenihan-Geels, Freyda and Proft, Sebastian and Hochmuth, Stefanie and Sch{\"u}lke, Markus and Seelow, Dominik and Leser, Ulf},
  journal={Database},
  volume={2022},
  year={2022},
  publisher={Oxford Academic}
}

# RegEl PubMed Database

This database contains the annotations generated by running [HunFlair](https://github.com/flairNLP/flair/blob/master/resources/docs/HUNFLAIR.md) models trained on the [RegEl corpus](https://zenodo.org/record/5776679) over >20M PubMed abstracts. Pairing these annotations with those provided by PubTator yields a large text-mining database of regulatory elements associated with genes (normalized to NCBI Gene IDs) and diseases (normalized to either MeSH or OMIM).

The tables composing the database are:

* abstracts.db:
  - pmid = PubMed ID of the given abstract
  - sid = sentence ID within the given abstract (from 0 to # of sentences)
  - text = text of the given sentence
* gene.db and disease.db:
  - pmid = PubMed ID of the given abstract
  - sid = sentence ID within the given abstract (from 0 to # of sentences)
  - etype = entity type (enhancer, promoter, TFBS)
  - ann_text = mention of the regulatory element as found in the abstract
  - start = character position at which the mention begins
  - end = character position at which the mention ends
  - score = model's confidence
  - cui = gene or disease identifier
  - cui_symbol = official symbol of cui (if available)
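As a rough illustration of how the tables described above fit together, the sketch below joins gene annotations to their sentences on the shared `pmid` and `sid` columns. It makes several assumptions the description does not confirm: that the `.db` files are SQLite, and that the tables are named `abstracts` and `gene`. To stay self-contained it builds a toy in-memory database with made-up rows instead of opening the real files; `build_toy_db` and `mentions_for_gene` are hypothetical helper names.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory stand-in with the documented columns.

    Assumption: table names and SQLite storage are guesses; all rows
    below are invented placeholders, not real RegEl annotations.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE abstracts (pmid INTEGER, sid INTEGER, text TEXT)")
    # "start" and "end" are quoted because END is an SQL keyword.
    conn.execute(
        "CREATE TABLE gene (pmid INTEGER, sid INTEGER, etype TEXT, "
        'ann_text TEXT, "start" INTEGER, "end" INTEGER, score REAL, '
        "cui TEXT, cui_symbol TEXT)"
    )
    conn.executemany(
        "INSERT INTO abstracts VALUES (?, ?, ?)",
        [
            (1, 0, "The GENE_A promoter is active."),
            (1, 1, "A distal enhancer may regulate GENE_A."),
        ],
    )
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [
            # Offsets are illustrative values for the toy sentences only.
            (1, 0, "promoter", "GENE_A promoter", 4, 19, 0.9, "0001", "GENE_A"),
            (1, 1, "enhancer", "distal enhancer", 2, 17, 0.3, "0001", "GENE_A"),
        ],
    )
    return conn


def mentions_for_gene(conn, symbol, min_score=0.5):
    """Return (pmid, sid, etype, ann_text, score, sentence) rows for a gene
    symbol, keeping only annotations at or above the confidence threshold."""
    query = """
        SELECT g.pmid, g.sid, g.etype, g.ann_text, g.score, a.text
        FROM gene AS g
        JOIN abstracts AS a ON a.pmid = g.pmid AND a.sid = g.sid
        WHERE g.cui_symbol = ? AND g.score >= ?
        ORDER BY g.score DESC
    """
    return conn.execute(query, (symbol, min_score)).fetchall()


if __name__ == "__main__":
    for row in mentions_for_gene(build_toy_db(), "GENE_A"):
        print(row)
```

If the real files do turn out to be SQLite, the same query could presumably be run after opening one file and attaching the other with `ATTACH DATABASE`; the threshold of 0.5 is an arbitrary choice for the example, not a recommended cutoff.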
Created:
2022-11-28