Bangla-REX: A Distinct Dataset for Relation Extraction
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/m4r5nkbm9c
Description:
The dataset builds on the premise that structured knowledge bases and annotated corpora are essential for effective relation extraction. To generate it, we compiled a Bangla Knowledge Base (KB) of 63,256 entries, which serves as the foundation for automatically labeling text with relation tags. The corpus comprises 90,441 text entries annotated with Named Entity Recognition (NER) and Part-of-Speech (POS) tags, so it can be used directly for relation extraction tasks.
Additionally, we developed mnemonics for 440 distinct locations in Bangla, tailored to improve performance in location-based relation extraction. These mnemonics are particularly useful in distant supervision-based relation extraction, where they help establish clear associations between locations and their corresponding entities or contexts.
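To illustrate how a KB can automate relation labeling, here is a minimal sketch of distant supervision in Python. It is not the authors' pipeline: the `Triple` layout, the `distant_label` helper, the `capital_of` tag, and the sample sentences are all assumptions made for illustration, and entity matching is reduced to plain substring search rather than the NER tags shipped with the corpus.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One hypothetical KB entry: (head entity, relation tag, tail entity)."""
    head: str
    relation: str
    tail: str


def distant_label(sentence: str, kb: List[Triple]) -> List[Triple]:
    """Return every KB triple whose head and tail both occur in the sentence.

    This is the core distant-supervision assumption: if a sentence mentions
    both entities of a known fact, it is taken to express that relation.
    """
    return [t for t in kb if t.head in sentence and t.tail in sentence]


# Illustrative KB and sentences (not taken from the dataset).
kb = [Triple("ঢাকা", "capital_of", "বাংলাদেশ")]
sentences = [
    "ঢাকা বাংলাদেশের রাজধানী।",  # mentions both entities
    "আজ আবহাওয়া ভালো।",          # mentions neither entity
]

for s in sentences:
    matches = distant_label(s, kb)
    tags = [t.relation for t in matches] or ["NA"]  # "NA" = no relation found
    print(s, tags)
```

Substring matching tolerates Bangla case suffixes here (the second entity appears inside its genitive form), but it can also produce false matches, so a real pipeline would more plausibly align KB entities against the NER spans.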
Created:
2024-07-11



