BIOMRC
Source: arXiv | Updated: 2020-05-13 | Indexed: 2024-06-21
Download link:
http://nlp.cs.aueb.gr/publications.html
Description:
BIOMRC is a large-scale biomedical machine reading comprehension (MRC) dataset created by the Department of Informatics, Athens University of Economics and Business. It was designed to contain less noise than the earlier BIOREAD dataset. It comprises approximately 812,000 instances and is released in three versions, LARGE (812k instances), LITE (100k instances), and TINY (60 instances), so that researchers with different computational resources can use it. BIOMRC uses the abstracts and titles of biomedical articles as reading material: a cloze-style question is generated by replacing a biomedical entity in the title with a placeholder, and the system must identify the replaced entity among the entities mentioned in the corresponding abstract. The dataset is particularly suitable for training or pre-training deep learning models for machine reading comprehension in the biomedical domain.
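The question-generation idea described above can be illustrated with a minimal sketch. It is not the dataset authors' pipeline: the function name `make_cloze_instances`, the `XXXX` placeholder, the dictionary layout, and the toy title and abstract are all assumptions made for illustration, and entity mentions are matched by plain case-insensitive string search rather than by a biomedical entity annotator.

```python
import re

PLACEHOLDER = "XXXX"  # assumed placeholder token, for illustration only


def _mention(entity):
    """Compile a case-insensitive whole-word pattern for an entity string."""
    return re.compile(r"\b" + re.escape(entity) + r"\b", re.IGNORECASE)


def make_cloze_instances(title, abstract, entities):
    """Build one cloze instance per entity found in both title and abstract.

    Hypothetical helper: the real dataset relies on pre-annotated entities,
    whereas this sketch only does simple string matching.
    """
    # Candidate answers are the entities that appear in the abstract.
    candidates = [e for e in entities if _mention(e).search(abstract)]
    instances = []
    for entity in entities:
        pattern = _mention(entity)
        if not pattern.search(title) or entity not in candidates:
            continue  # the answer must occur in the title and in the abstract
        instances.append({
            "question": pattern.sub(PLACEHOLDER, title, count=1),
            "passage": abstract,
            "candidates": candidates,
            "answer": entity,
        })
    return instances


# Toy example (invented text, not taken from the dataset).
example = make_cloze_instances(
    title="Aspirin reduces the risk of stroke",
    abstract="We studied whether aspirin lowers stroke incidence "
             "in adults with hypertension.",
    entities=["aspirin", "stroke", "hypertension"],
)
for inst in example:
    print(inst["question"], "->", inst["answer"])
```

In this toy case two instances are produced, one per entity shared by the title and the abstract, while "hypertension" serves only as a distractor candidate because it does not occur in the title.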
Provided by:
Department of Informatics, Athens University of Economics and Business
Created:
2020-05-13



