
MedNLI for Shared Task at ACL BioNLP 2019

Source: physionet.org (indexed 2025-01-15)
Download link:
https://physionet.org/content/mednli-bionlp19/1.0.1/
Description:
Natural Language Inference (NLI) is the task of determining whether a given hypothesis can be inferred from a given premise. Also known as Recognizing Textual Entailment (RTE), this task has been popular among researchers for some time. However, almost all datasets for this task have focused on open-domain data such as news texts, blogs, and so on. To address this gap, the MedNLI dataset was created for language inference in the medical domain. MedNLI is a derived dataset with data sourced from MIMIC-III v1.4. To stimulate research on this problem, a shared task on Medical Inference and Question Answering (MEDIQA) was organized at the workshop for biomedical natural language processing (BioNLP) 2019. The dataset provided here is a test set of 405 premise-hypothesis pairs for the NLI challenge in the MEDIQA shared task. Participants in the shared task were expected to use the MedNLI data to develop their models, and this dataset served as the unseen test set for scoring each participant's submission.
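
To make the scoring step concrete, here is a minimal sketch of how predictions on a set of premise-hypothesis pairs might be scored. It assumes a three-way label set (entailment, contradiction, neutral) and plain accuracy as the metric; neither is stated in the description above, the `accuracy` helper is hypothetical, and the toy labels are invented for illustration rather than taken from the dataset.

```python
from typing import Sequence

# Assumption: the three classes commonly used for NLI tasks.
LABELS = {"entailment", "contradiction", "neutral"}


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of pairs whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set of pairs")
    unknown = (set(gold) | set(predicted)) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Count positions where the prediction equals the gold label.
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Toy labels, invented for illustration only.
gold = ["entailment", "neutral", "contradiction", "neutral"]
predicted = ["entailment", "contradiction", "contradiction", "neutral"]
score = accuracy(gold, predicted)
print(f"accuracy = {score:.2f}")
```

In a real evaluation the gold labels for the 405 test pairs would come from the dataset files, whose format is not described here.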

Provider:
physionet.org