Synthetic nursing handover training and development data set - text files

Name: Synthetic nursing handover training and development data set - text files
Creator: CSIRO
Published: 2020-09-18 19:22:16
License: No description available

DataCite Commons. Updated 2020-09-18. Indexed 2025-04-09.

Download link:

https://data.csiro.au/collections/#collection/CIcsiro:20413v1


Description:

This is one of two collection records. Please see the link below for the other collection of associated audio files. Both collections together comprise an open clinical dataset of three sets of 101 nursing handover records, very similar to real documents in Australian English. Each record consists of a patient profile, a spoken free-form text document, a written free-form text document, and a written structured document. This collection contains 3 sets of text documents.

Data Set 1 for Training and Development

The data set, released in June 2014, includes the following documents:

Folder initialisation: Initialisation details for speech recognition using Dragon Medical 11.0, i.e., i) DOCX for the written, free-form text document that originates from the Dragon software release and ii) WMA for the spoken, free-form text document by the RN.
Folder 100profiles: 100 patient profiles (DOCX).
Folder 101writtenfreetextreports: 101 written, free-form text documents (TXT).
Folder 100x6speechrecognised: 100 speech-recognized, written, free-form text documents for six Dragon vocabularies (TXT).
Folder 101informationextraction: 101 written, structured documents for information extraction that include i) the reference standard text, ii) features used by our best system, iii) form categories with respect to the reference standard, and iv) form categories with respect to our best information extraction system (TXT in CRF++ format).

An Independent Data Set 2

The aforementioned data set was supplemented in April 2015 with an independent set that was used as a test set in the CLEFeHealth 2015 Task 1a on clinical speech recognition and can be used as a validation set in the CLEFeHealth 2016 Task 1 on handover information extraction. Hence, when using this set, please avoid its repeated use in evaluation, as we do not wish to overfit to these data sets.

The set released in April 2015 consists of 100 patient profiles (DOCX) and 100 written and 100 speech-recognized, written, free-form text documents for the Dragon vocabulary of Nursing (TXT). The set released in November 2015 consists of the respective 100 written free-form text documents (TXT) and 100 written, structured documents for information extraction.

An Independent Data Set 3

For evaluation purposes, the aforementioned data sets were supplemented in April 2016 with an independent set of another 100 synthetic cases.
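
The information extraction documents are described as TXT in CRF++ format. As a rough sketch of how such files might be read, the snippet below assumes the usual CRF++ layout: one token per line, whitespace-separated columns, the label in the last column, and a blank line between sequences. The function name `parse_crfpp`, the sample tokens, and the labels shown are hypothetical illustrations, not taken from the data set itself.

```python
from typing import List, Tuple

# One token: (feature columns, label). The label is assumed to be the last column.
Token = Tuple[List[str], str]


def parse_crfpp(text: str) -> List[List[Token]]:
    """Split CRF++-style text into sequences of (features, label) tokens."""
    sequences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sequence, if any.
            if current:
                sequences.append(current)
                current = []
            continue
        cols = line.split()
        current.append((cols[:-1], cols[-1]))
    if current:
        sequences.append(current)
    return sequences


# Hypothetical sample; real files in the collection may use other columns and labels.
sample = "Ken NNP NAME\nis VBZ NA\n\nBed NN BED\n5 CD BED\n"
sequences = parse_crfpp(sample)
labels = [label for seq in sequences for _, label in seq]
```

Before relying on this, it is worth inspecting one file from the 101informationextraction folder to confirm the column order, since the description lists several alternative contents (reference standard, features, form categories).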

Provider:

CSIRO

Created:

2017-03-21
