Effective Crowdsourcing of Multiple Tasks for Comprehensive Information Extraction

Source: DataCite Commons (updated 2025-04-01, indexed 2024-07-27)
Download link:
https://figshare.com/articles/dataset/Effective_Crowdsourcing_of_Multiple_Tasks_for_Comprehensive_Information_Extraction/7935185/1
Resource overview:
Introduction

This dataset proposes a standard for Korean information extraction and aims to promote research in the field. It provides crowdsourced data collected for four information extraction tasks over the same corpus, together with the training and evaluation results of a state-of-the-art model on each task. According to the authors, these are the first machine learning data of their kind for Korean information extraction, and there are plans to keep increasing the data volume. The test results are intended to serve as a standard for each Korean information extraction task and as a comparison target for studies on Korean information extraction that use the data collected here. The dataset is available for research purposes.

Description

- There are two crowdsourcing .zip files (wiki-10000-part1&2.zip). Each file contains:
  1) task1-1: entity detection
  2) task1-2: entity linking
  3) task2: co-reference resolution
  4) task4: relation extraction
- For the entity linking model (https://github.com/machinereading/eld-2018), pre-trained embedding files are in el-korean.tar.gz.
- For the co-reference resolution model (https://github.com/machinereading/CR), pre-trained embedding files are in cr-korean.tar.gz.
- For a relation extraction model (https://github.com/machinereading/re-gan), the corpus, dataset, and pre-trained embedding files are in ko-gan-data.zip.
- For a relation extraction model (https://github.com/machinereading/re-re-RL-Crowd), pre-trained embedding files are in rerl-korean.tar.gz.

How to use

All crowdsourcing files are in JSON format. A detailed example and usage notes are available at https://github.com/machinereading/okbqa-7-task4.
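
Since the crowdsourcing files are described as JSON packed inside .zip archives, a first inspection could look like the minimal sketch below. It rests on assumptions not stated in the listing: that the archive members end in `.json`, that they are UTF-8 encoded, and that each member is a single JSON document. The helper name `iter_json_records` is hypothetical, and the internal folder layout and record schema are not assumed at all; the linked okbqa-7-task4 repository remains the authoritative reference.

```python
import io
import json
import zipfile


def iter_json_records(zip_source):
    """Yield (member_name, parsed_json) for each .json member of a zip archive.

    `zip_source` may be a filesystem path or a binary file-like object.
    Assumption: members are UTF-8 encoded, one JSON document per member.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Skip directories and any non-JSON members (e.g. README files).
            if not name.lower().endswith(".json"):
                continue
            raw = archive.read(name)
            yield name, json.loads(raw.decode("utf-8"))
```

Calling `iter_json_records` with the path to one of the downloaded archives and printing each member name together with `type(record)` is a quick way to see how the task folders are organized before writing any task-specific parsing.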
Provider:
figshare
Created:
2019-04-10