liy140/multidomain-measextract-corpus
Source: Hugging Face | Updated: 2023-09-12 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/liy140/multidomain-measextract-corpus
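Overview:
This dataset is a multi-domain corpus for measurement extraction (Seq2Seq variant). It contains the training, validation, and test data for three datasets: measeval, bm, and msp. The measeval and msp datasets are adapted from the MeasEval (Harper et al., 2021) and Material Synthesis Procedural (Mysore et al., 2019) corpora, respectively. The repository aggregates the msp and measeval extraction tasks to the paragraph level and provides the labels in JSON format so that they can be used for Seq2Seq training.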
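To illustrate what "labels in JSON format for Seq2Seq training" can look like in practice, here is a minimal sketch that turns one paragraph and its annotations into a (source, target) string pair and parses a model output back. The field name `quantity` and the example sentence are hypothetical and chosen only for illustration; the actual schema of the JSON files is not described in this listing and should be checked against the data.

```python
import json
from typing import Any, Dict, List, Tuple


def to_seq2seq_pair(paragraph: str, measurements: List[Dict[str, Any]]) -> Tuple[str, str]:
    """Build (source, target) strings: the paragraph goes in, a JSON string comes out."""
    # sort_keys and compact separators keep the target string deterministic.
    target = json.dumps(measurements, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return paragraph, target


def parse_prediction(text: str) -> List[Dict[str, Any]]:
    """Decode a generated target; return an empty list if it is not a JSON list."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return []
    return parsed if isinstance(parsed, list) else []


if __name__ == "__main__":
    # Hypothetical paragraph and label structure, not taken from the corpus.
    source, target = to_seq2seq_pair(
        "The sample was heated to 300 °C for 2 h.",
        [{"quantity": "300 °C"}, {"quantity": "2 h"}],
    )
    print(source)
    print(target)
```

Serializing with a fixed key order matters for Seq2Seq training because the model is scored on the exact target string; the tolerant parser reflects that generated output is not guaranteed to be valid JSON.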
Provided by:
liy140
Summary of original information
Multi-domain measurement extraction corpus (Seq2Seq variant)
Dataset configurations
- measeval
  - Training set: measeval_paragraph_level_no_spans_train.json
  - Validation set: measeval_paragraph_level_no_spans_val.json
  - Test set: measeval_paragraph_level_no_spans_test.json
- bm
  - Training set: bm_paragraph_level_no_spans_train.json
  - Validation set: bm_paragraph_level_no_spans_val.json
  - Test set: bm_paragraph_level_no_spans_test.json
- msp
  - Training set: msp_paragraph_level_no_spans_train.json
  - Validation set: msp_paragraph_level_no_spans_val.json
  - Test set: msp_paragraph_level_no_spans_test.json
- all
  - Training set: measeval_paragraph_level_no_spans_train.json, bm_paragraph_level_no_spans_train.json, msp_paragraph_level_no_spans_train.json
  - Validation set: measeval_paragraph_level_no_spans_val.json, bm_paragraph_level_no_spans_val.json, msp_paragraph_level_no_spans_val.json
  - Test set: measeval_paragraph_level_no_spans_test.json, bm_paragraph_level_no_spans_test.json, msp_paragraph_level_no_spans_test.json
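The file names in the configurations listed above follow one pattern, so the mapping from configuration to split files can be generated instead of typed out. The sketch below does that and also shows one possible way to load a configuration with the Hugging Face `datasets` library. The helper names `data_files_for` and `load_config` and the split key `validation` are assumptions made for this example; the split names actually exposed by the repository may differ.

```python
from typing import Dict, List

REPO_ID = "liy140/multidomain-measextract-corpus"
SOURCES = ("measeval", "bm", "msp")
# Assumed split keys mapped to the file-name suffixes used in the listing.
SPLITS = {"train": "train", "validation": "val", "test": "test"}


def data_files_for(config: str) -> Dict[str, List[str]]:
    """Return the split -> file names mapping for one configuration."""
    if config == "all":
        sources = SOURCES
    elif config in SOURCES:
        sources = (config,)
    else:
        raise ValueError(f"unknown configuration: {config!r}")
    return {
        split: [f"{src}_paragraph_level_no_spans_{suffix}.json" for src in sources]
        for split, suffix in SPLITS.items()
    }


def load_config(config: str):
    """Load one configuration from the Hub (requires `datasets` and network access)."""
    # Imported lazily so that data_files_for() works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config)


if __name__ == "__main__":
    for split, files in data_files_for("all").items():
        print(split, files)
```

Calling `load_config("bm")` would fetch only the bm files, while `load_config("all")` would combine the three sources per split, assuming the repository defines the configurations as listed.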
Task categories
- Token classification
Language
- English
Tags
- Chemistry
- Biology
Dataset size
- n<1K



