Brendan/multiwoz_turns_v22
Hugging Face · Updated 2023-11-11 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/Brendan/multiwoz_turns_v22
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
  - split: valid_20p_ablation
    path: data/valid_20p_ablation-*
  - split: valid_10p
    path: data/valid_10p-*
  - split: valid_50p
    path: data/valid_50p-*
  - split: 1p_train_v1
    path: data/1p_train_v1-*
  - split: 1p_train_v2
    path: data/1p_train_v2-*
  - split: 1p_train_v3
    path: data/1p_train_v3-*
  - split: 5p_train_v1
    path: data/5p_train_v1-*
  - split: 5p_train_v2
    path: data/5p_train_v2-*
  - split: 5p_train_v3
    path: data/5p_train_v3-*
  - split: 10p_train_v1
    path: data/10p_train_v1-*
  - split: 10p_train_v2
    path: data/10p_train_v2-*
  - split: 10p_train_v3
    path: data/10p_train_v3-*
  - split: train_evaluable_only
    path: data/train_evaluable_only-*
  - split: valid_evaluable_only
    path: data/valid_evaluable_only-*
dataset_info:
  features:
  - name: dialogue_id
    dtype: string
  - name: turn_id
    dtype: int8
  - name: domains
    sequence: string
  - name: system_utterances
    sequence: string
  - name: user_utterances
    sequence: string
  - name: slot_values
    struct:
    - name: hotel
      struct:
      - name: price range
        dtype: string
      - name: type
        dtype: string
      - name: parking
        dtype: string
      - name: book day
        dtype: string
      - name: book people
        dtype: string
      - name: book stay
        dtype: string
      - name: stars
        dtype: string
      - name: internet
        dtype: string
      - name: name
        dtype: string
      - name: area
        dtype: string
    - name: train
      struct:
      - name: arrive by
        dtype: string
      - name: departure
        dtype: string
      - name: day
        dtype: string
      - name: book people
        dtype: string
      - name: leave at
        dtype: string
      - name: destination
        dtype: string
    - name: attraction
      struct:
      - name: area
        dtype: string
      - name: name
        dtype: string
      - name: type
        dtype: string
    - name: restaurant
      struct:
      - name: price range
        dtype: string
      - name: area
        dtype: string
      - name: food
        dtype: string
      - name: name
        dtype: string
      - name: book day
        dtype: string
      - name: book people
        dtype: string
      - name: book time
        dtype: string
    - name: hospital
      struct:
      - name: department
        dtype: string
    - name: taxi
      struct:
      - name: leave at
        dtype: string
      - name: destination
        dtype: string
      - name: departure
        dtype: string
      - name: arrive by
        dtype: string
    - name: bus
      struct:
      - name: departure
        dtype: string
      - name: destination
        dtype: string
      - name: leave at
        dtype: string
      - name: day
        dtype: string
    - name: police
      struct:
      - name: name
        dtype: string
  - name: turn_slot_values
    struct:
    - name: hotel
      struct:
      - name: price range
        dtype: string
      - name: type
        dtype: string
      - name: parking
        dtype: string
      - name: book day
        dtype: string
      - name: book people
        dtype: string
      - name: book stay
        dtype: string
      - name: stars
        dtype: string
      - name: internet
        dtype: string
      - name: name
        dtype: string
      - name: area
        dtype: string
    - name: train
      struct:
      - name: arrive by
        dtype: string
      - name: departure
        dtype: string
      - name: day
        dtype: string
      - name: book people
        dtype: string
      - name: leave at
        dtype: string
      - name: destination
        dtype: string
    - name: attraction
      struct:
      - name: area
        dtype: string
      - name: name
        dtype: string
      - name: type
        dtype: string
    - name: restaurant
      struct:
      - name: price range
        dtype: string
      - name: area
        dtype: string
      - name: food
        dtype: string
      - name: name
        dtype: string
      - name: book day
        dtype: string
      - name: book people
        dtype: string
      - name: book time
        dtype: string
    - name: hospital
      struct:
      - name: department
        dtype: string
    - name: taxi
      struct:
      - name: leave at
        dtype: string
      - name: destination
        dtype: string
      - name: departure
        dtype: string
      - name: arrive by
        dtype: string
    - name: bus
      struct:
      - name: departure
        dtype: string
      - name: destination
        dtype: string
      - name: leave at
        dtype: string
      - name: day
        dtype: string
    - name: police
      struct:
      - name: name
        dtype: string
  - name: last_slot_values
    struct:
    - name: hotel
      struct:
      - name: price range
        dtype: string
      - name: type
        dtype: string
      - name: parking
        dtype: string
      - name: book day
        dtype: string
      - name: book people
        dtype: string
      - name: book stay
        dtype: string
      - name: stars
        dtype: string
      - name: internet
        dtype: string
      - name: name
        dtype: string
      - name: area
        dtype: string
    - name: train
      struct:
      - name: arrive by
        dtype: string
      - name: departure
        dtype: string
      - name: day
        dtype: string
      - name: book people
        dtype: string
      - name: leave at
        dtype: string
      - name: destination
        dtype: string
    - name: attraction
      struct:
      - name: area
        dtype: string
      - name: name
        dtype: string
      - name: type
        dtype: string
    - name: restaurant
      struct:
      - name: price range
        dtype: string
      - name: area
        dtype: string
      - name: food
        dtype: string
      - name: name
        dtype: string
      - name: book day
        dtype: string
      - name: book people
        dtype: string
      - name: book time
        dtype: string
    - name: hospital
      struct:
      - name: department
        dtype: string
    - name: taxi
      struct:
      - name: leave at
        dtype: string
      - name: destination
        dtype: string
      - name: departure
        dtype: string
      - name: arrive by
        dtype: string
    - name: bus
      struct:
      - name: departure
        dtype: string
      - name: destination
        dtype: string
      - name: leave at
        dtype: string
      - name: day
        dtype: string
    - name: police
      struct:
      - name: name
        dtype: string
  - name: last_system_response_acts
    sequence: string
  - name: system_response_acts
    sequence: string
  - name: system_response
    dtype: string
  splits:
  - name: train
    num_bytes: 84139088
    num_examples: 56776
  - name: validation
    num_bytes: 11271758
    num_examples: 7374
  - name: test
    num_bytes: 11295224
    num_examples: 7372
  - name: valid_20p_ablation
    num_bytes: 2273000.2910225117
    num_examples: 1487
  - name: valid_10p
    num_bytes: 1114335.7176566315
    num_examples: 729
  - name: valid_50p
    num_bytes: 5667979.2058584215
    num_examples: 3708
  - name: 1p_train_v1
    num_bytes: 798770.0512892772
    num_examples: 539
  - name: 1p_train_v2
    num_bytes: 890650.8364097506
    num_examples: 601
  - name: 1p_train_v3
    num_bytes: 861011.8734676624
    num_examples: 581
  - name: 5p_train_v1
    num_bytes: 4245781.441454136
    num_examples: 2865
  - name: 5p_train_v2
    num_bytes: 4103514.419332112
    num_examples: 2769
  - name: 5p_train_v3
    num_bytes: 4220588.32295336
    num_examples: 2848
  - name: 10p_train_v1
    num_bytes: 8368561.186698605
    num_examples: 5647
  - name: 10p_train_v2
    num_bytes: 8447104.438495139
    num_examples: 5700
  - name: 10p_train_v3
    num_bytes: 8398200.149640692
    num_examples: 5667
  - name: train_evaluable_only
    num_bytes: 83498886.4004509
    num_examples: 56344
  - name: valid_evaluable_only
    num_bytes: 11261057.931380527
    num_examples: 7367
  download_size: 39840521
  dataset_size: 250855512.26610973
---
# Dataset Card for "multiwoz_turns_v22"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
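
The card itself gives no usage instructions, so the following is only a minimal sketch of how one split might be loaded with the Hugging Face `datasets` library. It assumes the repository is still available on the Hub under the ID shown in the page title and that `datasets` is installed; the helper name `load_split` is hypothetical and not part of the dataset.

```python
from typing import Any

# Repository ID taken from the page title; assumed to still exist on the Hub.
REPO_ID = "Brendan/multiwoz_turns_v22"


def load_split(split: str = "train") -> Any:
    """Load one split of the dataset (hypothetical helper, needs network access).

    `split` should be one of the split names listed in the card metadata,
    for example "train", "validation", "test" or "5p_train_v1".
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_split("validation")` would be expected to return a `datasets.Dataset` whose columns match the features listed in the metadata above.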
Provider:
Brendan
Summary of original information
Dataset overview
Dataset configuration
- Default configuration:
  - Training set: path `data/train-*`
  - Validation set: path `data/validation-*`
  - Test set: path `data/test-*`
  - Other splits:
    - `valid_20p_ablation`: path `data/valid_20p_ablation-*`
    - `valid_10p`: path `data/valid_10p-*`
    - `valid_50p`: path `data/valid_50p-*`
    - `1p_train_v1`: path `data/1p_train_v1-*`
    - `1p_train_v2`: path `data/1p_train_v2-*`
    - `1p_train_v3`: path `data/1p_train_v3-*`
    - `5p_train_v1`: path `data/5p_train_v1-*`
    - `5p_train_v2`: path `data/5p_train_v2-*`
    - `5p_train_v3`: path `data/5p_train_v3-*`
    - `10p_train_v1`: path `data/10p_train_v1-*`
    - `10p_train_v2`: path `data/10p_train_v2-*`
    - `10p_train_v3`: path `data/10p_train_v3-*`
    - `train_evaluable_only`: path `data/train_evaluable_only-*`
    - `valid_evaluable_only`: path `data/valid_evaluable_only-*`
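
The split names follow a visible pattern, and every path is `data/<split>-*`. The sketch below parses the few-shot training split names and rebuilds the path glob. Reading `Np` as "roughly N percent of the training data" and `vK` as "the K-th independent sample" is an assumption inferred from the names and example counts, not something the card states; `FewShotSplit`, `parse_few_shot_split` and `data_glob` are hypothetical helper names.

```python
import re
from typing import NamedTuple, Optional


class FewShotSplit(NamedTuple):
    percent: int  # assumed meaning of the "Np" prefix
    version: int  # assumed meaning of the "vK" suffix


# Matches names such as "1p_train_v1" or "10p_train_v3".
_FEW_SHOT_PATTERN = re.compile(r"^(\d+)p_train_v(\d+)$")


def parse_few_shot_split(name: str) -> Optional[FewShotSplit]:
    """Return (percent, version) for a few-shot train split, else None."""
    match = _FEW_SHOT_PATTERN.match(name)
    if match is None:
        return None
    return FewShotSplit(int(match.group(1)), int(match.group(2)))


def data_glob(name: str) -> str:
    """Rebuild the file glob that the metadata lists for a split."""
    return f"data/{name}-*"


print(parse_few_shot_split("5p_train_v2"))
print(data_glob("valid_10p"))
```

Names that do not match the pattern, such as `valid_10p` or `train_evaluable_only`, return `None` and would need separate handling.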
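Dataset features
- Basic features:
  - `dialogue_id`: dialogue ID, string
  - `turn_id`: turn ID, integer (int8)
  - `domains`: domains, sequence of strings
  - `system_utterances`: system utterances, sequence of strings
  - `user_utterances`: user utterances, sequence of strings
  - `slot_values`: slot values, struct keyed by domain, each holding that domain's slots
  - `turn_slot_values`: slot values for the turn, struct keyed by domain, each holding that domain's slots
  - `last_slot_values`: slot values from the previous turn, struct keyed by domain, each holding that domain's slots
  - `last_system_response_acts`: acts of the previous system response, sequence of strings
  - `system_response_acts`: system response acts, sequence of strings
  - `system_response`: system response, string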
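
Because the three slot fields are nested structs (domain, then slot, then string value), it is often convenient to flatten one into `domain-slot` keys. The sketch below does this on a hand-written record. It assumes that unset slots, and possibly whole unset domains, appear as `None`, which is how struct features with missing fields usually surface in `datasets`; the card does not confirm this. The sample values and the name `flatten_slots` are illustrative only.

```python
from typing import Dict, Optional

# domain -> (slot -> value); both levels may be None under the stated assumption.
SlotStruct = Dict[str, Optional[Dict[str, Optional[str]]]]


def flatten_slots(slots: SlotStruct) -> Dict[str, str]:
    """Flatten a nested slot struct into {"domain-slot": value}, skipping None."""
    flat: Dict[str, str] = {}
    for domain, domain_slots in slots.items():
        if not domain_slots:
            continue  # whole domain unset
        for slot, value in domain_slots.items():
            if value is not None:
                flat[f"{domain}-{slot}"] = value
    return flat


# Hypothetical record fragment, not taken from the dataset.
example: SlotStruct = {
    "hotel": {"area": "north", "stars": "4", "parking": None},
    "train": None,
    "police": {"name": None},
}

print(flatten_slots(example))
```

The same function applies unchanged to `slot_values`, `turn_slot_values` and `last_slot_values`, since the metadata declares an identical schema for all three.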
Dataset splits
- Training set:
  - Bytes: 84139088
  - Examples: 56776
- Validation set:
  - Bytes: 11271758
  - Examples: 7374
- Test set:
  - Bytes: 11295224
  - Examples: 7372
- Other splits:
  - `valid_20p_ablation`:
    - Bytes: 2273000.2910225117
    - Examples: 1487
  - `valid_10p`:
    - Bytes: 1114335.7176566315
    - Examples: 729
  - `valid_50p`:
    - Bytes: 5667979.2058584215
    - Examples: 3708
  - `1p_train_v1`:
    - Bytes: 798770.0512892772
    - Examples: 539
  - `1p_train_v2`:
    - Bytes: 890650.8364097506
    - Examples: 601
  - `1p_train_v3`:
    - Bytes: 861011.8734676624
    - Examples: 581
  - `5p_train_v1`:
    - Bytes: 4245781.441454136
    - Examples: 2865
  - `5p_train_v2`:
    - Bytes: 4103514.419332112
    - Examples: 2769
  - `5p_train_v3`:
    - Bytes: 4220588.32295336
    - Examples: 2848
  - `10p_train_v1`:
    - Bytes: 8368561.186698605
    - Examples: 5647
  - `10p_train_v2`:
    - Bytes: 8447104.438495139
    - Examples: 5700
  - `10p_train_v3`:
    - Bytes: 8398200.149640692
    - Examples: 5667
  - `train_evaluable_only`:
    - Bytes: 83498886.4004509
    - Examples: 56344
  - `valid_evaluable_only`:
    - Bytes: 11261057.931380527
    - Examples: 7367
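
As a quick sanity check on the split names, the example counts above can be turned into shares of the parent split. The sketch below uses only the counts listed in the metadata. Which parent each subset was drawn from (full `train` or full `validation`) is an assumption based on the names. The shares come out close to, but not exactly, the nominal percentages; one plausible reason, not stated in the card, is that sampling was done per dialogue rather than per turn.

```python
TRAIN_EXAMPLES = 56776
VALIDATION_EXAMPLES = 7374

# split name -> (examples in subset, examples in assumed parent split)
SUBSETS = {
    "1p_train_v1": (539, TRAIN_EXAMPLES),
    "1p_train_v2": (601, TRAIN_EXAMPLES),
    "1p_train_v3": (581, TRAIN_EXAMPLES),
    "5p_train_v1": (2865, TRAIN_EXAMPLES),
    "5p_train_v2": (2769, TRAIN_EXAMPLES),
    "5p_train_v3": (2848, TRAIN_EXAMPLES),
    "10p_train_v1": (5647, TRAIN_EXAMPLES),
    "10p_train_v2": (5700, TRAIN_EXAMPLES),
    "10p_train_v3": (5667, TRAIN_EXAMPLES),
    "train_evaluable_only": (56344, TRAIN_EXAMPLES),
    "valid_10p": (729, VALIDATION_EXAMPLES),
    "valid_20p_ablation": (1487, VALIDATION_EXAMPLES),
    "valid_50p": (3708, VALIDATION_EXAMPLES),
    "valid_evaluable_only": (7367, VALIDATION_EXAMPLES),
}


def share_percent(subset_examples: int, parent_examples: int) -> float:
    """Subset size as a percentage of its parent split."""
    return 100.0 * subset_examples / parent_examples


shares = {name: share_percent(n, parent) for name, (n, parent) in SUBSETS.items()}

for name, pct in shares.items():
    print(f"{name:22s} {pct:6.2f}%")
```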
Dataset size
- Download size: 39840521 bytes
- Dataset size: 250855512.26610973 bytes



