VincentUni/UMTCsft
收藏Hugging Face2024-05-24 更新2024-06-12 收录
下载链接:
https://hf-mirror.com/datasets/VincentUni/UMTCsft
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
- config_name: DRCD
features:
- name: messages
list:
- name: role
dtype: string
- name: content
dtype: string
- name: other_metadata_field
dtype: string
splits:
- name: train
num_bytes: 14992414
num_examples: 12032
download_size: 1776839
dataset_size: 14992414
- config_name: HR_law
features:
- name: messages
list:
- name: role
dtype: string
- name: content
dtype: string
- name: other_metadata_field
dtype: string
splits:
- name: train
num_bytes: 9387906
num_examples: 647
download_size: 330598
dataset_size: 9387906
- config_name: code
features:
- name: messages
list:
- name: role
dtype: string
- name: content
dtype: string
- name: other_metadata_field
dtype: string
splits:
- name: train
num_bytes: 199494
num_examples: 150
download_size: 89264
dataset_size: 199494
- config_name: process
features:
- name: messages
list:
- name: role
dtype: string
- name: content
dtype: string
- name: other_metadata_field
dtype: string
splits:
- name: train
num_bytes: 79391934
num_examples: 32110
download_size: 6005023
dataset_size: 79391934
configs:
- config_name: DRCD
data_files:
- split: train
path: DRCD/train-*
- config_name: HR_law
data_files:
- split: train
path: HR_law/train-*
- config_name: code
data_files:
- split: train
path: code/train-*
- config_name: process
data_files:
- split: train
path: process/train-*
---
提供机构:
VincentUni
原始信息汇总
数据集概述
数据集配置
DRCD
- 特征:
messages:role: 字符串类型content: 字符串类型
other_metadata_field: 字符串类型
- 分割:
train:- 字节数: 14992414
- 样本数: 12032
- 下载大小: 1776839 字节
- 数据集大小: 14992414 字节
HR_law
- 特征:
messages:role: 字符串类型content: 字符串类型
other_metadata_field: 字符串类型
- 分割:
train:- 字节数: 9387906
- 样本数: 647
- 下载大小: 330598 字节
- 数据集大小: 9387906 字节
code
- 特征:
messages:role: 字符串类型content: 字符串类型
other_metadata_field: 字符串类型
- 分割:
train:- 字节数: 199494
- 样本数: 150
- 下载大小: 89264 字节
- 数据集大小: 199494 字节
process
- 特征:
messages:role: 字符串类型content: 字符串类型
other_metadata_field: 字符串类型
- 分割:
train:- 字节数: 79391934
- 样本数: 32110
- 下载大小: 6005023 字节
- 数据集大小: 79391934 字节



