Dnsibu/serial2023
Source: Hugging Face. Updated 2023-11-02. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Dnsibu/serial2023
Resource overview:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: 'Sentence #'
    dtype: string
  - name: Word
    dtype: string
  - name: POS
    dtype: string
  - name: Tag
    dtype:
      class_label:
        names:
          '0': O
          '1': B-serial
  splits:
  - name: train
    num_bytes: 24256517
    num_examples: 836762
  - name: test
    num_bytes: 6076775
    num_examples: 209191
  download_size: 6868292
  dataset_size: 30333292
---
# Dataset Card for "serial2023"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The serial2023 dataset has a single configuration, default. Its data files are split into a training set and a test set, stored under data/train-* and data/test-* respectively. Each example has four features: Sentence #, Word, POS, and Tag. The first three are strings; Tag is a class label with two categories, O (id 0) and B-serial (id 1). The training set contains 836762 examples (24256517 bytes) and the test set contains 209191 examples (6076775 bytes). The download size is 6868292 bytes, and the total dataset size is 30333292 bytes.
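The size figures in the card can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the numbers stated in the metadata; the variable names are chosen here for illustration and are not part of the dataset.

```python
# Figures copied from the dataset card's metadata.
SPLITS = {
    "train": {"num_bytes": 24256517, "num_examples": 836762},
    "test": {"num_bytes": 6076775, "num_examples": 209191},
}
DOWNLOAD_SIZE = 6868292   # bytes, as reported in the card
DATASET_SIZE = 30333292   # bytes, as reported in the card

# Sum the per-split figures and compare them with the reported totals.
total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
total_examples = sum(split["num_examples"] for split in SPLITS.values())

# Share of examples that fall in the training split.
train_fraction = SPLITS["train"]["num_examples"] / total_examples

# Ratio of download size to reported dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(total_bytes == DATASET_SIZE)   # True
print(total_examples)                # 1045953
print(round(train_fraction, 3))      # 0.8
print(round(download_ratio, 3))      # 0.226
```

The per-split byte counts add up exactly to the reported dataset size, and the example counts correspond to roughly an 80/20 train/test split.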
Provider:
Dnsibu
Summary of original information
Dataset overview
Configuration
- Default configuration (default)
  - Data file paths:
    - Training set (train): data/train-*
    - Test set (test): data/test-*
Dataset information
- Features:
  - Sentence #: string
  - Word: string
  - POS: string
  - Tag: class label
    - Label names:
      - 0: O
      - 1: B-serial
- Splits:
  - Training set (train)
    - Bytes: 24256517
    - Examples: 836762
  - Test set (test)
    - Bytes: 6076775
    - Examples: 209191
- Dataset size:
  - Download size: 6868292 bytes
  - Total dataset size: 30333292 bytes
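
The feature list suggests a token-per-row layout, in which each example is one word with its sentence identifier, part-of-speech tag, and label id. The sketch below shows one way such rows could be grouped back into sentences and their Tag ids decoded to the label names listed above. The sample rows are hypothetical and are not taken from the dataset; the sketch also assumes that every row carries its sentence identifier and that rows of the same sentence are adjacent, neither of which the card confirms.

```python
from itertools import groupby

# Label names in id order, as listed in the card: 0 -> O, 1 -> B-serial.
TAG_NAMES = ["O", "B-serial"]

# Hypothetical sample rows (NOT real data from the dataset), shaped like
# the four features described in the card.
rows = [
    {"Sentence #": "Sentence: 1", "Word": "Unit", "POS": "NN", "Tag": 0},
    {"Sentence #": "Sentence: 1", "Word": "SN12345", "POS": "NN", "Tag": 1},
    {"Sentence #": "Sentence: 1", "Word": "shipped", "POS": "VBD", "Tag": 0},
    {"Sentence #": "Sentence: 2", "Word": "No", "POS": "DT", "Tag": 0},
    {"Sentence #": "Sentence: 2", "Word": "issues", "POS": "NNS", "Tag": 0},
]


def group_sentences(token_rows):
    """Group adjacent rows by sentence id and decode Tag ids to names."""
    sentences = []
    # groupby only merges consecutive rows, hence the adjacency assumption.
    for sentence_id, group in groupby(token_rows, key=lambda r: r["Sentence #"]):
        tokens = [(row["Word"], TAG_NAMES[row["Tag"]]) for row in group]
        sentences.append((sentence_id, tokens))
    return sentences


sentences = group_sentences(rows)
for sentence_id, tokens in sentences:
    print(sentence_id, tokens)
```

If the real data leaves the sentence identifier empty on continuation rows, the grouping key would need to be forward-filled first.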



