emmabedna/langtok_dataset

Name: emmabedna/langtok_dataset
Creator: emmabedna
Published: 2025-10-07 15:01:19
License: No description available

Source: Hugging Face. Updated: 2025-10-07. Indexed: 2025-10-25.

Download link:

https://hf-mirror.com/datasets/emmabedna/langtok_dataset

Resource description:

The dataset has two feature fields, tokens and labels, each a sequence of strings. It is split into training, validation, and test sets containing 27,322, 3,415, and 3,416 examples respectively. The total dataset size is 15,655,120 bytes, and the download size is 6,318,668 bytes.
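
As a rough illustration of the figures above, the following Python sketch computes each split's share of the examples and the ratio of download size to dataset size, and checks the shape of a single record. The helper names (`split_fractions`, `is_valid_record`) and the example record are hypothetical, and the one-label-per-token check is an assumption common to token-labeling datasets that the description does not state.

```python
# Split sizes and byte counts as stated in the description above.
SPLIT_SIZES = {"train": 27322, "validation": 3415, "test": 3416}
DATASET_BYTES = 15_655_120
DOWNLOAD_BYTES = 6_318_668


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def is_valid_record(record):
    """Check a hypothetical record: two lists of strings of equal length."""
    tokens = record.get("tokens")
    labels = record.get("labels")
    if not isinstance(tokens, list) or not isinstance(labels, list):
        return False
    if not all(isinstance(x, str) for x in tokens + labels):
        return False
    # Assumption: one label per token (not stated in the description).
    return len(tokens) == len(labels)


fractions = split_fractions(SPLIT_SIZES)
total_examples = sum(SPLIT_SIZES.values())
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

# Made-up record; the label strings are placeholders, not real labels.
example = {"tokens": ["hello", "world"], "labels": ["A", "A"]}

if __name__ == "__main__":
    print(f"total examples: {total_examples}")
    for name, frac in fractions.items():
        print(f"{name}: {frac:.1%}")
    print(f"download/dataset size ratio: {download_ratio:.2f}")
    print(f"example record valid: {is_valid_record(example)}")
```

The stated counts work out to roughly an 80/10/10 split. To read the actual data, the Hugging Face `datasets` library's `load_dataset("emmabedna/langtok_dataset")` would be the usual entry point, assuming the repository is publicly accessible.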

Provided by:

emmabedna
