stephen-fry-voice

Hugging Face, updated 2026-03-25, indexed 2026-03-26

Download link:

https://huggingface.co/datasets/siaison/stephen-fry-voice


Resource overview:

This dataset contains tokenized training data that appears intended for sequence labeling, with 1,425 samples in total. Each sample has three features: input_ids (list of int32), labels (list of int64), and attention_mask (list of int8), which are typical inputs for Transformer models. Only a training split (train) is provided, with an uncompressed size of about 10.8 MB (10,782,309 bytes). Judging from the feature structure alone, the data may suit natural language processing tasks such as text classification or named entity recognition, but the actual application domain needs to be confirmed against other documentation.

Created:

2026-03-24

Original information summary

Dataset overview

Basic information

Dataset name: stephen-fry-voice
Repository URL: https://huggingface.co/datasets/siaison/stephen-fry-voice
Download size: 13,275,332 bytes
Dataset size: 10,782,309 bytes

Data features

The dataset contains the following three feature fields:

input_ids: data type list[int32]
labels: data type list[int64]
attention_mask: data type list[int8]
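
As a minimal sketch of what one record in this schema might look like, the following Python checks a sample against the three declared types. The token values are made up, validate_sample is a hypothetical helper, and the requirement that the three lists have equal length is an assumption based on common Transformer preprocessing, not something the listing states.

```python
# Inclusive value ranges implied by the declared element types.
INT_RANGES = {
    "input_ids": (-2**31, 2**31 - 1),      # int32
    "labels": (-2**63, 2**63 - 1),         # int64
    "attention_mask": (-2**7, 2**7 - 1),   # int8
}


def validate_sample(sample):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, (low, high) in INT_RANGES.items():
        values = sample.get(name)
        if not isinstance(values, list):
            problems.append(f"{name}: missing or not a list")
            continue
        if any(not isinstance(v, int) or not low <= v <= high for v in values):
            problems.append(f"{name}: value outside the declared integer type")
    # Assumption: all three lists are aligned token by token.
    lengths = {len(v) for v in sample.values() if isinstance(v, list)}
    if len(lengths) > 1:
        problems.append("feature lists differ in length")
    return problems


# Made-up record, for illustration only.
example = {
    "input_ids": [101, 2023, 2003, 102],
    "labels": [0, 1, 0, 0],
    "attention_mask": [1, 1, 1, 1],
}
print(validate_sample(example))
```

A check like this can be run over the first few records after download to confirm that the data matches the listed schema before any training code depends on it.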

Data splits

Training set (train):
- Number of samples: 1,425
- Data size: 10,782,309 bytes
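
The two figures above allow a rough estimate of how long each sample is. The sketch below divides the split size by the sample count, then by the bytes one token position would occupy across the three features. It assumes the three lists are aligned token by token and ignores any per-row storage overhead, so the token count is an order-of-magnitude guide only.

```python
NUM_SAMPLES = 1_425
DATASET_BYTES = 10_782_309

# Bytes needed per token position if the three lists are aligned:
# int32 (4) + int64 (8) + int8 (1).
BYTES_PER_TOKEN = 4 + 8 + 1

avg_bytes_per_sample = DATASET_BYTES / NUM_SAMPLES
approx_tokens_per_sample = avg_bytes_per_sample / BYTES_PER_TOKEN

print(f"average bytes per sample: {avg_bytes_per_sample:.1f}")
print(f"approximate tokens per sample: {approx_tokens_per_sample:.0f}")
```

Under these assumptions the average works out to roughly 7.6 KB per sample, or on the order of 580 token positions.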

Configuration

Configuration name: default
Data files:
- Split: train
- Path: data/train-*
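
The configuration above can be read as "every file under data/ whose name starts with train- belongs to the train split". The sketch below illustrates that reading with the standard-library fnmatch module, which only approximates how the pattern is resolved in practice, and shows one plausible way to load the split with the Hugging Face datasets library. The shard file names and both helper functions are hypothetical, and load_train_split needs the datasets package and network access, so it is defined here but not called.

```python
from fnmatch import fnmatch

REPO_ID = "siaison/stephen-fry-voice"
CONFIG_NAME = "default"
TRAIN_PATTERN = "data/train-*"


def belongs_to_train(path):
    """True if a repository file path matches the train split pattern."""
    return fnmatch(path, TRAIN_PATTERN)


def load_train_split():
    """Load the train split; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily on purpose

    return load_dataset(REPO_ID, name=CONFIG_NAME, split="train")


# Hypothetical shard names, used only to illustrate the pattern.
print(belongs_to_train("data/train-00000-of-00001.parquet"))
print(belongs_to_train("data/test-00000-of-00001.parquet"))
```

Keeping the import inside load_train_split lets the pattern check run on machines where the datasets library is not installed.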
