stephen-fry-voice
Hugging Face. Updated 2026-03-25. Indexed 2026-03-26.
Download link:
https://huggingface.co/datasets/siaison/stephen-fry-voice
Summary:
This dataset contains training data for sequence labeling tasks, 1,425 samples in total. Each sample has three features: input_ids (list of int32), labels (list of int64), and attention_mask (list of int8), which are the usual inputs for Transformer models. Only a train split is provided, with an uncompressed size of 10,782,309 bytes (about 10.8 MB). Judging from the feature structure, the data may suit natural language processing tasks such as text classification or named entity recognition, but the actual use case needs to be confirmed against other documentation.
Created:
2026-03-24
Original information summary
Dataset overview
Basic information
- Dataset name: stephen-fry-voice
- Repository: https://huggingface.co/datasets/siaison/stephen-fry-voice
- Download size: 13,275,332 bytes
- Dataset size: 10,782,309 bytes
Data features
The dataset contains the following three feature fields:
- input_ids: data type list[int32]
- labels: data type list[int64]
- attention_mask: data type list[int8]
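As a rough illustration of the three fields listed above, the sketch below builds one hypothetical record and checks it against the declared element types. The token values, the `validate_record` helper, and the assumption that the three lists are aligned token by token are illustrative only and are not taken from the dataset itself.

```python
# Integer ranges implied by the declared element types.
INT_RANGES = {
    "input_ids": (-2**31, 2**31 - 1),      # int32
    "labels": (-2**63, 2**63 - 1),         # int64
    "attention_mask": (-2**7, 2**7 - 1),   # int8
}


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, (low, high) in INT_RANGES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        values = record[name]
        if not all(isinstance(v, int) and low <= v <= high for v in values):
            problems.append(f"{name}: value outside declared integer range")
    # Assumption: the three lists have the same length within a record.
    lengths = {len(record[n]) for n in INT_RANGES if n in record}
    if len(lengths) > 1:
        problems.append("fields have different lengths")
    return problems


# Hypothetical record; real token IDs and labels will differ.
example = {
    "input_ids": [101, 2023, 2003, 102],
    "labels": [101, 2023, 2003, 102],
    "attention_mask": [1, 1, 1, 1],
}
print(validate_record(example))
```

A check like this can be run over a few rows after download to confirm that the schema matches what the listing describes before any training code depends on it.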
Data splits
- Training split (train):
  - Number of samples: 1,425
  - Data size: 10,782,309 bytes
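The reported figures allow a quick back-of-the-envelope estimate of sample length. The sketch below is only an approximation: it assumes the three lists in a record have equal length, so each token position costs 4 + 8 + 1 bytes, and it ignores any per-row storage overhead.

```python
DATASET_BYTES = 10_782_309   # dataset size reported in the listing
NUM_SAMPLES = 1_425          # train split sample count reported in the listing

# Bytes per token position if all three lists are the same length:
# int32 (4) + int64 (8) + int8 (1). This alignment is an assumption.
BYTES_PER_TOKEN = 4 + 8 + 1

size_mb = DATASET_BYTES / 1_000_000      # decimal megabytes
size_mib = DATASET_BYTES / (1024 ** 2)   # binary mebibytes
avg_bytes = DATASET_BYTES / NUM_SAMPLES
avg_tokens = avg_bytes / BYTES_PER_TOKEN

print(f"{size_mb:.2f} MB, {size_mib:.2f} MiB")
print(f"{avg_bytes:.0f} bytes per sample, about {avg_tokens:.0f} token positions per sample")
```

Under these assumptions the average works out to roughly 7.6 KB and about 582 token positions per sample. The estimate says nothing about how lengths are distributed, or whether samples are padded to a fixed length.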
Configuration
- Config name: default
- Data files:
  - Split: train
  - Path: data/train-*
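The configuration above maps the train split to every file matching `data/train-*`. The sketch below shows one way this could be used, assuming the Hugging Face `datasets` package is installed and the repository is publicly reachable. The shard file names are hypothetical, and `fnmatch` only approximates how the Hub resolves the pattern.

```python
from fnmatch import fnmatch

REPO_ID = "siaison/stephen-fry-voice"   # repository from the download link
CONFIG_NAME = "default"
SPLIT = "train"
DATA_FILES_PATTERN = "data/train-*"


def belongs_to_train(path):
    """Check whether a repository file path matches the train pattern."""
    return fnmatch(path, DATA_FILES_PATTERN)


def load_train_split():
    """Download and return the train split (needs network access)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from datasets import load_dataset
    return load_dataset(REPO_ID, CONFIG_NAME, split=SPLIT)


# Hypothetical file names, used only to show how the pattern matches.
print(belongs_to_train("data/train-00000-of-00001.parquet"))
print(belongs_to_train("data/test-00000-of-00001.parquet"))
```

Calling `load_train_split()` should return a dataset with 1,425 rows if the listing is accurate. Checking `len(...)` and the column names against the values above is a simple sanity test.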



