UrbanLegendXV/SolomonVoice1-deux-tokenized
Source: Hugging Face. Updated: 2025-04-02. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/UrbanLegendXV/SolomonVoice1-deux-tokenized
Description:
The dataset has three features: input ID sequences (input_ids), label sequences (labels), and attention masks (attention_mask). The input IDs are likely integer sequences produced by encoding text, the labels may be targets for a classification or regression task, and the attention mask marks the valid positions in each sequence. The dataset has a single training split of 127 samples totaling 1008673 bytes. According to the dataset configuration, the training data is stored in files matching the path data/train-*.
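To make the three features concrete, here is a minimal sketch of how one record with these fields might be built from a list of token IDs. It is an illustration under stated assumptions, not the dataset author's actual preprocessing: the padding ID, the fixed length, the helper name build_example, and the choice to copy input_ids into labels (language-modeling style) with padded positions set to -100 are all hypothetical.

```python
from typing import Dict, List

PAD_ID = 0            # hypothetical padding token ID (assumption)
IGNORE_INDEX = -100   # value that PyTorch's cross-entropy loss ignores by default


def build_example(token_ids: List[int], max_len: int) -> Dict[str, List[int]]:
    """Pad or truncate token_ids to max_len and derive the mask and labels."""
    kept = token_ids[:max_len]
    n_pad = max_len - len(kept)

    # Real tokens first, then padding up to the fixed length.
    input_ids = kept + [PAD_ID] * n_pad
    # 1 marks a valid position, 0 marks padding.
    attention_mask = [1] * len(kept) + [0] * n_pad
    # Assumption: labels mirror the inputs, with padding excluded from the loss.
    labels = kept + [IGNORE_INDEX] * n_pad

    return {
        "input_ids": input_ids,
        "labels": labels,
        "attention_mask": attention_mask,
    }


# Arbitrary illustrative token IDs, not taken from the dataset.
example = build_example([101, 7592, 2088, 102], max_len=6)
print(example)
```

The actual records can presumably be inspected with the Hugging Face datasets library, for example load_dataset("UrbanLegendXV/SolomonVoice1-deux-tokenized", split="train"), which would show whether the labels follow this convention.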
Provider:
UrbanLegendXV



