gokulsrinivasagan/processed_wikitext-103-raw-v1-ld-50
Source: Hugging Face. Updated 2024-11-18; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/gokulsrinivasagan/processed_wikitext-103-raw-v1-ld-50
Description:
The dataset has four features: input_ids, attention_mask, special_tokens_mask, and lda_lables (the field name as spelled in the dataset), holding the input IDs, attention mask, special-tokens mask, and LDA labels, respectively. It is divided into test, train, and validation splits containing 549, 228,639, and 479 examples, respectively. The download size is 351,302,173 bytes and the total dataset size is 801,078,496 bytes. The data file paths are specified in the dataset's configuration section.
Provider:
gokulsrinivasagan
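
As a rough illustration of the figures in the description, the sketch below computes the total example count, each split's share, and the sizes in MiB. The helper names (`split_fractions`, `to_mib`, `load_splits`) are hypothetical and not part of the dataset. `load_splits` assumes the Hugging Face `datasets` library is installed and the Hub is reachable; it is defined here but not called.

```python
# Figures copied from the description above.
REPO_ID = "gokulsrinivasagan/processed_wikitext-103-raw-v1-ld-50"
SPLIT_SIZES = {"train": 228_639, "validation": 479, "test": 549}
DOWNLOAD_BYTES = 351_302_173
TOTAL_BYTES = 801_078_496


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


def load_splits(repo_id=REPO_ID):
    """Load all splits from the Hub.

    Assumption: the `datasets` package is installed and the network is
    available. The import is local so the arithmetic above works without it.
    """
    from datasets import load_dataset

    return load_dataset(repo_id)


total_examples = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)
print(f"total examples: {total_examples}")
for name, share in fractions.items():
    print(f"{name}: {share:.4%}")
print(f"download: {to_mib(DOWNLOAD_BYTES):.1f} MiB, total: {to_mib(TOTAL_BYTES):.1f} MiB")
```

If the load succeeds, each example would be expected to expose the four columns named in the description, including the `lda_lables` column under that exact spelling.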



