speedcell4/wikitext-103
Source: Hugging Face. Updated: 2024-08-08. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/speedcell4/wikitext-103
Description:
The dataset has two configurations, clean and raw. The clean configuration contains 3,805,842 training examples, 8,139 validation examples, and 9,421 test examples, for a total size of 834,650,178 bytes. The raw configuration contains 29,567 training examples, 60 validation examples, and 62 test examples, for a total size of 521,983,100 bytes. In both configurations the main feature is text.
Provider:
speedcell4
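
Usage sketch:
The two configurations described above could be loaded with the Hugging Face `datasets` library roughly as follows. This is a minimal sketch, not taken from the dataset page: the split names `train`, `validation`, and `test` are assumed from the description, and the helpers `total_examples`, `load_config`, and `check_split_sizes` are hypothetical names introduced for illustration. Only the configuration names and the counts come from the listing.

```python
# Split sizes and byte totals copied from the description above.
CONFIGS = {
    "clean": {"train": 3_805_842, "validation": 8_139, "test": 9_421,
              "bytes": 834_650_178},
    "raw": {"train": 29_567, "validation": 60, "test": 62,
            "bytes": 521_983_100},
}

# Assumed split names; the listing only says training/validation/test.
SPLITS = ("train", "validation", "test")


def total_examples(config):
    """Sum the listed example counts over all splits of one configuration."""
    return sum(CONFIGS[config][split] for split in SPLITS)


def load_config(config):
    """Download one configuration from the Hub (needs network access)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown configuration: {config!r}")
    # Imported lazily so the rest of this sketch runs without the library.
    from datasets import load_dataset  # pip install datasets
    return load_dataset("speedcell4/wikitext-103", config)


def check_split_sizes(dataset, config):
    """Compare each loaded split's row count with the listed count."""
    return {split: dataset[split].num_rows == CONFIGS[config][split]
            for split in SPLITS}


if __name__ == "__main__":
    for name in CONFIGS:
        print(name, total_examples(name), "examples listed")
    # Uncomment to download and verify against the listed numbers:
    # print(check_split_sizes(load_config("raw"), "raw"))
```

Calling `load_config("raw")` first is the cheaper way to try this out, since the listing gives that configuration far fewer examples than clean.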



