hyungjikim/wikitext-tags-roberta-v3
Source: Hugging Face. Updated 2025-10-19. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/hyungjikim/wikitext-tags-roberta-v3
Description:
The dataset has training and validation splits. Its features include input IDs, attention masks, terminal and non-terminal IDs, and dependency relation IDs, among others. The total size is listed as approximately 87008 billion bytes: the training split is about 861 billion bytes with about 3.73 million samples, and the validation split is about 870 billion bytes with about 37720 samples.
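To check the features described above against the actual data, one option is to stream a single record and look at what it contains. The sketch below is a minimal illustration, not the provider's own tooling. It assumes the Hugging Face `datasets` package is installed and that the split is named `validation`. The helper names `summarize_example` and `stream_split` are hypothetical, and the column names are read from the record rather than assumed.

```python
from typing import Any, Dict, Iterable


def summarize_example(example: Dict[str, Any]) -> Dict[str, int]:
    """Map each list-valued feature to its length; scalar features are skipped."""
    return {
        name: len(value)
        for name, value in example.items()
        if isinstance(value, (list, tuple))
    }


def stream_split(repo_id: str, split: str) -> Iterable[Dict[str, Any]]:
    """Stream one split from the Hugging Face Hub without downloading it in full.

    Requires the `datasets` package and network access.
    """
    # Imported lazily so summarize_example can be used without the dependency.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split, streaming=True)


# Example usage (needs network access; the split name is an assumption):
# stream = stream_split("hyungjikim/wikitext-tags-roberta-v3", "validation")
# first = next(iter(stream))
# print(summarize_example(first))
```

Streaming avoids fetching the whole training split just to inspect the schema, which matters given the sizes listed above.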
Provider:
hyungjikim



