gokulsrinivasagan/processed_book_corpus-ld-50
Source: Hugging Face. Updated: 2024-12-05. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/gokulsrinivasagan/processed_book_corpus-ld-50
Description:
This dataset contains features for natural language processing tasks: input_ids, attention_mask, special_tokens_mask, and lda_lables. It is split into a training set of 2,277,342 samples and a validation set of 120,706 samples. Features of this kind are typically used for tasks such as text classification or topic modeling.
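As a rough illustration of how the description could be checked against the actual data, the sketch below loads the dataset with the Hugging Face `datasets` package and compares column names and row counts with the figures stated above. The split names "train" and "validation" and the helper names `missing_features` and `load_and_check` are assumptions made for this example and are not taken from the listing; the download step needs network access.

```python
# Feature names and split sizes as stated in the description.
EXPECTED_FEATURES = ("input_ids", "attention_mask", "special_tokens_mask", "lda_lables")
# Assumption: the two splits are named "train" and "validation".
EXPECTED_ROWS = {"train": 2_277_342, "validation": 120_706}

DATASET_ID = "gokulsrinivasagan/processed_book_corpus-ld-50"


def missing_features(column_names):
    """Return the expected feature names that are absent from column_names."""
    present = set(column_names)
    return [name for name in EXPECTED_FEATURES if name not in present]


def load_and_check():
    """Download the dataset and compare it with the description (hypothetical helper)."""
    # Imported lazily so the rest of the module works without the package installed.
    from datasets import load_dataset

    splits = load_dataset(DATASET_ID)  # DatasetDict keyed by split name
    for split_name, expected_rows in EXPECTED_ROWS.items():
        split = splits[split_name]
        print(split_name, split.num_rows, "expected", expected_rows,
              "missing:", missing_features(split.column_names))
    return splits
```

Calling `load_and_check()` prints one line per split; an empty "missing" list and matching row counts would suggest the listing is still in sync with the hosted dataset.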
Provider:
gokulsrinivasagan



