raymondzmc/20_newsgroups_Llama-3.2-1B-Instruct_vocab_2000_last
Source: Hugging Face. Updated 2025-12-16; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/raymondzmc/20_newsgroups_Llama-3.2-1B-Instruct_vocab_2000_last
Description:
This dataset contains structured data for natural language processing tasks. Each example has the following features: ID, context, next word, next-word logits, input embeddings, a bag-of-words representation, and a label. The dataset has 18,846 training examples and a total size of approximately 2.8 GB.
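As a rough illustration of the figures above, the following sketch estimates the average storage per example and shows one way to inspect a single record without downloading everything. It assumes the Hugging Face `datasets` package is installed, that the split is named `train`, and that "2.8GB" means 2.8e9 bytes; none of these is confirmed by the listing. The helper names `approx_bytes_per_example` and `peek_first_example` are hypothetical.

```python
REPO_ID = "raymondzmc/20_newsgroups_Llama-3.2-1B-Instruct_vocab_2000_last"


def approx_bytes_per_example(total_bytes: float, num_examples: int) -> float:
    """Average storage per example, from the totals quoted in the description."""
    return total_bytes / num_examples


def peek_first_example(repo_id: str = REPO_ID) -> dict:
    """Stream one training example instead of downloading the full ~2.8 GB."""
    # Imported lazily so the arithmetic helper works without `datasets`.
    from datasets import load_dataset

    # Assumption: the split is called "train"; the listing only says
    # "training examples".
    stream = load_dataset(repo_id, split="train", streaming=True)
    return next(iter(stream))


# Assumption: "2.8GB" is read as 2.8e9 bytes (decimal gigabytes).
per_example_kb = approx_bytes_per_example(2.8e9, 18_846) / 1e3
print(f"about {per_example_kb:.0f} kB per example")
```

Calling `peek_first_example()` and printing `sorted(example.keys())` on the result is a safer way to learn the actual column names than guessing them from the description.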
Provider:
raymondzmc



