raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last
Source: Hugging Face. Updated: 2025-12-16. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last
Description:
The dataset includes the features id, context, next_word, next_word_logits, input_embeddings, bow, and label. It has a single split, train, with 20,000 examples and a total size of 1,722,022,954 bytes (about 1.72 GB). It is intended for natural language processing tasks; the next_word and next_word_logits features suggest a focus on next-word prediction or a similar task.
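The figures in the description can be checked with a few lines of Python. The following is a minimal sketch, not an official loader: the constant and function names (`LISTED_FEATURES`, `average_bytes_per_example`, `missing_features`, `stream_train_split`) are hypothetical, the repository id is taken from the download link, and the feature dtypes are not stated in the listing, so the code only checks feature names. Streaming is used on the assumption that downloading the full 1.72 GB up front is undesirable.

```python
# Feature names as given in the listing; dtypes are not stated there.
LISTED_FEATURES = [
    "id", "context", "next_word", "next_word_logits",
    "input_embeddings", "bow", "label",
]
TOTAL_BYTES = 1_722_022_954   # total size from the listing
NUM_EXAMPLES = 20_000         # size of the train split from the listing
REPO_ID = "raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last"


def average_bytes_per_example() -> float:
    """Average storage per example implied by the listed totals."""
    return TOTAL_BYTES / NUM_EXAMPLES


def missing_features(record: dict) -> list:
    """Return the listed feature names that are absent from a record."""
    return [name for name in LISTED_FEATURES if name not in record]


def stream_train_split():
    """Open the train split lazily (requires the `datasets` package and network access)."""
    from datasets import load_dataset  # imported here so the helpers above work offline
    return load_dataset(REPO_ID, split="train", streaming=True)


# Example use (needs network access, so it is left commented out):
# first = next(iter(stream_train_split()))
# print(missing_features(first))
```

The listed totals work out to roughly 86 KB per example, which suggests that the array-valued features (next_word_logits, input_embeddings, bow) account for most of the size; this is an inference from the numbers, not something the listing states.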
Provided by:
raymondzmc



