raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last

Name: raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last
Creator: raymondzmc
Published: 2025-12-16 17:15:32
License: No description available

Source: Hugging Face. Updated: 2025-12-16. Indexed: 2025-12-20.

Download link:

https://hf-mirror.com/datasets/raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last

Resource overview:

The dataset includes features such as id, context, next_word, next_word_logits, input_embeddings, bow, and label. It has a single split, train, with 20,000 examples and a total size of 1,722,022,954 bytes (about 1.72 GB). The next_word and next_word_logits features suggest that it targets next-word prediction or a closely related natural language processing task.
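
The figures above can be checked with a short Python sketch. It computes the average size per example from the stated totals and shows one possible way to peek at a single record without downloading the full split. It assumes the Hugging Face `datasets` library is installed and that the repository ID on the Hub matches the name on this page; the helper names and the `LISTED_FEATURES` constant are illustrative, not part of the dataset.

```python
# Values copied from the listing; the repository ID is assumed to be
# the same on the Hugging Face Hub as on this page.
REPO_ID = "raymondzmc/stackoverflow_ERNIE-4.5-0.3B-PT_vocab_2000_last"
NUM_EXAMPLES = 20_000
TOTAL_BYTES = 1_722_022_954
LISTED_FEATURES = [
    "id", "context", "next_word", "next_word_logits",
    "input_embeddings", "bow", "label",
]


def average_bytes_per_example(total_bytes=TOTAL_BYTES, num_examples=NUM_EXAMPLES):
    """Return the mean stored size of one example, in bytes."""
    return total_bytes / num_examples


def peek_first_example(repo_id=REPO_ID):
    """Stream the train split and return its first record as a dict.

    Streaming avoids fetching the whole ~1.72 GB split up front.
    The import is local so the size helper works without `datasets`.
    """
    from datasets import load_dataset

    stream = load_dataset(repo_id, split="train", streaming=True)
    return next(iter(stream))


def missing_features(example, expected=LISTED_FEATURES):
    """List the expected feature names that are absent from a record."""
    return [name for name in expected if name not in example]
```

Calling `average_bytes_per_example()` gives roughly 86,000 bytes per example, which is consistent with each record carrying dense arrays such as next_word_logits and input_embeddings. Calling `missing_features(peek_first_example())` requires network access and would return an empty list if the record has every feature listed here.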

Provider:

raymondzmc
