
JingweiNi/train_prm800k_gpt-oss-120b_annotated_qwen3_1.7b_thinking_5000_response_length_2048

Source: Hugging Face. Updated 2025-12-17; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/JingweiNi/train_prm800k_gpt-oss-120b_annotated_qwen3_1.7b_thinking_5000_response_length_2048
Resource overview:
According to its README, the dataset defines the following features: question, answer, input_ids, reply, original_index, claims (with subfields aligned_token_ids, claim_text, and sentence), and verified. It has a single split, train, with 5,000 examples and a total size of 184,520,857 bytes. The features suggest a question-answering or general natural language processing use, and the claims and verified fields point to possible use in fact-checking or verification.
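The field layout described above can be sketched as a small schema check. This is a minimal illustration only: the field names come from the description, but the value types, the choice to model claims as a list of dictionaries, the helper `missing_fields`, and the sample record are all assumptions made for the example, not data or code from the dataset itself.

```python
# Field names are taken from the dataset description; everything else
# (value types, claims as a list of dicts, the sample record) is assumed.
EXPECTED_FIELDS = {
    "question", "answer", "input_ids", "reply",
    "original_index", "claims", "verified",
}
CLAIM_FIELDS = {"aligned_token_ids", "claim_text", "sentence"}


def missing_fields(record: dict) -> set:
    """Return the expected top-level and per-claim field names absent from record."""
    missing = EXPECTED_FIELDS - set(record)
    for claim in record.get("claims", []):
        missing |= {f"claims.{name}" for name in CLAIM_FIELDS - set(claim)}
    return missing


# Hypothetical record, invented purely to show the assumed shape.
sample_record = {
    "question": "What is 2 + 2?",
    "answer": "4",
    "input_ids": [101, 102, 103],
    "reply": "2 + 2 equals 4.",
    "original_index": 0,
    "claims": [
        {
            "aligned_token_ids": [0, 1, 2],
            "claim_text": "2 + 2 equals 4",
            "sentence": "2 + 2 equals 4.",
        }
    ],
    "verified": [True],
}

# Figures stated in the description: one train split, 5,000 examples.
NUM_EXAMPLES = 5_000
TOTAL_BYTES = 184_520_857
avg_bytes_per_example = TOTAL_BYTES / NUM_EXAMPLES

print(missing_fields(sample_record))
print(f"{avg_bytes_per_example:.1f} bytes per example on average")
```

The average works out to roughly 36.9 KB per example, which is consistent with each record carrying token IDs and a long reply rather than just a short question and answer.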
Provider:
JingweiNi