konwoo/OctoThinker-d50000-v2-tokens-filter900-subsample
Source: Hugging Face. Updated 2025-10-17; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/konwoo/OctoThinker-d50000-v2-tokens-filter900-subsample
Description:
The dataset contains the following fields: text content, ID, cc path, domain, math score, language score, language type, timestamp, URL, and token count. It provides a training set and a validation set for training and validating machine learning models.
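
The field list above can be sketched as a typed record, which may help when validating rows before training. This is a minimal sketch, not the dataset's confirmed schema: the column names (`text`, `id`, `cc_path`, `domain`, `math_score`, `lang_score`, `lang`, `timestamp`, `url`, `token_count`) and the split names are assumptions inferred from the prose description, so check them against the actual dataset before use. `load_split` relies on the `datasets` package and network access; the other helpers run offline.

```python
from dataclasses import dataclass, fields
from typing import Any, Iterable, Mapping


# Hypothetical column names: the listing describes the fields only in prose,
# so verify them against the real dataset schema.
@dataclass
class Record:
    text: str
    id: str
    cc_path: str
    domain: str
    math_score: float
    lang_score: float
    lang: str
    timestamp: str
    url: str
    token_count: int


def to_record(row: Mapping[str, Any]) -> Record:
    """Build a Record from a dict-like row; raises KeyError if a field is missing."""
    return Record(**{f.name: row[f.name] for f in fields(Record)})


def total_tokens(rows: Iterable[Mapping[str, Any]]) -> int:
    """Sum the (assumed) token_count column over an iterable of rows."""
    return sum(to_record(row).token_count for row in rows)


def load_split(split: str = "train"):
    """Load one split from the Hugging Face Hub.

    The split name is an assumption ("train" or "validation" are likely).
    """
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "konwoo/OctoThinker-d50000-v2-tokens-filter900-subsample",
        split=split,
    )
```

If the assumed names match, `total_tokens(load_split("train"))` would give the token total for the training set; a `KeyError` from `to_record` indicates that a column is named differently.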
Provided by:
konwoo



