mlfoundations-dev/fineweb_seed_science
收藏Hugging Face2025-03-07 更新2025-04-12 收录
下载链接:
https://hf-mirror.com/datasets/mlfoundations-dev/fineweb_seed_science
下载链接
链接失效反馈官方服务:
资源简介:
这是一个包含多个字段的数据集,主要用于训练机器学习模型。数据集包含的字段有:分数(score)、文本(text)、网址(url)、年份(year)、问题(problem)、原始行索引(__original_row_idx)、推理过程(reasoning)、DeepSeek解决方案(deepseek_solution)、最终推理轨迹(final_reasoning_trace)和对话(conversations)。对话字段又包含对话来源(from)和对话内容(value)。训练集共有5000个示例,整个数据集的大小为197309980字节。
This is a dataset with multiple fields designed for training machine learning models. The dataset includes fields such as score, text, URL, year, problem, original row index (__original_row_idx), reasoning process, DeepSeek solution, final reasoning trace, and conversation. The conversation field contains the source of the conversation (from) and the content of the conversation (value). The training set has a total of 5000 examples, and the entire dataset is 197309980 bytes in size.
提供机构:
mlfoundations-dev



