marin-community/open-thoughts-4-science-qwen3-32b-annotated

Name: marin-community/open-thoughts-4-science-qwen3-32b-annotated
Creator: marin-community
Published: 2025-11-20 19:02:19
License: No description available

Source: Hugging Face. Updated 2025-11-20. Indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/marin-community/open-thoughts-4-science-qwen3-32b-annotated

Resource description:

---
# For reference on dataset card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/datasetcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/datasets-cards
{}
---

# Dataset Card for Open-Thoughts-4-Science-Qwen3-32B-Annotated

This dataset is the Qwen3-32B annotated version of [mlfoundations-dev/hero_run_4_science](https://huggingface.co/datasets/mlfoundations-dev/hero_run_4_science) curated by the OpenThoughts4 team. We provide the responses from Qwen3-32B in the `generated_text` column. These samples were generated using temperature = 0.8 and max output tokens = 7,500. We note that many of the responses are truncated, so use this dataset wisely!

## Dataset Details

### Dataset Description

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]

### Dataset Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Dataset Structure

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Data Collection and Processing

[More Information Needed]

#### Who are the source data producers?

[More Information Needed]

### Annotations [optional]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

#### Personal and Sensitive Information

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]
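Because the card warns that many responses in `generated_text` are truncated, one option is to screen rows before using them. The following is a minimal sketch of such a screen, not a method described by the dataset authors. It assumes (this is not stated in the card) that responses may wrap their reasoning in Qwen3-style `<think>` and `</think>` tags and that a complete answer ends with closing punctuation. The helper names `looks_truncated` and `keep_complete` are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumption: reasoning is delimited by Qwen3-style tags. The card does not
# confirm that these tags appear in the `generated_text` column.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def looks_truncated(text: str) -> bool:
    """Heuristically flag a response that was probably cut off at the token limit."""
    stripped = text.strip()
    if not stripped:
        return True
    # A reasoning block that opens but never closes suggests the budget ran out.
    if THINK_OPEN in stripped and THINK_CLOSE not in stripped:
        return True
    # Assumption: a finished answer ends with punctuation, a bracket, or a quote.
    return stripped[-1] not in ".!?)]}\"'`$"


def keep_complete(rows: Iterable[Dict[str, str]],
                  column: str = "generated_text") -> List[Dict[str, str]]:
    """Return only the rows whose response does not look truncated."""
    return [row for row in rows if not looks_truncated(row.get(column) or "")]


# Small in-memory stand-in for dataset rows (illustrative text, not real data).
rows = [
    {"generated_text": "<think>Compute the molar mass.</think>The answer is 18 g/mol."},
    {"generated_text": "<think>First, consider the reaction between"},
    {"generated_text": "The equilibrium constant is therefore"},
]
complete_rows = keep_complete(rows)
```

With the Hugging Face `datasets` library, the same predicate could be passed to `Dataset.filter` after loading the repository with `load_dataset`; the available split names are not listed in the card and would need to be checked first. The heuristic will misclassify some rows (for example, complete answers that end in a bare number), so it is better suited to a first pass than to a final quality gate.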

Provider:

marin-community
