apertus-sft-mixture
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/swiss-ai/apertus-sft-mixture
Description:
# Apertus Supervised Finetuning Data
Our supervised finetuning data contains a carefully curated blend of instruction-following datasets,
developed through eight iterations of empirical evaluation. This final mixture comprises approximately
3.8 million examples from diverse sources, balancing general instruction-following, mathematical reasoning,
code generation, and multilingual capabilities.
More details about data provenance, preparation, and statistics can be found in our [tech report](https://github.com/swiss-ai/apertus-tech-report).
Sampling, filtering and data-preparation scripts can be found in [our dedicated GitHub repository](https://github.com/swiss-ai/posttrain-data).
Feel free to [reach out](mailto:sven.najem-meyer@epfl.ch) with any questions or suggestions 😊
Provider:
maas
Created:
2025-09-20



