Eureka-Bench-Logs
Source: ModelScope community (魔搭社区). Updated 2026-01-07; indexed 2025-07-26.
Download link:
https://modelscope.cn/datasets/microsoft/Eureka-Bench-Logs
Description:
This repository contains the logs produced by the Eureka ML Insights framework, which is described in its [GitHub repository](https://github.com/microsoft/eureka-ml-insights).
Relevant links:
- Technical report - [Eureka: Evaluating and Understanding Large Foundation Models](https://arxiv.org/abs/2409.10566)
- Technical report - [Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead](https://arxiv.org/abs/2504.00294)
- Technical report - [Phi-4-reasoning technical report](https://arxiv.org/abs/2504.21318)
- [Project Website](https://microsoft.github.io/eureka-ml-insights)
All logs are organized by benchmark, model family, and model name.
For any questions on the logs or evaluation code, please contact eureka-ml-insights@microsoft.com.
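Because the logs are organized by benchmark, model family, and model name, a downloaded copy can be indexed by those three levels. The following is a minimal sketch, not the repository's own tooling: it assumes the first three path components are, in order, benchmark, model family, and model name, which the description above implies but does not spell out. The function names `group_logs` and `list_local_logs` are hypothetical helpers introduced here for illustration.

```python
from collections import defaultdict
from pathlib import Path, PurePosixPath


def group_logs(paths):
    """Group relative log paths by (benchmark, model_family, model_name).

    ASSUMPTION: the first three path components are benchmark, model
    family, and model name. Adjust the slicing if the real layout differs.
    """
    index = defaultdict(list)
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if len(parts) < 4:
            # Too shallow to carry all three levels plus a file name; skip it.
            continue
        benchmark, family, model = parts[:3]
        index[(benchmark, family, model)].append(raw)
    return dict(index)


def list_local_logs(root):
    """Return POSIX-style paths of all files under root, relative to root."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
```

With a local copy of the dataset in some directory, `group_logs(list_local_logs(that_directory))` would return a dictionary keyed by the three levels, so all files for one model on one benchmark can be looked up together.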
Provider:
maas
Created:
2025-07-22



