SOP-Maze
Source: ModelScope community. Updated: 2026-01-07. Indexed: 2025-11-15.
Download link:
https://modelscope.cn/datasets/meituan-longcat/SOP-Maze
Description:
SOP-Maze is a benchmark designed to evaluate the comprehensive capabilities of large language models (LLMs) in executing tasks that follow Standard Operating Procedures (SOPs).
Provider:
maas
Created:
2025-11-14



