"Power-Flow Benchmark for LLM-based Power System Agent Evaluation (PFBench)"
Source: DataCite Commons | Updated: 2026-03-18 | Indexed: 2026-05-03
Download link:
https://ieee-dataport.org/documents/power-flow-benchmark-llm-based-power-system-agent-evaluation-pfbench
Description:
"PFBench is a reproducible benchmark dataset for power-flow reasoning, structured output generation, and tool-using power-system AI. This release packages frozen scenario records and benchmark question items derived from standard transmission test cases under deterministic perturbations. Each scenario record stores the base-grid reference, mutation specification, full post-mutation input state, AC and DC solver outputs, provenance, and explicit data-quality flags that preserve inherited source-case artifacts rather than silently normalizing them away. Each question item references a parent scenario and includes a prompt, response schema, solver-derived gold answer, and programmatic grading rule. The release is accompanied by schema validation, integrity metadata, archived configuration files, and external pandapower cross-validation, supporting transparent reuse, archival deposition, and reproducible evaluation of power-system agents and structured reasoning systems."
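The description above names the parts of a scenario record and a question item but does not give the on-disk schema. As a rough illustration only, the Python sketch below models the two record types and one possible programmatic grading rule. Every field name, the `abs_tolerance` rule, the `grade` helper, and all example values are hypothetical and are not taken from the PFBench release; consult the published schema files for the actual layout.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ScenarioRecord:
    """Hypothetical mirror of the scenario parts listed in the description."""
    scenario_id: str
    base_grid: str                    # reference to the standard test case
    mutation: dict[str, Any]          # deterministic perturbation spec
    input_state: dict[str, Any]       # full post-mutation input state
    ac_solution: dict[str, Any]       # AC solver outputs
    dc_solution: dict[str, Any]       # DC solver outputs
    provenance: dict[str, Any]
    quality_flags: tuple[str, ...] = ()  # inherited source-case artifacts


@dataclass(frozen=True)
class QuestionItem:
    """Hypothetical mirror of a benchmark question item."""
    question_id: str
    scenario_id: str                  # parent scenario
    prompt: str
    response_schema: dict[str, str]   # field name -> "number" or "string"
    gold_answer: dict[str, Any]       # solver-derived reference answer
    grading_rule: dict[str, Any]      # assumed form: {"type": ..., "tol": ...}


_TYPES = {"number": (int, float), "string": str}


def grade(item: QuestionItem, response: dict[str, Any]) -> bool:
    """Check a response against the schema, then against the gold answer."""
    # Structured output: exactly the expected fields must be present.
    if set(response) != set(item.response_schema):
        return False
    tol = item.grading_rule.get("tol", 0.0)
    for name, kind in item.response_schema.items():
        value, gold = response[name], item.gold_answer[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _TYPES[kind]):
            return False
        if kind == "number":
            if abs(value - gold) > tol:
                return False
        elif value != gold:
            return False
    return True


# Made-up example item; the numbers are illustrative, not benchmark data.
example_item = QuestionItem(
    question_id="q-0001",
    scenario_id="s-0001",
    prompt="Report the voltage magnitude at the monitored bus in per unit.",
    response_schema={"vm_pu": "number"},
    gold_answer={"vm_pu": 1.02},
    grading_rule={"type": "abs_tolerance", "tol": 1e-3},
)

print(grade(example_item, {"vm_pu": 1.0204}))  # within tolerance
print(grade(example_item, {"vm_pu": 1.05}))    # outside tolerance
```

Keeping the grading rule as data next to the gold answer, rather than as free-form code, is one way a benchmark can stay reproducible across graders; whether PFBench encodes its rules this way is not stated in the description.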
Provider:
IEEE DataPort
Created:
2026-03-18



