interfaze-ai/sob

Name: interfaze-ai/sob
Creator: interfaze-ai
Published: 2026-04-28 16:49:54
License: MIT

Source: Hugging Face. Updated: 2026-04-28. Indexed: 2026-05-03.

Download link:

https://hf-mirror.com/datasets/interfaze-ai/sob

Description:

The Structured Output Benchmark (SOB) is a multi-source benchmark for evaluating how accurately large language models (LLMs) produce schema-compliant and value-correct JSON from unstructured or semi-structured context. It covers three source modalities: text (from HotpotQA multi-hop QA), image (from olmOCR-bench PDFs via OCR-extracted markdown), and audio (from AMI Meeting Corpus speaker-labelled transcripts). All modalities are text-normalized to isolate structured-output capability from raw vision/ASR processing quality. SOB measures not just schema compliance ("is the JSON valid?") but also the correctness of values inside the JSON, exposing how accuracy shifts across source modalities. The dataset includes train, validation, and test splits, with detailed schema complexity, context-length profiles, and record statistics. Released under the MIT License.
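
The distinction between schema compliance and value correctness can be illustrated with a minimal Python sketch. The schema, gold record, field names, and scoring functions below are hypothetical and chosen only for illustration; they are not taken from SOB and do not reproduce its official scoring.

```python
import json

# Hypothetical schema (field name -> expected Python type) and gold record.
# These are illustrative assumptions, not fields from the SOB dataset.
SCHEMA = {"title": str, "year": int, "authors": list}
GOLD = {"title": "Example Report", "year": 2020, "authors": ["A. Author", "B. Writer"]}


def parse_json(raw):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


def is_schema_compliant(obj, schema):
    """True if obj is a dict with exactly the schema's keys and exact types."""
    if not isinstance(obj, dict) or set(obj) != set(schema):
        return False
    # type(...) is used instead of isinstance so that True/False is not accepted as int.
    return all(type(obj[key]) is expected for key, expected in schema.items())


def value_accuracy(obj, gold):
    """Fraction of gold fields whose value is reproduced exactly."""
    if not isinstance(obj, dict):
        return 0.0
    correct = sum(1 for key, value in gold.items() if obj.get(key) == value)
    return correct / len(gold)


def score(raw, schema, gold):
    """Report JSON validity, schema compliance, and value accuracy separately."""
    obj = parse_json(raw)
    if obj is None:
        return {"valid_json": False, "schema_ok": False, "value_accuracy": 0.0}
    return {
        "valid_json": True,
        "schema_ok": is_schema_compliant(obj, schema),
        "value_accuracy": value_accuracy(obj, gold),
    }


# A model output that is valid and schema-compliant but has one wrong value.
model_output = '{"title": "Example Report", "year": 2021, "authors": ["A. Author", "B. Writer"]}'
result = score(model_output, SCHEMA, GOLD)
print(result)
```

In this sketch the output passes both the validity and schema checks, yet only two of its three values match the gold record, which is the kind of gap a compliance-only metric would miss.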

Provided by:

interfaze-ai
