
interfaze-ai/sob

Source: Hugging Face. Updated 2026-04-28. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/interfaze-ai/sob
Overview:

The Structured Output Benchmark (SOB) is a multi-source benchmark for evaluating how accurately large language models (LLMs) produce schema-compliant and value-correct JSON from unstructured or semi-structured context. It covers three source modalities: text (from HotpotQA multi-hop QA), image (from olmOCR-bench PDFs via OCR-extracted markdown), and audio (from AMI Meeting Corpus speaker-labelled transcripts). All modalities are text-normalized to isolate structured-output capability from raw vision/ASR processing quality. SOB measures not just schema compliance ("is the JSON valid?") but also the correctness of values inside the JSON, exposing how accuracy shifts across source modalities. The dataset includes train, validation, and test splits, with detailed schema complexity, context-length profiles, and record statistics. Released under the MIT License.
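The difference between schema compliance and value correctness can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the toy `SCHEMA`, the sample record, and the helper names `is_schema_compliant` and `value_accuracy` are hypothetical, and the exact-match scoring is a stand-in, not SOB's actual metric or schema format.

```python
import json

# Hypothetical toy schema: field name -> expected Python type.
# This is a simplification for illustration, not SOB's schema format.
SCHEMA = {"title": str, "year": int, "authors": list}


def is_schema_compliant(raw: str, schema: dict) -> bool:
    """True if raw parses as a JSON object with exactly the expected fields and types."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict) or set(obj) != set(schema):
        return False
    # Exact type match, so a JSON boolean is not accepted where an int is expected.
    return all(type(obj[key]) is expected for key, expected in schema.items())


def value_accuracy(raw: str, gold: dict) -> float:
    """Fraction of gold fields whose predicted value matches exactly (gold must be non-empty)."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(obj, dict):
        return 0.0
    hits = sum(1 for key, value in gold.items() if key in obj and obj[key] == value)
    return hits / len(gold)


# Hypothetical gold record and model output: valid against the schema, but one value is wrong.
gold = {"title": "Dune", "year": 1965, "authors": ["Frank Herbert"]}
pred = '{"title": "Dune", "year": 1966, "authors": ["Frank Herbert"]}'

print(is_schema_compliant(pred, SCHEMA))  # schema check passes
print(value_accuracy(pred, gold))         # but only 2 of 3 values are correct
```

In this sketch the prediction is valid JSON that satisfies the schema, yet one of its three values is wrong. A compliance-only check would not catch that kind of error, which is why a separate value-level score is needed.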
Provider:
interfaze-ai