llm-semantic-router/e2e-halspans

Hugging Face, updated 2026-01-10, indexed 2026-02-07
Download link:
https://hf-mirror.com/datasets/llm-semantic-router/e2e-halspans
Resource overview:
The E2E Hallucination Spans Dataset is a synthetic hallucination detection dataset derived from the restaurant data of the E2E NLG Challenge. It contains 1,500 samples, each with an LLM-generated response and span-level hallucination annotations. The dataset was created to extend RAGTruth's coverage of the Data2txt (structured data to text) task. An LLM generates both faithful and intentionally hallucinated restaurant descriptions from E2E's meaning representations, then annotates the hallucinated spans. The annotated hallucination types include Evident Conflict, Evident Baseless Info, Subtle Baseless Info, and Subtle Conflict. The data is stored as RAGTruth-compatible JSON with fields such as prompt, answer, and labels. The dataset is intended for training, evaluation, and research on LLM hallucination patterns in structured data generation.
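To make the span-level annotation concrete, here is a minimal Python sketch of how one record might be read. Only the top-level field names (prompt, answer, labels) come from the description above. The keys inside each label (start, end, label), the use of character offsets, and the sample record itself are assumptions made for illustration, so check the actual files before relying on them.

```python
# Hypothetical record: the prompt mimics an E2E meaning representation,
# and the answer contradicts it on purpose ("city centre" vs. "riverside").
answer = "The Mill is an Italian pub in the city centre."
conflict = "the city centre"
start = answer.index(conflict)

record = {
    "prompt": "name[The Mill], eatType[pub], food[Italian], area[riverside]",
    "answer": answer,
    # Assumed label layout: character offsets into `answer` plus a type name.
    "labels": [
        {"start": start, "end": start + len(conflict), "label": "Evident Conflict"}
    ],
}


def hallucinated_spans(rec):
    """Return (span_text, label_type) pairs for every annotated span."""
    text = rec["answer"]
    return [(text[lab["start"]:lab["end"]], lab["label"]) for lab in rec["labels"]]


spans = hallucinated_spans(record)
for span_text, label_type in spans:
    print(f"{label_type}: {span_text!r}")
```

A faithful sample would presumably carry an empty labels list, in which case the helper returns an empty list.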
Provider:
llm-semantic-router