
IOI Experiment Dataset

Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/frankaging/pyvene/blob/main/tutorials/advanced_tutorials/IOI_with_DAS.ipynb
Description:
This dataset is designed to evaluate how well models identify "hallucinations" tied to the positions of names within sentences, with a particular focus on their ability to recall facts expressed as subject-relation-object triples. It also covers experiments that assess model behavior under intervention, using metrics such as Interchange Intervention Accuracy (IIA) and Fractional Logit Difference Decrease (FLDD). The tasks are sentence completion and fact recall.
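
The two metrics named above, IIA and FLDD, can be illustrated with a minimal sketch. It is not taken from the linked notebook: the function names are hypothetical, and the definitions are assumptions. IIA is assumed to be the fraction of examples where the model's output after intervention matches the counterfactual label, and FLDD is assumed to be the relative drop in logit difference caused by the intervention.

```python
from typing import Sequence


def interchange_intervention_accuracy(
    predicted: Sequence[str], counterfactual: Sequence[str]
) -> float:
    """Fraction of examples whose post-intervention prediction equals
    the counterfactual label (assumed definition of IIA)."""
    if len(predicted) != len(counterfactual):
        raise ValueError("predicted and counterfactual must have equal length")
    if len(predicted) == 0:
        raise ValueError("at least one example is required")
    matches = sum(p == c for p, c in zip(predicted, counterfactual))
    return matches / len(predicted)


def fractional_logit_diff_decrease(clean_diff: float, patched_diff: float) -> float:
    """Relative decrease in logit difference after intervention
    (assumed definition of FLDD): (clean - patched) / clean."""
    if clean_diff == 0:
        raise ValueError("clean logit difference must be non-zero")
    return (clean_diff - patched_diff) / clean_diff


# Hypothetical values for illustration only, not results from the dataset.
iia = interchange_intervention_accuracy(
    ["Mary", "John", "Anna"], ["Mary", "John", "Tom"]
)
fldd = fractional_logit_diff_decrease(clean_diff=4.0, patched_diff=1.0)
print(f"IIA = {iia:.3f}, FLDD = {fldd:.2f}")
```

Under these assumed definitions, an FLDD of 0 means the intervention left the logit difference unchanged, 1 means it removed the model's preference entirely, and values above 1 mean the preference flipped.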
Provided by:
Publicly released codebase by Makelov et al.