rsbench

arXiv · Updated 2024-06-15 · Indexed 2024-06-19
Download link:
https://unitn-sml.github.io/rsbench
Resource overview:
rsbench is a comprehensive benchmark suite developed by the University of Trento for systematically evaluating how reasoning shortcuts (RSs) affect models on tasks that combine learning and reasoning. It covers several task types, including arithmetic, logical, and high-stakes tasks, and provides data generators for evaluating out-of-distribution scenarios. rsbench supports both neural network models and neurosymbolic models. It implements quality evaluation metrics and novel formal verification procedures that help researchers understand and improve model behavior on complex tasks. The suite targets reasoning shortcuts that can arise during learning and reasoning, particularly in high-stakes applications such as autonomous driving decision-making.
Provided by:
University of Trento
Created:
2024-06-15