Time referenced List based Question Answering (TLQA)

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/elixir-research-group/TLQA
Description:
This dataset is the TLQA benchmark. It requires answers to be presented as lists whose items are aligned with their corresponding time periods, and it evaluates the list construction and temporal understanding capabilities of large language models (LLMs). The benchmark covers three evaluation settings: gold evidence, closed-book, and open-domain, with a focus on the performance of different LLMs.
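
To make the answer format concrete, here is a minimal Python sketch of a list answer whose items carry time periods, with two simple comparison helpers. Everything in it is an assumption made for illustration: the `TimedAnswer` class, the field names, the placeholder entities ("Team A" and so on), and both scoring functions are hypothetical and are not taken from the TLQA repository or its official metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedAnswer:
    """One list item with the period it applies to (hypothetical schema)."""
    item: str
    start_year: int
    end_year: int  # inclusive


def item_f1(predicted, gold):
    """F1 over item names only, ignoring time periods (illustrative metric)."""
    pred_items = {a.item.lower() for a in predicted}
    gold_items = {a.item.lower() for a in gold}
    overlap = len(pred_items & gold_items)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_items)
    recall = overlap / len(gold_items)
    return 2 * precision * recall / (precision + recall)


def year_overlap(pred, gold):
    """Jaccard overlap of two inclusive year ranges (illustrative metric)."""
    inter = min(pred.end_year, gold.end_year) - max(pred.start_year, gold.start_year) + 1
    if inter <= 0:
        return 0.0
    union = max(pred.end_year, gold.end_year) - min(pred.start_year, gold.start_year) + 1
    return inter / union


# Placeholder data for a question such as "Which teams did player X play for?"
gold = [TimedAnswer("Team A", 2001, 2004), TimedAnswer("Team B", 2005, 2010)]
predicted = [TimedAnswer("Team A", 2001, 2003), TimedAnswer("Team C", 2005, 2010)]

print(item_f1(predicted, gold))             # list-construction quality
print(year_overlap(predicted[0], gold[0]))  # temporal alignment for "Team A"
```

Scoring the item names and the time periods separately mirrors the two abilities the benchmark description names, list construction and temporal understanding. The actual evaluation protocol should be taken from the repository linked above.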