
facebook/gistbench

Source: Hugging Face | Updated: 2026-04-07 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/facebook/gistbench
Description:
---
license: cc-by-nc-4.0
task_categories:
- text-classification
language:
- en
size_categories:
- 1M<n<10M
---

# GISTBench: Groundedness & Interest Specificity Test Bench

GISTBench evaluates how well LLMs understand users from their engagement history. Given a user's interactions with content (videos, articles, books, etc.), the benchmark measures whether an LLM can extract meaningful interests, ground them in evidence, and cite specific relevant items.

## Dataset Details

- **Rows:** 4,214,059 engagements
- **Users:** ~1,000 anonymized users
- **Format:** Parquet

### Schema

| Column | Type | Description |
|---|---|---|
| `user_id` | `int64` | Anonymized user identifier |
| `object_id` | `int64` | Anonymized content item identifier |
| `object_text` | `string` | Text description of the content item |
| `interaction_type` | `string` | One of: `explicit_positive`, `implicit_positive`, `implicit_negative`, `explicit_negative` |
| `interaction_time` | `string` | Anonymized interaction timestamp |

### Interaction Types

| Type | Meaning | Examples |
|---|---|---|
| `explicit_positive` | User actively expressed positive interest | Liked, favorited, rated highly |
| `implicit_positive` | Passive positive signal | Watched fully, clicked through |
| `implicit_negative` | Passive negative signal | Scrolled past, skipped |
| `explicit_negative` | User actively expressed dislike | Downvoted, reported, rated low |

## Usage

```python
from datasets import load_dataset

ds = load_dataset("facebook/gistbench")
print(ds)
# DatasetDict({
#     train: Dataset({
#         features: ['interaction_type', 'user_id', 'object_id', 'interaction_time', 'object_text'],
#         num_rows: 4214059
#     })
# })

# Access rows
print(ds["train"][0])
```

## Citation

If you use GISTBench in your research, please cite:

```bibtex
@misc{fostiropoulos2026gistbench,
  title={GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification},
  author={Iordanis Fostiropoulos and Muhammad Rafay Azhar and Abdalaziz Sawwan and Boyu Fang and Yuchen Liu and Jiayi Liu and Hanchao Yu and Qi Guo and Jianyu Wang and Fei Liu and Xiangjun Fan},
  year={2026},
  eprint={2603.29112},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.29112}
}
```

## License

This dataset is released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
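
## Example: Grouping a User's Engagement History

Because the benchmark works from each user's engagement history, a common first step is likely to be collecting one user's items by interaction type. The following is a minimal sketch of that step, not part of the dataset card: the helper name `group_history` and the sample rows are hypothetical, and the `interaction_time` values are placeholders because the card does not specify the timestamp format. It only assumes that rows are dictionaries with the columns listed in the schema.

```python
from collections import defaultdict

# The four interaction types listed in the dataset card.
INTERACTION_TYPES = (
    "explicit_positive",
    "implicit_positive",
    "implicit_negative",
    "explicit_negative",
)


def group_history(rows):
    """Return {user_id: {interaction_type: [object_text, ...]}}.

    Hypothetical helper: `rows` is any iterable of dicts that carry the
    columns named in the schema.
    """
    history = defaultdict(lambda: {t: [] for t in INTERACTION_TYPES})
    for row in rows:
        kind = row["interaction_type"]
        if kind not in INTERACTION_TYPES:
            raise ValueError(f"unexpected interaction_type: {kind!r}")
        history[row["user_id"]][kind].append(row["object_text"])
    return dict(history)


# Made-up rows that follow the schema; they are NOT taken from the dataset.
sample_rows = [
    {"user_id": 1, "object_id": 10, "object_text": "Intro to sourdough baking",
     "interaction_type": "explicit_positive", "interaction_time": "placeholder-0"},
    {"user_id": 1, "object_id": 11, "object_text": "Celebrity gossip roundup",
     "interaction_type": "implicit_negative", "interaction_time": "placeholder-1"},
    {"user_id": 2, "object_id": 10, "object_text": "Intro to sourdough baking",
     "interaction_type": "implicit_positive", "interaction_time": "placeholder-2"},
]

history = group_history(sample_rows)
for user_id, by_type in history.items():
    positives = by_type["explicit_positive"] + by_type["implicit_positive"]
    print(user_id, "positive items:", positives)
```

With the real data, passing `ds["train"]` from the usage snippet to the same helper should work, since iterating a `datasets.Dataset` yields one dictionary per row; with roughly 4.2 million rows, expect that pass to take a noticeable amount of time and memory.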
Provider:
facebook