shared-imagination

Name: shared-imagination
Creator: maas
Published: 2025-08-22 16:43:29
License: no description provided

ModelScope Community, updated 2025-08-22, indexed 2025-08-16

Download link:

https://modelscope.cn/datasets/Salesforce/shared-imagination


Resource description:

# Dataset Card for Shared Imagination

This dataset contains the problems used in the paper Shared Imagination: LLMs Hallucinate Alike.

## Dataset Description

This dataset contains the questions generated for the investigations described in the TMLR paper [Shared Imagination: LLMs Hallucinate Alike](https://arxiv.org/pdf/2407.16604).

If you want to use this dataset to assess new models, please use the `default` config (i.e., `datasets.load_dataset('Salesforce/shared-imagination')`). This config contains questions whose four candidate choices have been shuffled; it is the one used in most experiments in the paper.

If you want to study the impact of choice shuffling, you can evaluate models on the `no_choice_shuffle` config (i.e., `datasets.load_dataset('Salesforce/shared-imagination', 'no_choice_shuffle')`). This config contains questions with candidate choices recorded in the original ordering produced by the question-generation model.

Under each config, there are four splits:

* `direct_questions`
* `context_questions`
* `direct_questions_creative`
* `context_questions_creative`

The first two are the MMLU-style questions used in the majority of experiments. The last two are questions about creative writing stories generated for the experiment in Sec. 3.6 of the paper.

Each instance has the following fields:

* `model`: the model that generated the question.
* `category`: the category of the question, one of `['mathematics', 'computer science', 'physics', 'chemistry', 'biology', 'geography', 'sociology', 'psychology', 'economics', 'accounting', 'marketing', 'law', 'politics', 'history', 'literature', 'philosophy', 'religion']` for the MMLU-style questions, and one of `['friendship', 'family relationship', 'a childhood in poverty', 'young adulthood', 'an interpersonal conflict', 'a roadtrip', 'an ancient empire', 'a long-lasting war', 'future technology', 'an intergalactic civilization']` for the creative writing questions.
* `idx`: index of the question, 0-19 for MMLU-style questions and 0-9 for creative writing questions.
* `question`: the text of the question.
* `choices`: the list of four choices, already shuffled in the `default` config and order-preserved in the `no_choice_shuffle` config.
* `label`: (0-based) index of the correct choice.
* `context`: the knowledge paragraph for the MMLU-style context questions and the short story for the creative writing context questions. For direct questions, the value is 'N/A'.
* `concept`: the concept for the MMLU-style context questions. For all other questions, the value is 'N/A'.

- **Curated by:** Yilun Zhou
- **Language:** English
- **License:** MIT
- **Paper:** https://arxiv.org/pdf/2407.16604
- **Website:** https://yilunzhou.github.io/shared-imagination/
- **Contact:** yilun.zhou@salesforce.com

## Citation

If you use this dataset in a scholarly publication, please cite the paper:

```
@article{zhou2025shared,
  title={Shared Imagination: LLMs Hallucinate Alike},
  author={Zhou, Yilun and Xiong, Caiming and Savarese, Silvio and Wu, Chien-Sheng},
  journal={Transactions on Machine Learning Research},
  year={2025}
}
```

## Ethical Considerations

This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people's lives, rights, or safety. For further guidance on use cases, refer to our [AUP](https://www.salesforce.com/en-us/wp-content/uploads/sites/4/documents/legal/Agreements/policies/ExternalFacing_Services_Policy.pdf) and [AI AUP](https://www.salesforce.com/en-us/wp-content/uploads/sites/4/documents/legal/Agreements/policies/ai-acceptable-use-policy.pdf).
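
## Example usage (illustrative sketch)

The loading instructions and field layout described in the Dataset Description might be used roughly as follows. This is a minimal sketch, not the authors' evaluation code: the helper names (`load_split`, `format_prompt`, `accuracy`), the A-D lettering, the "Answer:" suffix, and the `EXAMPLE` instance are all assumptions made here for illustration. Only the repository name, config names, split names, and field names come from the card.

```python
from typing import Any, Dict, List, Optional

REPO_ID = "Salesforce/shared-imagination"  # repository name given in the card
LETTERS = "ABCD"  # assumption: label the four choices A-D in the prompt

# Hypothetical instance, invented only to illustrate the documented fields.
EXAMPLE: Dict[str, Any] = {
    "model": "example-model",
    "category": "physics",
    "idx": 0,
    "question": "Which quantity is conserved in the made-up Example effect?",
    "choices": ["Energy", "Charge", "Momentum", "Spin"],
    "label": 2,
    "context": "N/A",
    "concept": "N/A",
}


def load_split(split: str = "direct_questions", config: Optional[str] = None):
    """Load one split, mirroring the two calls shown in the card."""
    from datasets import load_dataset  # lazy import; requires network access

    if config is None:
        # `default` config: choices already shuffled.
        return load_dataset(REPO_ID, split=split)
    # e.g. config="no_choice_shuffle": original choice ordering.
    return load_dataset(REPO_ID, config, split=split)


def format_prompt(instance: Dict[str, Any]) -> str:
    """Render an instance as a multiple-choice prompt (assumed format)."""
    lines: List[str] = []
    if instance["context"] != "N/A":
        # Context questions carry a knowledge paragraph or short story.
        lines.append(instance["context"])
    lines.append(instance["question"])
    for letter, choice in zip(LETTERS, instance["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(instances: List[Dict[str, Any]], predictions: List[int]) -> float:
    """Fraction of predicted 0-based indices that match `label`."""
    if not instances or len(instances) != len(predictions):
        raise ValueError("need equally sized, non-empty inputs")
    hits = sum(int(p == inst["label"]) for inst, p in zip(instances, predictions))
    return hits / len(instances)
```

To compare the effect of choice shuffling, one could call `load_split("direct_questions")` and `load_split("direct_questions", "no_choice_shuffle")` and score the same model on both with `accuracy`.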


Provider:

maas

Created:

2025-08-15
