sgans/CleanSmall
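CLEAN Dataset Overview
Dataset Introduction
CLEAN is a dataset for studying how large language models (LLMs) answer questions when precise numerical information is unavailable. It requires LLMs to use the provided context to make a reasonable estimate, and it provides an acceptable, realistic range for each question. The dataset covers several categories, such as sports, music, history, and gaming.
Dataset Size
This is a small version of the dataset, containing only 100 questions. It is designed for low-cost testing of how well current LLMs handle this type of question.
LLM Result Examples
LLAMA2 70B Error Example
Question: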
As the city's elite gathered, the grand opening of La Table Étoilée, the new French restaurant, was the talk of the town. The chefs, flown in from Paris, bustled in the kitchen, their expertise evident in the delicate balance of flavors on each plate. The eyes of critics shone with anticipation, cutlery poised over what promised to be a symphony of taste.
The sommelier navigated the intricacies of the wine list, recommending perfect pairings for the rich and complex dishes being served. Waiters glided between tables, the clinking of fine crystal and china setting the rhythm of an unforgettable night. La Table Étoilée wasn't just serving dinner; it was hosting an experience, a dance of cuisine and culture.
As the night dwindled, the patron of the evening, a connoisseur of the culinary arts, left a generous tip, his expression one of satisfaction and subtle delight. He knew the staff had gone to great lengths to ensure the evening was nothing short of perfection.
What was the value of the connoisseur's tip?
LLAMA2 70B answer:
25
True answer:
[100, 1000]
GPT4 TURBO Error Example
Question:
In the mystical realm of Eldoria, Aric the Swift navigated treacherous terrain and vanquished foes with uncanny agility. His eyes, ever-fixed on the horizon, sought the legendary Crystal of Tarkus, rumored to lie within the heart of the Forsaken Mountains.
Banding together with Miara the Mage and Loric the Stout, Aric ventured deeper into the maw of unknown lands. Together, they faced mythical beasts and deciphered ancient riddles, all for a glimpse of the Crystal's radiant gleam.
Finally, after enduring trials that would break lesser warriors, Aric's fellowship beheld the Crystal of Tarkus, pulsing with an ethereal light. With reverence, they received its power, forever altering the fates of those in Eldoria and beyond.
How many mythical beasts did the trio encounter?
GPT4 TURBO answer:
4
True answer:
[10, 50]
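The two error examples above can be checked mechanically. The following is a minimal sketch of one way to do that, not the dataset authors' evaluation code: the helper names `extract_number` and `is_acceptable` are hypothetical, the range bounds are assumed to be inclusive, and the model reply is assumed to contain its estimate as the first number in the text.

```python
import re
from typing import Optional, Tuple

# Matches numbers with thousands separators (1,500) or plain digits (25, 3.5).
_NUMBER = re.compile(r"-?\d{1,3}(?:,\d{3})+(?:\.\d+)?|-?\d+(?:\.\d+)?")


def extract_number(response: str) -> Optional[float]:
    """Return the first number in a model response, or None if there is none.

    Assumption: the first number in the reply is the model's estimate.
    """
    match = _NUMBER.search(response)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def is_acceptable(response: str, answer_range: Tuple[float, float]) -> bool:
    """Check whether the extracted estimate lies inside the accepted range.

    Assumption: both bounds of the range are inclusive.
    """
    low, high = answer_range
    value = extract_number(response)
    return value is not None and low <= value <= high


# The two error examples from this card: (model, model answer, true range).
examples = [
    ("LLAMA2 70B", "25", (100, 1000)),
    ("GPT4 TURBO", "4", (10, 50)),
]

for model, response, answer_range in examples:
    print(model, is_acceptable(response, answer_range))
# prints:
# LLAMA2 70B False
# GPT4 TURBO False
```

Both answers fall below the lower bound of their range, which is why they are listed as errors. A reply with no number at all is also treated as incorrect in this sketch.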
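Future Work
- Improve the prompts given to LLMs so that a wider range of LLMs can be studied in more detail.
- Find a prompt that can elicit correct answers from Mixtral8x7B.
- Increase the size of the dataset to create a training set for fine-tuning.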



