
K-and-K/knights-and-knaves

Source: Hugging Face. Updated 2024-10-31. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/K-and-K/knights-and-knaves
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- question-answering
language:
- en
configs:
- config_name: train
  data_files:
  - split: 2ppl
    path:
    - train/people2_num200.jsonl
  - split: 3ppl
    path:
    - train/people3_num1000.jsonl
  - split: 4ppl
    path:
    - train/people4_num1000.jsonl
  - split: 5ppl
    path:
    - train/people5_num1000.jsonl
  - split: 6ppl
    path:
    - train/people6_num1000.jsonl
  - split: 7ppl
    path:
    - train/people7_num1000.jsonl
  - split: 8ppl
    path:
    - train/people8_num1000.jsonl
- config_name: test
  data_files:
  - split: 2ppl
    path:
    - test/people2_num100.jsonl
  - split: 3ppl
    path:
    - test/people3_num100.jsonl
  - split: 4ppl
    path:
    - test/people4_num100.jsonl
  - split: 5ppl
    path:
    - test/people5_num100.jsonl
  - split: 6ppl
    path:
    - test/people6_num100.jsonl
  - split: 7ppl
    path:
    - test/people7_num100.jsonl
  - split: 8ppl
    path:
    - test/people8_num100.jsonl
tags:
- logical
- reasoning
pretty_name: K
size_categories:
- 1K<n<10K
---

# 📘 knights-and-knaves Dataset [[Project Page]](https://memkklogic.github.io/)

The **knights-and-knaves dataset** serves as a logical reasoning benchmark to evaluate the reasoning capabilities of LLMs.

**🚀🚀 Check out the [perturbed knights-and-knaves dataset](https://huggingface.co/datasets/K-and-K/perturbed-knights-and-knaves) to evaluate the memorization of LLMs in reasoning.**

## Loading the dataset

To load the dataset:

```python
from datasets import load_dataset
data_subject = load_dataset('K-and-K/knights-and-knaves','test',split="2ppl")
```

* Available subset: `test`, `train`.
* Available split: `2ppl`,`3ppl`,`4ppl`,`5ppl`,`6ppl`,`7ppl`,`8ppl`.

## 🛠️ Codebase

To evaluate LLMs on our datasets, visit our [GitHub repository](https://github.com/AlphaPav/mem-kk-logic/).
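
As a follow-up to the loading instructions above, the sketch below shows one way to iterate over every split of a subset. It is an illustration rather than part of the official codebase: the helper names `expected_file` and `load_all_splits` are hypothetical, and the file paths are reconstructed only from the `configs` block in this card's front matter. `load_all_splits` assumes the `datasets` package is installed and the Hub is reachable.

```python
# Split and subset names as listed in the dataset card.
SPLITS = [f"{n}ppl" for n in range(2, 9)]  # "2ppl" ... "8ppl"
CONFIGS = ("train", "test")


def expected_file(config: str, split: str) -> str:
    """Return the JSONL path the card's YAML lists for a subset/split pair.

    Hypothetical helper: it only mirrors the naming pattern in the card's
    `configs` block and does not touch the network.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown subset: {config!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    n_people = int(split[: -len("ppl")])
    if config == "test":
        count = 100  # every test file is named ..._num100.jsonl
    else:
        count = 200 if n_people == 2 else 1000  # train 2ppl is num200
    return f"{config}/people{n_people}_num{count}.jsonl"


def load_all_splits(config: str = "test") -> dict:
    """Load every split of one subset into a dict keyed by split name.

    Hypothetical helper: requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # imported lazily on purpose

    return {
        split: load_dataset("K-and-K/knights-and-knaves", config, split=split)
        for split in SPLITS
    }
```

Calling `load_all_splits("test")` would return a dictionary with one entry per split, so that, for example, the `"5ppl"` entry can be evaluated separately from the `"2ppl"` entry. Keeping the import inside the function means `expected_file` can be used without installing `datasets`.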

## ⭐ Citing our Work

If you find our codebase and datasets beneficial, kindly cite our work:

```bibtex
@article{xie2024memorization,
  title={On Memorization of Large Language Models in Logical Reasoning},
  author={Chulin Xie and Yangsibo Huang and Chiyuan Zhang and Da Yu and Xinyun Chen and Bill Yuchen Lin and Bo Li and Badih Ghazi and Ravi Kumar},
  year={2024},
  eprint={2410.23123},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.23123},
}
```
Provider:
K-and-K