
K-and-K/perturbed-knights-and-knaves

Source: Hugging Face. Updated 2024-10-31. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/K-and-K/perturbed-knights-and-knaves
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- question-answering
language:
- en
configs:
- config_name: train
  data_files:
  - split: perturbed_leaf
    path:
    - train/perturbed_leaf/people2_num200.jsonl
    - train/perturbed_leaf/people3_num1000.jsonl
    - train/perturbed_leaf/people4_num1000.jsonl
    - train/perturbed_leaf/people5_num1000.jsonl
    - train/perturbed_leaf/people6_num1000.jsonl
    - train/perturbed_leaf/people7_num1000.jsonl
    - train/perturbed_leaf/people8_num1000.jsonl
  - split: perturbed_statement
    path:
    - train/perturbed_statement/people2_num200.jsonl
    - train/perturbed_statement/people3_num1000.jsonl
    - train/perturbed_statement/people4_num1000.jsonl
    - train/perturbed_statement/people5_num1000.jsonl
    - train/perturbed_statement/people6_num1000.jsonl
    - train/perturbed_statement/people7_num1000.jsonl
    - train/perturbed_statement/people8_num1000.jsonl
  - split: reorder_statement
    path:
    - train/reorder_statement/people2_num200.jsonl
    - train/reorder_statement/people3_num1000.jsonl
    - train/reorder_statement/people4_num1000.jsonl
    - train/reorder_statement/people5_num1000.jsonl
    - train/reorder_statement/people6_num1000.jsonl
    - train/reorder_statement/people7_num1000.jsonl
    - train/reorder_statement/people8_num1000.jsonl
  - split: random_pair
    path:
    - train/random_pair/people2_num200.jsonl
    - train/random_pair/people3_num1000.jsonl
    - train/random_pair/people4_num1000.jsonl
    - train/random_pair/people5_num1000.jsonl
    - train/random_pair/people6_num1000.jsonl
    - train/random_pair/people7_num1000.jsonl
    - train/random_pair/people8_num1000.jsonl
  - split: uncommon_name
    path:
    - train/uncommon_name/people2_num200.jsonl
    - train/uncommon_name/people3_num1000.jsonl
    - train/uncommon_name/people4_num1000.jsonl
    - train/uncommon_name/people5_num1000.jsonl
    - train/uncommon_name/people6_num1000.jsonl
    - train/uncommon_name/people7_num1000.jsonl
    - train/uncommon_name/people8_num1000.jsonl
  - split: flip_role
    path:
    - train/flip_role/people2_num200.jsonl
    - train/flip_role/people3_num1000.jsonl
    - train/flip_role/people4_num1000.jsonl
    - train/flip_role/people5_num1000.jsonl
    - train/flip_role/people6_num1000.jsonl
    - train/flip_role/people7_num1000.jsonl
    - train/flip_role/people8_num1000.jsonl
- config_name: test
  data_files:
  - split: perturbed_leaf
    path:
    - test/perturbed_leaf/people2_num100.jsonl
    - test/perturbed_leaf/people3_num100.jsonl
    - test/perturbed_leaf/people4_num100.jsonl
    - test/perturbed_leaf/people5_num100.jsonl
    - test/perturbed_leaf/people6_num100.jsonl
    - test/perturbed_leaf/people7_num100.jsonl
    - test/perturbed_leaf/people8_num100.jsonl
  - split: perturbed_statement
    path:
    - test/perturbed_statement/people2_num100.jsonl
    - test/perturbed_statement/people3_num100.jsonl
    - test/perturbed_statement/people4_num100.jsonl
    - test/perturbed_statement/people5_num100.jsonl
    - test/perturbed_statement/people6_num100.jsonl
    - test/perturbed_statement/people7_num100.jsonl
    - test/perturbed_statement/people8_num100.jsonl
  - split: reorder_statement
    path:
    - test/reorder_statement/people2_num100.jsonl
    - test/reorder_statement/people3_num100.jsonl
    - test/reorder_statement/people4_num100.jsonl
    - test/reorder_statement/people5_num100.jsonl
    - test/reorder_statement/people6_num100.jsonl
    - test/reorder_statement/people7_num100.jsonl
    - test/reorder_statement/people8_num100.jsonl
  - split: random_pair
    path:
    - test/random_pair/people2_num100.jsonl
    - test/random_pair/people3_num100.jsonl
    - test/random_pair/people4_num100.jsonl
    - test/random_pair/people5_num100.jsonl
    - test/random_pair/people6_num100.jsonl
    - test/random_pair/people7_num100.jsonl
    - test/random_pair/people8_num100.jsonl
  - split: uncommon_name
    path:
    - test/uncommon_name/people2_num100.jsonl
    - test/uncommon_name/people3_num100.jsonl
    - test/uncommon_name/people4_num100.jsonl
    - test/uncommon_name/people5_num100.jsonl
    - test/uncommon_name/people6_num100.jsonl
    - test/uncommon_name/people7_num100.jsonl
    - test/uncommon_name/people8_num100.jsonl
  - split: flip_role
    path:
    - test/flip_role/people2_num100.jsonl
    - test/flip_role/people3_num100.jsonl
    - test/flip_role/people4_num100.jsonl
    - test/flip_role/people5_num100.jsonl
    - test/flip_role/people6_num100.jsonl
    - test/flip_role/people7_num100.jsonl
    - test/flip_role/people8_num100.jsonl
tags:
- logical
- reasoning
size_categories:
- 1K<n<10K
---

# 📘 perturbed-knights-and-knaves Dataset [[Project Page]](https://memkklogic.github.io/)

The **perturbed-knights-and-knaves** dataset evaluates the consistency of LLMs' logical reasoning ability under various perturbations.

**🚀🚀 Check out the clean version of the dataset at [[knights-and-knaves]](https://huggingface.co/datasets/K-and-K/knights-and-knaves).**

## Loading the dataset

To load the dataset, replace the `{subset}`, `{perturbation}`, and `{subject}` placeholders with values from the lists below:

```python
from datasets import load_dataset

# Substitute the placeholders with one subset, perturbation, and subject.
data_subject = load_dataset('K-and-K/perturbed-knights-and-knaves', data_files="{subset}/{perturbation}/{subject}.jsonl")
```

* Available subsets: `test`, `train`.
* Available perturbations: `perturbed_leaf`, `perturbed_statement`, `reorder_statement`, `random_pair`, `uncommon_name`, `flip_role`.
* Available subjects:
    * for the `train` subset: `people2_num200`, `people3_num1000`, ..., `people8_num1000`
    * for the `test` subset: `people2_num100`, `people3_num100`, ..., `people8_num100`

## 🛠️ Codebase

To evaluate LLMs on our datasets, visit our [GitHub repository](https://github.com/AlphaPav/mem-kk-logic/).

## ⭐ Citing our Work

If you find our codebase and datasets beneficial, kindly cite our work:

```bibtex
@article{xie2024memorization,
  title={On Memorization of Large Language Models in Logical Reasoning},
  author={Chulin Xie and Yangsibo Huang and Chiyuan Zhang and Da Yu and Xinyun Chen and Bill Yuchen Lin and Bo Li and Badih Ghazi and Ravi Kumar},
  year={2024},
  eprint={2410.23123},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.23123},
}
```
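
Returning to the loading instructions above, the `data_files` placeholder can be filled in programmatically. The following is a minimal sketch, not part of the official codebase: the helper names `build_data_file` and `load_subject` are hypothetical, and the sample counts (200 for two-person `train` puzzles, 1000 for other `train` files, 100 for every `test` file) are read off the file names listed in the dataset card.

```python
from typing import Tuple

# Perturbation names as listed in the dataset card.
PERTURBATIONS: Tuple[str, ...] = (
    "perturbed_leaf",
    "perturbed_statement",
    "reorder_statement",
    "random_pair",
    "uncommon_name",
    "flip_role",
)


def build_data_file(subset: str, perturbation: str, n_people: int) -> str:
    """Return the relative JSONL path for one subject file (hypothetical helper)."""
    if subset not in ("train", "test"):
        raise ValueError(f"unknown subset: {subset!r}")
    if perturbation not in PERTURBATIONS:
        raise ValueError(f"unknown perturbation: {perturbation!r}")
    if not 2 <= n_people <= 8:
        raise ValueError("n_people must be between 2 and 8")
    # Sample counts follow the file names in the card's YAML header.
    if subset == "test":
        num = 100
    else:
        num = 200 if n_people == 2 else 1000
    return f"{subset}/{perturbation}/people{n_people}_num{num}.jsonl"


def load_subject(subset: str, perturbation: str, n_people: int):
    """Load one subject file; needs the `datasets` package and network access."""
    # Imported lazily so the path helper above also works offline.
    from datasets import load_dataset

    return load_dataset(
        "K-and-K/perturbed-knights-and-knaves",
        data_files=build_data_file(subset, perturbation, n_people),
    )


print(build_data_file("test", "flip_role", 3))  # test/flip_role/people3_num100.jsonl
```

Validating the arguments before building the path means a typo in a perturbation name fails locally with a clear message instead of as a missing-file error from the Hub.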
Provider:
K-and-K