omega-compositional
Source: ModelScope community; updated 2025-12-05; indexed 2025-06-28
Download link:
https://modelscope.cn/datasets/allenai/omega-compositional
Description:
# Compositional Math Problems
This dataset combines all compositional mathematical problem settings from the paper "OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization", with proper train/test splits. Each compositional setting includes training data drawn from individual mathematical domains and test data consisting of compositional problems that require cross-domain reasoning.
## Quick Start
```python
from datasets import load_dataset
# Load all compositional settings
dataset = load_dataset("allenai/omega-compositional")
# Load a specific compositional setting with train/test splits
circles_data = load_dataset("allenai/omega-compositional", "comp_circles_algebra")
train_data = circles_data["train"] # Training problems from individual domains
test_data = circles_data["test"] # Compositional problems requiring both domains
# Load just the training or test split
train_only = load_dataset("allenai/omega-compositional", "comp_circles_algebra", split="train")
test_only = load_dataset("allenai/omega-compositional", "comp_circles_algebra", split="test")
```
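After loading a setting, it can help to check how many problems each split holds and which fields each record carries. The following is a minimal sketch, not part of the official tooling: `summarize_splits` is a hypothetical helper, and the field names `question` and `answer` in the toy data are placeholders rather than the dataset's documented schema.

```python
from typing import Any, Mapping, Sequence


def summarize_splits(
    splits: Mapping[str, Sequence[Mapping[str, Any]]],
) -> dict[str, dict[str, Any]]:
    """Return the row count and field names of each split."""
    summary = {}
    for name, rows in splits.items():
        # Read field names from the first row; an empty split has none.
        fields = sorted(rows[0].keys()) if len(rows) > 0 else []
        summary[name] = {"num_rows": len(rows), "fields": fields}
    return summary


# Toy stand-in for a loaded setting. The field names "question" and "answer"
# are placeholders, not the dataset's documented schema.
toy_setting = {
    "train": [
        {"question": "Area of a circle with radius 2?", "answer": "4*pi"},
        {"question": "Solve x + 3 = 5.", "answer": "2"},
    ],
    "test": [
        {"question": "A circle has radius r with r + 3 = 5. Find its area.", "answer": "4*pi"},
    ],
}

print(summarize_splits(toy_setting))
```

The helper only relies on `len(...)`, indexing, and `.items()`, so it should also accept the object returned by `load_dataset` for a single setting (for example `circles_data` above), assuming each row is returned as a dictionary.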
## Dataset Description
Each compositional setting combines training data from two distinct mathematical domains and provides compositional test problems that require integrating knowledge from both domains. The training problems are sourced from individual domain datasets, while test problems are specifically designed to test cross-domain reasoning capabilities.
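One way to use the split described above is to score a model only on the compositional test problems. The sketch below computes exact-match accuracy under stated assumptions: `exact_match_accuracy` and `toy_model` are hypothetical names, the `question` and `answer` field names are placeholders for whatever the dataset actually uses, and exact string matching is a simplification rather than the paper's evaluation protocol.

```python
from typing import Any, Callable, Iterable, Mapping


def exact_match_accuracy(
    examples: Iterable[Mapping[str, Any]],
    predict: Callable[[str], str],
    question_key: str = "question",  # assumed field name
    answer_key: str = "answer",      # assumed field name
) -> float:
    """Fraction of examples whose prediction equals the reference answer,
    compared as strings after trimming surrounding whitespace."""
    total = 0
    correct = 0
    for example in examples:
        prediction = predict(example[question_key])
        reference = str(example[answer_key])
        correct += int(prediction.strip() == reference.strip())
        total += 1
    # An empty split scores 0.0 instead of dividing by zero.
    return correct / total if total else 0.0


# Toy compositional test problems (placeholders, not real dataset rows).
toy_test = [
    {"question": "A circle has radius r with r + 3 = 5. Find r.", "answer": "2"},
    {"question": "A circle has diameter d with 2d = 12. Find d.", "answer": "6"},
]

# Stand-in "model": a lookup table that knows only the first answer.
canned_answers = {toy_test[0]["question"]: " 2 "}


def toy_model(question: str) -> str:
    return canned_answers.get(question, "")


accuracy = exact_match_accuracy(toy_test, toy_model)
print(accuracy)
```

To apply this to real data, pass the `test` split of a loaded setting as `examples` and set `question_key` and `answer_key` to the field names found in the dataset.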
## Citation
If you use this dataset, please cite the original work:
```bibtex
@article{sun2024omega,
title = {OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization},
author = {Yiyou Sun and Shawn Hu and Georgia Zhou and Ken Zheng and Hannaneh Hajishirzi and Nouha Dziri and Dawn Song},
journal = {arXiv preprint arXiv:2506.18880},
year = {2025},
}
```
## Related Resources
- **Explorative Dataset**: See [omega-explorative](https://huggingface.co/datasets/sunyiyou/omega-explorative) for explorative reasoning challenges
- **Transformative Dataset**: See [omega-transformative](https://huggingface.co/datasets/sunyiyou/omega-transformative) for transformative reasoning challenges
- **Paper**: See the [paper](https://arxiv.org/pdf/2506.18880) for full details
- **Code Repository**: See the generation code on [GitHub](https://github.com/sunblaze-ucb/math_ood)
Provider:
maas
Created:
2025-06-25



