
bertybaums/marc2

Source: Hugging Face. Updated 2026-03-26; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/bertybaums/marc2
Resource description:
---
license: mit
task_categories:
- text-generation
- visual-question-answering
language:
- en
tags:
- arc-agi
- abstract-reasoning
- metaphor
- figurative-language
- marc
pretty_name: "MARC2: Metaphor Abstraction and Reasoning Corpus v2"
size_categories:
- 1K<n<10K
---

# MARC2: Metaphor Abstraction and Reasoning Corpus v2

MARC2 extends the [MARC-from-LARC](https://huggingface.co/datasets/bertybaums/marc-from-larc) methodology to the [ARC-AGI2](https://github.com/arcprize/ARC-AGI-2) dataset. It provides a corpus of figurative language puzzles where metaphorical descriptions help AI models solve abstract reasoning tasks they cannot solve from examples alone.

## The MARC Property

A task has the **MARC property** (for a given model) when:

1. **Examples alone fail**: the model cannot solve the task from input/output examples
2. **Figurative description alone fails**: the metaphor is too ambiguous without examples
3. **Figurative + examples succeeds**: the metaphor triggers an "aha" moment when combined with examples

## Pipeline

| Phase | Result |
|-------|--------|
| Claude Opus 4.6 solves ARC-AGI2 training tasks | 865/1000 (86.5%) |
| Distill reasoning into language-complete descriptions | 865/865 (100%) |
| Validate descriptions (fresh solver, no examples) | 791/865 (91.4%) |
| Baseline testing on gpt-oss-120b (3 conditions) | 2,373 trials |
| Task classification | 350 MARC-eligible |
| Generate figurative descriptions | 350 original + 1,560 alternatives |
| MARC verification | **104 puzzles, 824 MARC-valid clues** |

## Dataset Configs

| Config | Rows | Description |
|--------|------|-------------|
| `tasks` | 1,120 | ARC-AGI2 task metadata |
| `solve_trials` | 1,007 | Claude's solving attempts with reasoning traces |
| `descriptions` | 865 | Language-complete see/do/grid descriptions |
| `task_subsets` | 791 | Per-model task classification |
| `figurative_descriptions` | 1,910 | Figurative clues (original + 15 domain alternatives) |
| `baseline_trials` | 2,373 | Subject model baseline results |
| `figurative_trials` | 8,645 | Subject model figurative trial results |

## Source Domains (15)

biology, cooking, music, sports, weather, architecture, warfare, theater, gardening, astronomy, ocean/sailing, electronics, mythology, dance, geology

## Key Findings

- **Language descriptions dramatically outperform examples**: 58.2% vs 25.8% accuracy for gpt-oss-120b
- **Opacity-guided metaphor generation** improved MARC yield from 29.7% to 47.6%
- **824 MARC-valid figurative clues** across 104 puzzles and 15 source domains
- Average 6.9 MARC-valid variants per puzzle

## Links

- **Code**: [github.com/bertybaums/marc2](https://github.com/bertybaums/marc2)
- **Parent project**: [MARC-from-LARC](https://github.com/bertybaums/marc-from-larc)
- **ARC-AGI2**: [github.com/arcprize/ARC-AGI-2](https://github.com/arcprize/ARC-AGI-2)

## Citation

```bibtex
@dataset{baum2026marc2,
  title={MARC2: Metaphor Abstraction and Reasoning Corpus v2},
  author={Baum, Bert},
  year={2026},
  url={https://huggingface.co/datasets/bertybaums/marc2},
  doi={10.5281/zenodo.19241782}
}
```

## Date

March 26, 2026
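
## Example: Checking the MARC Property

The three conditions listed under "The MARC Property" amount to a simple boolean test over one model's pass/fail outcomes on one task. The sketch below is illustrative only: the `ConditionResults` class and its field names are hypothetical and are not taken from the dataset's schema or the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionResults:
    """Pass/fail outcome of one model on one task under three prompt conditions.

    Field names are hypothetical; they are not the dataset's column names.
    """
    examples_only: bool             # solved from input/output examples alone
    figurative_only: bool           # solved from the figurative description alone
    figurative_plus_examples: bool  # solved with both combined


def has_marc_property(r: ConditionResults) -> bool:
    """Return True only when both single-source conditions fail
    and the combined condition succeeds."""
    return (
        not r.examples_only
        and not r.figurative_only
        and r.figurative_plus_examples
    )


if __name__ == "__main__":
    # Both single-source conditions fail, the combination succeeds.
    print(has_marc_property(ConditionResults(False, False, True)))
    # The model already solves the task from examples, so the metaphor adds nothing.
    print(has_marc_property(ConditionResults(True, False, True)))
```

Because the property is defined per model, the same task can satisfy it for one model and not for another; any real check would be run separately on each subject model's trial results.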
Provider:
bertybaums