Nondegeneracy/LLM-Susceptibility-theory

Source: Hugging Face. Updated: 2026-03-19. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Nondegeneracy/LLM-Susceptibility-theory
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
tags:
- LLM
- agent
- scaling-laws
- susceptibility
pretty_name: LLM Information Susceptibility Experimental Data
size_categories:
- 10K<n<100K
---

# LLM Information Susceptibility -- Experimental Data

Experimental data for the paper *"A Theory of LLM Information Susceptibility"* by Zhuo-Yang Song.

## Dataset Structure

```
.
├── results/                                                        # Domain experiment results (25 files)
│   ├── dfs_baseline.json                                           # Tetris DFS baseline (6 beam widths × 3 rewards × 40 seeds)
│   ├── llm_qwen-{7b,14b,32b,72b,3-max}_aggressive.json             # Tetris LLM (5 models)
│   ├── llm_qwen-32b_conservative.json                              # Tetris reward variant
│   ├── llm_qwen-32b_default.json                                   # Tetris reward variant
│   ├── tetris_prompt_{minimal,cot,expert}.json                     # Tetris prompt variants (3 files)
│   ├── knapsack_v2_dfs_baseline.json                               # Knapsack baseline
│   ├── knapsack_v2_llm_qwen-{7b,14b,32b,72b,3-max}_standard.json   # Knapsack LLM (5 models)
│   ├── ranking_dfs_baseline.json                                   # Ranking baseline
│   └── ranking_llm_qwen-{7b,14b,32b,72b,3-max}_standard.json       # Ranking LLM (5 models)
│
├── scaling_results/                                                # AIME scaling results (2 files)
│   ├── phase3_results.json                                         # var+var (generator = selector)
│   └── phase3_lechatelier.json                                     # var+const (all generator × selector combinations)
│
└── aime_problems_cache.json                                        # AIME 2024+2025 problem texts (60 problems)
```

## Data Schemas

### Tetris (`dfs_baseline.json`, `llm_*.json`)

| Field | Type | Description |
|-------|------|-------------|
| `agent_type` | str | `"dfs"` or `"llm"` |
| `model` | str | Model name or `"none"` |
| `seed` | int | Random seed |
| `beam_width` | int | $\mathcal{B} \in \{1,2,4,8,16,32\}$ |
| `reward_fn` | str | `"aggressive"` / `"conservative"` / `"default"` |
| `lines_cleared` | int | Performance $J$ |

### Knapsack (`knapsack_v2_*.json`)

| Field | Type | Description |
|-------|------|-------------|
| `agent_type` | str | `"dfs"` or `"llm"` |
| `model` | str | Model name |
| `seed` | int | Problem instance |
| `beam_width` | int | $\mathcal{B} \in \{1,2,4,8,16,32,64\}$ |
| `total_value` | int | Performance $J$ |

### Ranking (`ranking_*.json`)

| Field | Type | Description |
|-------|------|-------------|
| `agent_type` | str | `"dfs"` or `"llm"` |
| `model` | str | Model name |
| `snr` | int | $\mathcal{B} \in \{1,2,4,8,16,32,64,128\}$ |
| `correct` | bool | Whether rank-1 item identified correctly |

### AIME var+var (`phase3_results.json`)

| Field | Type | Description |
|-------|------|-------------|
| `method` | str | `"majority_vote"` or `"agent"` |
| `model` | str | Model (same for gen & sel) |
| `problem_type` | str | `"aime_2024"` or `"aime_2025"` |
| `k` | int | Sample count $\in \{1,3,5,9,15,17,19,21\}$ |
| `correct` | bool | Whether answer is correct |

### AIME var+const (`phase3_lechatelier.json`)

Same as above, plus `generator`, `selector`, `config` fields.

## Figure → Data Mapping

| Figure | Data files |
|--------|-----------|
| Fig. 1 (Tetris) | `dfs_baseline.json`, `llm_qwen-*_aggressive.json` |
| Fig. 2 (Robustness) | `dfs_baseline.json`, `tetris_prompt_*.json`, `llm_qwen-32b_{aggressive,conservative,default}.json` |
| Fig. 3 (Cross-domain) | `dfs_baseline.json`, `llm_*_aggressive.json`, `knapsack_v2_*.json`, `ranking_*.json`, `phase3_*.json` |
| Fig. 4 ($\alpha$ + coupling) | `phase3_results.json`, `phase3_lechatelier.json` |
| Fig. 5 (Nested vs fixed) | `phase3_results.json`, `phase3_lechatelier.json` |

## Models

| Key | Full name | Parameters |
|-----|-----------|-----------|
| `qwen-7b` | Qwen-2.5-7B-Instruct | 7B |
| `qwen-14b` | Qwen-2.5-14B-Instruct | 14B |
| `qwen-32b` | Qwen-2.5-32B-Instruct | 32B |
| `qwen-72b` | Qwen-2.5-72B-Instruct | 72B |
| `qwen3-max` | Qwen3-Max | ~236B |

## Citation

```bibtex
@article{song2025susceptibility,
  title={A Theory of LLM Information Susceptibility},
  author={Song, Zhuo-Yang},
  year={2025}
}
```
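The schemas above suggest a common first step for any of the figures: group records by the budget variable (`beam_width`, `snr`, or `k`) and average the performance field. The following is a minimal sketch of that aggregation, not the author's analysis code. It assumes each result file is a JSON array of flat records carrying the documented fields; the card does not state the top-level layout, so `load_records` checks for it. The helper names `load_records` and `mean_by_key` are hypothetical, and the demo records are synthetic.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_records(path):
    """Load one result file.

    Assumption: the top level is a JSON array of flat records.
    The dataset card does not confirm this layout.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON array, got {type(data).__name__}")
    return data


def mean_by_key(records, group_key, value_key):
    """Average `value_key` over all records that share a `group_key` value.

    Boolean fields such as `correct` are averaged as 0/1, giving an accuracy.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec[group_key]] += float(rec[value_key])
        counts[rec[group_key]] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Synthetic records following the Tetris schema; the numbers are made up.
demo = [
    {"agent_type": "dfs", "model": "none", "seed": 0, "beam_width": 1,
     "reward_fn": "aggressive", "lines_cleared": 10},
    {"agent_type": "dfs", "model": "none", "seed": 1, "beam_width": 1,
     "reward_fn": "aggressive", "lines_cleared": 14},
    {"agent_type": "dfs", "model": "none", "seed": 0, "beam_width": 2,
     "reward_fn": "aggressive", "lines_cleared": 20},
]
demo_means = mean_by_key(demo, "beam_width", "lines_cleared")
print(demo_means)
```

With real data, the same call would take the output of `load_records("results/dfs_baseline.json")`, and `mean_by_key(records, "k", "correct")` would give accuracy per sample count for the AIME files, provided the assumed layout holds.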
Provided by:
Nondegeneracy