
openeurollm/orca-agentinstruct-1M-v1-decontaminated

Hugging Face · Updated 2026-03-29 · Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/openeurollm/orca-agentinstruct-1M-v1-decontaminated
Resource summary:
---
language:
- en
license: cdla-permissive-2.0
size_categories:
- 1M<n<10M
task_categories:
- question-answering
dataset_info:
  features:
  - name: messages
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  splits:
  - name: analytical_reasoning
    num_bytes: 98028645
    num_examples: 24983
  - name: brain_teaser
    num_bytes: 119629503
    num_examples: 49962
  - name: code_
    num_bytes: 589940236
    num_examples: 99931
  - name: creative_content
    num_bytes: 280573464
    num_examples: 49961
  - name: fermi
    num_bytes: 75519245
    num_examples: 24984
  - name: follow_up
    num_bytes: 849785042
    num_examples: 98947
  - name: fs_cot_flow
    num_bytes: 103809090
    num_examples: 24883
  - name: mcq
    num_bytes: 220938985
    num_examples: 99873
  - name: open_domain_qa
    num_bytes: 588970010
    num_examples: 271991
  - name: rag
    num_bytes: 409622336
    num_examples: 49867
  - name: rc
    num_bytes: 274752766
    num_examples: 49910
  - name: struct2text_flow
    num_bytes: 240292423
    num_examples: 49955
  - name: text_classification
    num_bytes: 145309572
    num_examples: 49994
  - name: text_extraction
    num_bytes: 297532567
    num_examples: 49887
  - name: text_modification
    num_bytes: 334112924
    num_examples: 49919
  download_size: 4628816808
  dataset_size: 4628816808
configs:
- config_name: default
  data_files:
  - split: analytical_reasoning
    path: data/analytical_reasoning-*
  - split: brain_teaser
    path: data/brain_teaser-*
  - split: code_
    path: data/code_-*
  - split: creative_content
    path: data/creative_content-*
  - split: fermi
    path: data/fermi-*
  - split: follow_up
    path: data/follow_up-*
  - split: fs_cot_flow
    path: data/fs_cot_flow-*
  - split: mcq
    path: data/mcq-*
  - split: open_domain_qa
    path: data/open_domain_qa-*
  - split: rag
    path: data/rag-*
  - split: rc
    path: data/rc-*
  - split: struct2text_flow
    path: data/struct2text_flow-*
  - split: text_classification
    path: data/text_classification-*
  - split: text_extraction
    path: data/text_extraction-*
  - split: text_modification
    path: data/text_modification-*
decontamination:
  source_dataset: microsoft/orca-agentinstruct-1M-v1
  benchmarks:
  - path: HuggingFaceH4/MATH-500
    subset: default
    split: test
  - path: HuggingFaceH4/aime_2024
    subset: default
    split: train
  - path: math-ai/aime25
    subset: default
    split: test
  - path: math-ai/amc23
    subset: default
    split: test
  - path: daman1209arora/jeebench
    subset: default
    split: test
  - path: Idavidrein/gpqa
    subset: gpqa_diamond
    split: train
  - path: ali-elganzory/livecodebench-code_generation_lite
    subset: release_v6
    split: test
  - path: openai/openai_humaneval
    subset: openai_humaneval
    split: test
  - path: google-research-datasets/mbpp
    subset: full
    split: train+test+validation+prompt
  - path: google/IFEval
    subset: default
    split: train
  - path: tatsu-lab/alpaca_eval
    subset: alpaca_eval
    split: eval
  - path: lmarena-ai/arena-hard-auto
    subset: default
    split: train
  contamination_stats:
  - subset: default
    split: creative_content
    total: 50000
    removed: 39
  - subset: default
    split: text_modification
    total: 50000
    removed: 81
  - subset: default
    split: struct2text_flow
    total: 50000
    removed: 45
  - subset: default
    split: rc
    total: 50000
    removed: 90
  - subset: default
    split: rag
    total: 50000
    removed: 133
  - subset: default
    split: text_extraction
    total: 50000
    removed: 113
  - subset: default
    split: mcq
    total: 99986
    removed: 113
  - subset: default
    split: follow_up
    total: 279935
    removed: 107
  - subset: default
    split: analytical_reasoning
    total: 25000
    removed: 17
  - subset: default
    split: fermi
    total: 25000
    removed: 16
  - subset: default
    split: fs_cot_flow
    total: 25000
    removed: 117
  - subset: default
    split: code_
    total: 100000
    removed: 69
  - subset: default
    split: brain_teaser
    total: 50000
    removed: 38
  - subset: default
    split: text_classification
    total: 50000
    removed: 6
  - subset: default
    split: open_domain_qa
    total: 272370
    removed: 379
---

## Decontamination

This dataset is a decontaminated version of [microsoft/orca-agentinstruct-1M-v1](https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1).
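The settings table in this card lists an n-gram size of 8 and a match threshold of 0.5, but it does not spell out the matching rule. As a rough illustration only, the sketch below assumes whitespace tokenization, lowercasing, and one plausible reading of the threshold: a document is flagged when at least half of a benchmark item's 8-grams also occur in the document. The function names (`ngrams`, `overlap_ratio`, `is_contaminated`) are hypothetical and are not taken from the tool that produced this dataset.

```python
from typing import Iterable, Set, Tuple

NGRAM_SIZE = 8         # "N-gram size" from the card's settings table
MATCH_THRESHOLD = 0.5  # "Match threshold" from the card's settings table


def ngrams(text: str, n: int = NGRAM_SIZE) -> Set[Tuple[str, ...]]:
    """Set of word n-grams (ASSUMPTION: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    # The range is empty when the text has fewer than n tokens.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(benchmark_item: str, document: str, n: int = NGRAM_SIZE) -> float:
    """Fraction of the benchmark item's n-grams that also occur in the document."""
    bench = ngrams(benchmark_item, n)
    if not bench:
        return 0.0
    return len(bench & ngrams(document, n)) / len(bench)


def is_contaminated(
    document: str,
    benchmark_items: Iterable[str],
    n: int = NGRAM_SIZE,
    threshold: float = MATCH_THRESHOLD,
) -> bool:
    """ASSUMED rule: flag the document if any benchmark item reaches the threshold."""
    return any(overlap_ratio(item, document, n) >= threshold for item in benchmark_items)


# Toy data, invented for this example.
benchmark = [
    "what is the sum of all positive integers less than one hundred "
    "that are divisible by seven"
]
copied = "Question: " + benchmark[0] + " Explain your answer."
unrelated = "Write a haiku about electron beams in semiconductor equipment design today."

print(is_contaminated(copied, benchmark))
print(is_contaminated(unrelated, benchmark))
```

Under this reading, a document that quotes a benchmark question in full is flagged and an unrelated one is not. A real pipeline would likely normalize text more carefully and index benchmark n-grams once rather than recomputing them per document.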
### Benchmarks used

- **MATH500**: `HuggingFaceH4/MATH-500` (subset=default, split=test)
- **AIME24**: `HuggingFaceH4/aime_2024` (subset=default, split=train)
- **AIME25**: `math-ai/aime25` (subset=default, split=test)
- **AMC23**: `math-ai/amc23` (subset=default, split=test)
- **JEEBench**: `daman1209arora/jeebench` (subset=default, split=test)
- **GPQADiamond**: `Idavidrein/gpqa` (subset=gpqa_diamond, split=train)
- **LiveCodeBench**: `ali-elganzory/livecodebench-code_generation_lite` (subset=release_v6, split=test)
- **HumanEval**: `openai/openai_humaneval` (subset=openai_humaneval, split=test)
- **MBPP**: `google-research-datasets/mbpp` (subset=full, split=train+test+validation+prompt)
- **IFEval**: `google/IFEval` (subset=default, split=train)
- **AlpacaEval**: `tatsu-lab/alpaca_eval` (subset=alpaca_eval, split=eval)
- **Arena-Hard-v2.0**: `lmarena-ai/arena-hard-auto` (subset=default, split=train) (data_files=['data/arena-hard-v2.0/question.jsonl'])

### Decontamination settings

<table>
<thead>
<tr><th>Parameter</th><th>Value</th></tr>
</thead>
<tbody>
<tr><td>N-gram size</td><td>8</td></tr>
<tr><td>Match threshold</td><td>0.5</td></tr>
</tbody>
</table>

### Split and benchmark details

<table>
<thead>
<tr> <th>Subset</th> <th>Split</th> <th>Docs in split (dataset)</th> <th>Benchmark</th> <th>Contaminated (dataset)</th> <th>Contamination rate (dataset)</th> <th>Docs (benchmark)</th> <th>Contaminated (benchmark)</th> <th>Contamination rate (benchmark)</th> </tr>
</thead>
<tbody>
<tr> <td rowspan="180">default</td> <td rowspan="12">analytical_reasoning</td> <td rowspan="12">25,000</td> <td>MATH500</td> <td>16</td> <td>0.0640%</td> <td>500</td> <td>3</td> <td>0.6000%</td> </tr>
<tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr>
<tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr>
<tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr>
<tr>
<td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>0</td> <td>0.0000%</td> <td>974</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>IFEval</td> <td>1</td> <td>0.0040%</td> <td>541</td> <td>1</td> <td>0.1848%</td> </tr> <tr> <td>AlpacaEval</td> <td>0</td> <td>0.0000%</td> <td>805</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>0</td> <td>0.0000%</td> <td>750</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td rowspan="12">brain_teaser</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>34</td> <td>0.0680%</td> <td>500</td> <td>10</td> <td>2.00%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>1</td> <td>0.0020%</td> <td>40</td> <td>1</td> <td>2.50%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>2</td> <td>0.0040%</td> <td>974</td> <td>1</td> <td>0.1027%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>1</td> <td>0.0020%</td> <td>805</td> <td>1</td> <td>0.1242%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>0</td> <td>0.0000%</td> <td>750</td> 
<td>0</td> <td>0.0000%</td> </tr> <tr> <td rowspan="12">code_</td> <td rowspan="12">100,000</td> <td>MATH500</td> <td>28</td> <td>0.0280%</td> <td>500</td> <td>5</td> <td>1.00%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>8</td> <td>0.0080%</td> <td>164</td> <td>5</td> <td>3.05%</td> </tr> <tr> <td>MBPP</td> <td>30</td> <td>0.0300%</td> <td>974</td> <td>17</td> <td>1.75%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>2</td> <td>0.0020%</td> <td>805</td> <td>2</td> <td>0.2484%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>2</td> <td>0.0020%</td> <td>750</td> <td>1</td> <td>0.1333%</td> </tr> <tr> <td rowspan="12">creative_content</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>14</td> <td>0.0280%</td> <td>500</td> <td>4</td> <td>0.8000%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> 
<td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>1</td> <td>0.0020%</td> <td>164</td> <td>2</td> <td>1.22%</td> </tr> <tr> <td>MBPP</td> <td>4</td> <td>0.0080%</td> <td>974</td> <td>4</td> <td>0.4107%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>20</td> <td>0.0400%</td> <td>805</td> <td>1</td> <td>0.1242%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>0</td> <td>0.0000%</td> <td>750</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td rowspan="12">fermi</td> <td rowspan="12">25,000</td> <td>MATH500</td> <td>10</td> <td>0.0400%</td> <td>500</td> <td>2</td> <td>0.4000%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>1</td> <td>0.0040%</td> <td>515</td> <td>1</td> <td>0.1942%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>5</td> <td>0.0200%</td> <td>974</td> <td>2</td> <td>0.2053%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>0</td> <td>0.0000%</td> <td>805</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>0</td> <td>0.0000%</td> <td>750</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td rowspan="12">follow_up</td> <td rowspan="12">279,935</td> <td>MATH500</td> <td>51</td> <td>0.0182%</td> <td>500</td> <td>13</td> <td>2.60%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> 
</tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>2</td> <td>0.0007%</td> <td>515</td> <td>2</td> <td>0.3883%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>4</td> <td>0.0014%</td> <td>164</td> <td>3</td> <td>1.83%</td> </tr> <tr> <td>MBPP</td> <td>27</td> <td>0.0096%</td> <td>974</td> <td>20</td> <td>2.05%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>21</td> <td>0.0075%</td> <td>805</td> <td>4</td> <td>0.4969%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>2</td> <td>0.0007%</td> <td>750</td> <td>1</td> <td>0.1333%</td> </tr> <tr> <td rowspan="12">fs_cot_flow</td> <td rowspan="12">25,000</td> <td>MATH500</td> <td>66</td> <td>0.2640%</td> <td>500</td> <td>17</td> <td>3.40%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>1</td> <td>0.0040%</td> <td>40</td> <td>1</td> <td>2.50%</td> </tr> <tr> <td>JEEBench</td> <td>2</td> <td>0.0080%</td> <td>515</td> <td>1</td> <td>0.1942%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>1</td> <td>0.0040%</td> <td>164</td> <td>1</td> <td>0.6098%</td> </tr> <tr> <td>MBPP</td> <td>14</td> <td>0.0560%</td> <td>974</td> <td>8</td> <td>0.8214%</td> </tr> <tr> <td>IFEval</td> <td>5</td> <td>0.0200%</td> <td>541</td> 
<td>1</td> <td>0.1848%</td> </tr> <tr> <td>AlpacaEval</td> <td>25</td> <td>0.1000%</td> <td>805</td> <td>7</td> <td>0.8696%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>4</td> <td>0.0160%</td> <td>750</td> <td>2</td> <td>0.2667%</td> </tr> <tr> <td rowspan="12">mcq</td> <td rowspan="12">99,986</td> <td>MATH500</td> <td>82</td> <td>0.0820%</td> <td>500</td> <td>18</td> <td>3.60%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>1</td> <td>0.0010%</td> <td>515</td> <td>1</td> <td>0.1942%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>17</td> <td>0.0170%</td> <td>974</td> <td>6</td> <td>0.6160%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>7</td> <td>0.0070%</td> <td>805</td> <td>5</td> <td>0.6211%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>7</td> <td>0.0070%</td> <td>750</td> <td>2</td> <td>0.2667%</td> </tr> <tr> <td rowspan="12">open_domain_qa</td> <td rowspan="12">272,370</td> <td>MATH500</td> <td>283</td> <td>0.1039%</td> <td>500</td> <td>71</td> <td>14.20%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>3</td> <td>0.0011%</td> <td>30</td> <td>1</td> <td>3.33%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>2</td> <td>0.0007%</td> <td>515</td> <td>2</td> 
<td>0.3883%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>56</td> <td>0.0206%</td> <td>974</td> <td>17</td> <td>1.75%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>26</td> <td>0.0095%</td> <td>805</td> <td>4</td> <td>0.4969%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>9</td> <td>0.0033%</td> <td>750</td> <td>2</td> <td>0.2667%</td> </tr> <tr> <td rowspan="12">rag</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>61</td> <td>0.1220%</td> <td>500</td> <td>13</td> <td>2.60%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>1</td> <td>0.0020%</td> <td>30</td> <td>1</td> <td>3.33%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>2</td> <td>0.0040%</td> <td>164</td> <td>3</td> <td>1.83%</td> </tr> <tr> <td>MBPP</td> <td>40</td> <td>0.0800%</td> <td>974</td> <td>31</td> <td>3.18%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>25</td> <td>0.0500%</td> <td>805</td> <td>7</td> <td>0.8696%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>4</td> <td>0.0080%</td> <td>750</td> <td>4</td> <td>0.5333%</td> </tr> <tr> <td rowspan="12">rc</td> <td 
rowspan="12">50,000</td> <td>MATH500</td> <td>59</td> <td>0.1180%</td> <td>500</td> <td>15</td> <td>3.00%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>1</td> <td>0.0020%</td> <td>515</td> <td>1</td> <td>0.1942%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>16</td> <td>0.0320%</td> <td>974</td> <td>6</td> <td>0.6160%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>12</td> <td>0.0240%</td> <td>805</td> <td>3</td> <td>0.3727%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>2</td> <td>0.0040%</td> <td>750</td> <td>1</td> <td>0.1333%</td> </tr> <tr> <td rowspan="12">struct2text_flow</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>29</td> <td>0.0580%</td> <td>500</td> <td>4</td> <td>0.8000%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> 
<td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>1</td> <td>0.0020%</td> <td>974</td> <td>2</td> <td>0.2053%</td> </tr> <tr> <td>IFEval</td> <td>2</td> <td>0.0040%</td> <td>541</td> <td>1</td> <td>0.1848%</td> </tr> <tr> <td>AlpacaEval</td> <td>13</td> <td>0.0260%</td> <td>805</td> <td>1</td> <td>0.1242%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>0</td> <td>0.0000%</td> <td>750</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td rowspan="12">text_classification</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>3</td> <td>0.0060%</td> <td>500</td> <td>2</td> <td>0.4000%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>0</td> <td>0.0000%</td> <td>974</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>IFEval</td> <td>0</td> <td>0.0000%</td> <td>541</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AlpacaEval</td> <td>3</td> <td>0.0060%</td> <td>805</td> <td>1</td> <td>0.1242%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>0</td> <td>0.0000%</td> <td>750</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td rowspan="12">text_extraction</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>68</td> <td>0.1360%</td> <td>500</td> <td>23</td> <td>4.60%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> 
<td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>0</td> <td>0.0000%</td> <td>515</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>3</td> <td>0.0060%</td> <td>164</td> <td>4</td> <td>2.44%</td> </tr> <tr> <td>MBPP</td> <td>34</td> <td>0.0680%</td> <td>974</td> <td>24</td> <td>2.46%</td> </tr> <tr> <td>IFEval</td> <td>2</td> <td>0.0040%</td> <td>541</td> <td>1</td> <td>0.1848%</td> </tr> <tr> <td>AlpacaEval</td> <td>4</td> <td>0.0080%</td> <td>805</td> <td>3</td> <td>0.3727%</td> </tr> <tr> <td>Arena-Hard-v2.0</td> <td>2</td> <td>0.0040%</td> <td>750</td> <td>1</td> <td>0.1333%</td> </tr> <tr> <td rowspan="12">text_modification</td> <td rowspan="12">50,000</td> <td>MATH500</td> <td>54</td> <td>0.1080%</td> <td>500</td> <td>14</td> <td>2.80%</td> </tr> <tr> <td>AIME24</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AIME25</td> <td>0</td> <td>0.0000%</td> <td>30</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>AMC23</td> <td>0</td> <td>0.0000%</td> <td>40</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>JEEBench</td> <td>1</td> <td>0.0020%</td> <td>515</td> <td>1</td> <td>0.1942%</td> </tr> <tr> <td>GPQADiamond</td> <td>0</td> <td>0.0000%</td> <td>198</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>LiveCodeBench</td> <td>0</td> <td>0.0000%</td> <td>1055</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>HumanEval</td> <td>0</td> <td>0.0000%</td> <td>164</td> <td>0</td> <td>0.0000%</td> </tr> <tr> <td>MBPP</td> <td>8</td> <td>0.0160%</td> <td>974</td> <td>7</td> <td>0.7187%</td> </tr> <tr> <td>IFEval</td> <td>1</td> <td>0.0020%</td> <td>541</td> <td>1</td> <td>0.1848%</td> </tr> <tr> 
<td>AlpacaEval</td> <td>12</td> <td>0.0240%</td> <td>805</td> <td>1</td> <td>0.1242%</td> </tr>
<tr> <td>Arena-Hard-v2.0</td> <td>6</td> <td>0.0120%</td> <td>750</td> <td>4</td> <td>0.5333%</td> </tr>
</tbody>
</table>

### Dataset summary

<table>
<thead>
<tr><th>Metric</th><th>Value</th></tr>
</thead>
<tbody>
<tr><td>Total documents in dataset</td><td>1,227,291</td></tr>
<tr><td>Contaminated documents (removed)</td><td>1,363</td></tr>
<tr><td>Documents after decontamination</td><td>1,225,928</td></tr>
<tr><td>Contamination rate (dataset)</td><td>0.1111%</td></tr>
</tbody>
</table>

---

### Dataset Card

This dataset is a fully synthetic set of instruction pairs in which both the prompts and the responses were generated with the AgentInstruct framework. AgentInstruct is an extensible agentic framework for synthetic data generation.

This dataset contains ~1 million instruction pairs generated by AgentInstruct, using only raw text content publicly available on the Web as seeds. The data covers different capabilities, such as text editing, creative writing, coding, and reading comprehension. The dataset can be used for instruction tuning of any base LLM.

Using the full AgentInstruct data (a superset of this dataset with ~25M instruction pairs), we post-trained Mistral-7b. When comparing the resulting model Orca-3-Mistral to Mistral-7b-Instruct (which uses the same base model), we observe significant improvements across many benchmarks: for example, a 40% improvement on AGIEval, 19% on MMLU, 54% on GSM8K, 38% on BBH and 45% on AlpacaEval.

Please refer to AgentInstruct: Toward Generative Teaching with Agentic Flows for further details.

### Dataset Sources

**Repository:** microsoft/orca-agentinstruct-1M-v1

**Paper:** AgentInstruct: Toward Generative Teaching with Agentic Flows

### Direct Use

This is a general dataset created by the AgentInstruct methodology. It contains diverse synthetic data.
This could be used for research on instruction tuning and on synthetic data use and generation.

### Out-of-Scope Use

This dataset is being shared for research purposes. For training models to perform real-world tasks, we recommend further testing and validation where needed.

This dataset is not intended for use in educational systems or organizations, or for use in health systems.

### Dataset Structure

### Data Instances

A typical data entry in the dataset consists of a conversation, i.e. a sequence of questions and their corresponding answers. Below is an example from the dataset:

```json
{ "messages": [ { "role": "system", "content": "" }, { "role": "user", "content": "Argue the importance of having a strong understanding of physics and mechanics in the design of electron beam systems for semiconductor equipment." }, { "role": "assistant", "content": "Having a strong understanding of physics and mechanics is crucial in the design of electron beam systems for semiconductor equipment. This is because electron beam systems rely heavily on the principles of physics and mechanics to function effectively and efficiently.\n\nFirstly, a strong understanding of physics is necessary to design the electron gun, which is the heart of the electron beam system. The electron gun generates and accelerates the electrons that are used to create the beam. The design of the electron gun must take into account the properties of electrons, such as their charge, mass, and velocity, as well as the principles of electromagnetism and quantum mechanics. Without a thorough understanding of these principles, it would be impossible to design an electron gun that can generate and accelerate electrons with the required energy and precision.\n\nSecondly, a strong understanding of mechanics is essential in the design of the electron beam system's optics. 
The optics are responsible for shaping and focusing the electron beam, which is critical for achieving high-resolution imaging and precise material processing. The design of the optics must take into account the principles of optics, such as refraction, reflection, and diffraction, as well as the mechanical properties of the components, such as their stiffness, thermal expansion, and vibration resistance. Without a thorough understanding of these principles, it would be impossible to design optics that can shape and focus the electron beam with the required precision and stability.\n\nIn conclusion, a strong understanding of physics and mechanics is essential in the design of electron beam systems for semiconductor equipment. Without this understanding, it would be impossible to design electron guns and optics that can generate and shape the electron beam with the required precision and stability. Therefore, designers of electron beam systems must have a solid foundation in physics and mechanics to ensure that their designs are effective and efficient." }, { "role": "user", "content": "Could you perhaps rephrase your entire explanation using only haiku poetry? It would certainly give your answer a more artistic touch, don't you think?" }, { "role": "assistant", "content": "Electron streams flow,\nPhysics guides their path and speed,\nSemiconduct's soul.\n\nElectron gun's heart,\nQuantum dance, charged particles,\nPrecision's fine art.\n\nOptics shape the beam,\nMechanics' firm hand guides light,\nFocused like a dream.\n\nIn the silicon dance,\nPhysics, mechanics embrace,\nAdvance our expanse.\n\nDesign's core must know,\nNature's laws to harness force,\nFor tech to grow, glow." }, { "role": "user", "content": "Just to make sure I've got it right, your haikus suggest that the electron gun and optics are like artists, requiring precision and a deep understanding of natural laws to create their masterpiece?" 
}, { "role": "assistant", "content": "Indeed, that's a poetic interpretation. The haikus suggest that the electron gun (the creator of the electron beam) and the optics (which shape and focus the beam) require a precise understanding of physics and mechanics, much like artists need a deep understanding of their tools and materials to create a masterpiece. The natural laws are the guidelines within which they must work to achieve the desired outcome in semiconductor equipment design." }, { "role": "user", "content": "I'd like to understand more about the 'precision's fine art' you mentioned. Can you explain how precision is achieved in the design of the electron gun and optics?" }, { "role": "assistant", "content": "Certainly! Achieving precision in the design of the electron gun and optics in electron beam systems involves several key factors:\n\n1. **Electron Gun Design:**\n - **Vacuum Environment:** The electron gun operates in a high vacuum to prevent electron scattering by air molecules, ensuring a precise and concentrated beam.\n - **Cathode Material:** The choice of cathode material affects the emission efficiency and stability of the electron beam. Materials like tungsten or lanthanum hexaboride are often used for their high melting points and low work functions.\n - **Anode Configuration:** The anode accelerates the electrons and its design, including the shape and voltage applied, determines the beam's energy and focus.\n - **Electromagnetic Lenses:** These lenses control the focus of the electron beam, and their design must account for spherical and chromatic aberrations to maintain precision.\n\n2. **Optics Design:**\n - **Electromagnetic Lenses:** Just like in the electron gun, these lenses in the optical column are used to focus and deflect the electron beam with high precision. 
The design and alignment of these lenses are critical for achieving the desired spot size and shape on the target.\n - **Stigmators:** These are used to correct for astigmatism in the electron beam, ensuring that the beam remains symmetrical and focused over a range of operating conditions.\n - **Apertures:** Placed at strategic points in the optical column, apertures help to shape the beam and eliminate electrons that stray from the desired path, improving resolution and reducing aberrations.\n - **Thermal Stability:** Precision in electron beam systems also depends on maintaining thermal stability to prevent drift or deformation of the optical components.\n\n3. **Control Systems:**\n - **Feedback Loops:** Precision is maintained through feedback systems that monitor the beam's characteristics and make real-time adjustments to the electron gun and optics to correct any deviations.\n - **Vibration Isolation:** The entire system must be isolated from external vibrations that could disturb the beam's path, often using pneumatic or magnetic isolation systems.\n\n4. **Material Science:**\n - **Material Properties:** The materials used in the construction of the electron gun and optics must be chosen for their stability under the operating conditions, including resistance to thermal expansion and electromagnetic interference.\n\n5. **Computational Modeling:**\n - **Simulation Software:** Advanced software is used to model the electron beam's behavior and interactions with the system's components, allowing for the optimization of the design before physical construction.\n\nIn summary, 'precision's fine art' in the context of electron beam systems refers to the meticulous design and engineering of the electron gun and optics, which are achieved through a deep understanding of physics and materials science, as well as the implementation of sophisticated control and feedback mechanisms." 
} ] }
```

### Data Fields

The dataset comprises the following fields:

messages: String representing a conversation in OpenAI format.

```json
[
  {"role": "system", "content": "...."},
  {"role": "user", "content": "...."},
  {"role": "assistant", "content": "...."},
  {"role": "user", "content": "...."},
  {"role": "assistant", "content": "...."},
  ....
]
```

To read the conversation, use `json.loads()`.

### Data Splits

train

### Dataset Creation

### Source Data

Please refer to AgentInstruct: Toward Generative Teaching with Agentic Flows for further details.

### Data Collection and Processing

Please refer to AgentInstruct: Toward Generative Teaching with Agentic Flows for details about the dataset construction.

### Who are the source data producers?

Microsoft

### Annotation process

We generate questions and answers using Azure GPT-4.

### Personal and Sensitive Information

None

### Bias, Risks, and Limitations

• This dataset is in English.
• The dataset inherits the biases, errors, and omissions known to exist in the data used for seed sources and the models used for data generation.
• This dataset is not intended to represent any specific domain, and contains generic data. However, the AgentInstruct methodology, which was used to create this dataset, can be used to generate high-quality domain-specific data, which can be used to fine-tune any existing model for a specific domain.
• The dataset is synthetically generated and hence may contain inaccuracies that do not accurately reflect real-world phenomena.
• The synthetic nature of this dataset may limit its ability to generalize to real-world cases.
• The data is intended for research and experimentation for model training and synthetic data generation.
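The Data Fields section above says the `messages` field can be read with `json.loads()`. A minimal sketch of that step follows. It assumes the field may arrive either as a JSON string (as the card describes) or as an already-decoded list of `role`/`content` records; the helper names `parse_messages` and `count_turns` and the sample record are hypothetical and only serve as illustration.

```python
import json


def parse_messages(raw):
    """Return a list of {"role", "content"} dicts from a `messages` field.

    Assumption: `raw` is either a JSON string or an already-decoded list.
    """
    messages = json.loads(raw) if isinstance(raw, str) else list(raw)
    for message in messages:
        # Every turn is expected to carry both keys named in the card.
        if not {"role", "content"} <= set(message):
            raise ValueError(f"malformed turn: {message!r}")
    return messages


def count_turns(messages):
    """Count how many turns each role contributes to a conversation."""
    counts = {}
    for message in messages:
        counts[message["role"]] = counts.get(message["role"], 0) + 1
    return counts


# Made-up record in the same shape as the format example above.
example = json.dumps([
    {"role": "system", "content": ""},
    {"role": "user", "content": "What does an electron gun do?"},
    {"role": "assistant", "content": "It generates and accelerates electrons."},
])

print(count_turns(parse_messages(example)))
```

Accepting both a string and a list keeps the helper usable whichever way a given copy of the data stores the field.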
### Citation

If you find this work useful in your method, you can cite the paper as follows:

    @misc{
      title={AgentInstruct: Toward Generative Teaching with Agentic Flows},
      author={Arindam Mitra, Luciano Del Corro, Guoqing Zheng, Shweti Mahajan, Dany Rouhana, Andres Codas, Yadong Lu, Wei-ge Chen, Olga Vrousgos, Corby Rosset, Fillipe Silva, Hamed Khanpour, Yash Lara, Ahmed Awadallah},
      year={2024},
      eprint={2407.03502},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }

### Dataset Card Contact

orca-team@microsoft.com

## Basic Dataset Information

- Language: English
- License: CDLA Permissive 2.0 (`cdla-permissive-2.0`)
- Sample size: 1M < number of samples < 10M
- Task category: question answering (`question-answering`)

### Dataset Details

#### Data Features

The data contains a `messages` field, a list of conversation turns. Each turn has two subfields:

- `content`: string, the text of the turn
- `role`: string, the role of the speaker

#### Data Splits

Details of each split:

| Split name | Size (bytes) | Number of examples |
|------------------------|----------------|----------|
| analytical_reasoning | 98028645 | 24983 |
| brain_teaser | 119629503 | 49962 |
| code_ | 589940236 | 99931 |
| creative_content | 280573464 | 49961 |
| fermi | 75519245 | 24984 |
| follow_up | 849785042 | 98947 |
| fs_cot_flow | 103809090 | 24883 |
| mcq | 220938985 | 99873 |
| open_domain_qa | 588970010 | 271991 |
| rag | 409622336 | 49867 |
| rc | 274752766 | 49910 |
| struct2text_flow | 240292423 | 49955 |
| text_classification | 145309572 | 49994 |
| text_extraction | 297532567 | 49887 |
| text_modification | 334112924 | 49919 |

Total download size: 4628816808 bytes. Total dataset size: 4628816808 bytes.

### Configuration

Data file paths under the default configuration (`default`):

- analytical_reasoning: data/analytical_reasoning-*
- brain_teaser: data/brain_teaser-*
- code_: data/code_-*
- creative_content: data/creative_content-*
- fermi: data/fermi-*
- follow_up: data/follow_up-*
- fs_cot_flow: data/fs_cot_flow-*
- mcq: data/mcq-*
- open_domain_qa: data/open_domain_qa-*
- rag: data/rag-*
- rc: data/rc-*
- struct2text_flow: data/struct2text_flow-*
- text_classification: data/text_classification-*
- text_extraction: data/text_extraction-*
- text_modification: data/text_modification-*

## Decontamination

This dataset is a decontaminated version of Microsoft's `microsoft/orca-agentinstruct-1M-v1`.

### Benchmarks Used

1. **MATH500**: `HuggingFaceH4/MATH-500` (subset=default, split=test)
2. **AIME24** (American Invitational Mathematics Examination 2024): `HuggingFaceH4/aime_2024` (subset=default, split=train)
3. **AIME25** (American Invitational Mathematics Examination 2025): `math-ai/aime25` (subset=default, split=test)
4. **AMC23** (American Mathematics Competitions 2023): `math-ai/amc23` (subset=default, split=test)
5. **JEEBench**: `daman1209arora/jeebench` (subset=default, split=test)
6. **GPQADiamond**: `Idavidrein/gpqa` (subset=gpqa_diamond, split=train)
7. **LiveCodeBench**: `ali-elganzory/livecodebench-code_generation_lite` (subset=release_v6, split=test)
8. **HumanEval**: `openai_humaneval` from OpenAI (`openai`) (subset=openai_humaneval, split=test)
9. **MBPP**: `mbpp` from Google Research Datasets (`google-research-datasets`) (subset=full, split=train+test+validation+prompt)
10. **IFEval**: `IFEval` from Google (`google`) (subset=default, split=train)
11. **AlpacaEval**: `tatsu-lab/alpaca_eval` (subset=alpaca_eval, split=eval)
12. **Arena-Hard-v2.0**: `lmarena-ai/arena-hard-auto` (subset=default, split=train)

### Decontamination Parameters

| Parameter | Value |
|--------------------|--------|
| N-gram size | 8 |
| Match threshold | 0.5 |

### Per-Split Benchmark Contamination

Contamination statistics for each split against each benchmark:

| Subset | Split | Samples in dataset | Benchmark | Contaminated samples in dataset | Dataset contamination rate | Samples in benchmark | Contaminated samples in benchmark | Benchmark contamination rate |
|--------|------------------------|----------------|------------------|--------------------|--------------|------------------|----------------------|------------------|
| default | analytical_reasoning | 25000 | MATH500 | 16 | 0.0640% | 500 | 3 | 0.6000% |
| default | analytical_reasoning | 25000 | AIME24 | 0 | 0.0000% | 30 | 0 | 0.0000% |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |

## Dataset Summary

| Metric | Value |
|--------------------------|--------------------|
| Total samples in dataset | 1,227,291 |
| Contaminated samples removed | 1,363 |
| Samples remaining after decontamination | 1,225,928 |
| Overall dataset contamination rate | 0.1111% |

## Dataset Card

This dataset is a fully synthetic set of instruction pairs in which both the prompts and the responses were synthesized with the AgentInstruct framework. AgentInstruct is an extensible agentic framework for synthetic data generation.

The dataset contains about 1 million instruction pairs generated by AgentInstruct, using only raw text content publicly available on the web as seeds. It covers a range of capabilities, including text editing, creative writing, coding, and reading comprehension, and can be used for instruction tuning of any base large language model (LLM).

Using the full AgentInstruct dataset (a superset of this dataset with about 25 million instruction pairs), we post-trained Mistral-7b. Comparing the resulting model, Orca-3-Mistral, with Mistral-7b-Instruct, which uses the same base model, we observed significant improvements across many benchmarks: for example, a 40% improvement on AGIEval, 19% on MMLU, 54% on GSM8K, 38% on BBH, and 45% on AlpacaEval. For more details, please refer to the paper AgentInstruct: Toward Generative Teaching with Agentic Flows.

### Dataset Sources

- **Repository**: `microsoft/orca-agentinstruct-1M-v1`
- **Paper**: AgentInstruct: Toward Generative Teaching with Agentic Flows

### Direct Use

This is a general-purpose dataset built with the AgentInstruct methodology. It contains diverse synthetic data and can be used for research on instruction tuning and on the use and generation of synthetic data.

### Out-of-Scope Use

This dataset is shared for research purposes only. If it is used to train models for deployed real-world tasks, additional testing and validation are recommended as needed. The dataset must not be used in scenarios involving educational institutions or systems, or health systems.

### Dataset Structure

#### Data Instances

A typical entry in the dataset is a conversation, that is, a sequence of questions and their corresponding responses. For example:

    {
      "messages": [
        { "role": "system", "content": "" },
        { "role": "user", "content": "Argue the importance of having a strong understanding of physics and mechanics in the design of electron beam systems for semiconductor equipment." },
        { "role": "assistant", "content": "Having a strong understanding of physics and mechanics is crucial in the design of electron beam systems for semiconductor equipment. This is because electron beam systems rely heavily on the principles of physics and mechanics to function effectively and efficiently. Firstly, a strong understanding of physics is necessary to design the electron gun, which is the heart of the electron beam system. The electron gun generates and accelerates the electrons that are used to create the beam. The design of the electron gun must take into account the properties of electrons, such as their charge, mass, and velocity, as well as the principles of electromagnetism and quantum mechanics. Without a thorough understanding of these principles, it would be impossible to design an electron gun that can generate and accelerate electrons with the required energy and precision. Secondly, a strong understanding of mechanics is essential in the design of the electron beam system's optics. The optics are responsible for shaping and focusing the electron beam, which is critical for achieving high-resolution imaging and precise material processing. The design of the optics must take into account the principles of optics, such as refraction, reflection, and diffraction, as well as the mechanical properties of the components, such as their stiffness, thermal expansion, and vibration resistance. Without a thorough understanding of these principles, it would be impossible to design optics that can shape and focus the electron beam with the required precision and stability. In conclusion, a strong understanding of physics and mechanics is essential in the design of electron beam systems for semiconductor equipment. Without this understanding, it would be impossible to design electron guns and optics that can generate and shape the electron beam with the required precision and stability. Therefore, designers of electron beam systems must have a solid foundation in physics and mechanics to ensure that their designs are effective and efficient." }
      ]
    }

#### Data Fields

The dataset comprises the following fields:

`messages`: a string representing a conversation in OpenAI format, in the following form:

    [
      {"role": "system", "content": "...."},
      {"role": "user", "content": "...."},
      {"role": "assistant", "content": "...."},
      {"role": "user", "content": "...."},
      {"role": "assistant", "content": "...."},
      ....
    ]

The field can be parsed with `json.loads()`.

#### Data Splits

Only a training split (train).

### Dataset Creation

#### Source Data

For details on the source data, please refer to AgentInstruct: Toward Generative Teaching with Agentic Flows.

#### Data Collection and Processing

For details, please refer to AgentInstruct: Toward Generative Teaching with Agentic Flows.

#### Source Data Producers

Microsoft

#### Annotation Process

We generate questions and responses using Azure GPT-4.

#### Personal and Sensitive Information

No personal or sensitive information.

### Bias, Risks, and Limitations

1. This dataset contains English content only.
2. This dataset inherits the biases, errors, and omissions already present in the seed data and in the generating models.
3. This dataset was not built for a specific domain and contains only generic data. However, the AgentInstruct methodology used to create it can be used to generate high-quality domain-specific synthetic data for fine-tuning existing models on a specific domain.
4. This dataset is synthetically generated and may therefore contain inaccuracies that do not reflect real-world phenomena.
5. The synthetic nature of this dataset may limit its ability to generalize to real-world cases.
6. This dataset is intended only for research and experimentation related to model training and synthetic data generation.

### Citation

If your research uses this dataset, please cite the following paper:

    @misc{
      title={AgentInstruct: Toward Generative Teaching with Agentic Flows},
      author={Arindam Mitra, Luciano Del Corro, Guoqing Zheng, Shweti Mahajan, Dany Rouhana, Andres Codas, Yadong Lu, Wei-ge Chen, Olga Vrousgos, Corby Rosset, Fillipe Silva, Hamed Khanpour, Yash Lara, Ahmed Awadallah},
      year={2024},
      eprint={2407.03502},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }

### Dataset Card Contact

orca-team@microsoft.com
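The decontamination parameters listed in this card (n-gram size 8, threshold 0.5) can be illustrated with a small sketch. The card does not describe the tokenizer or the exact matching rule, so everything below is an assumption: text is lowercased and split on whitespace, and a sample counts as contaminated when at least half of a benchmark item's 8-grams also occur in the sample. The function names are hypothetical and do not belong to any particular decontamination tool. The last helper only reproduces the arithmetic behind the overall contamination rate in the summary table.

```python
def ngrams(text, n=8):
    """Set of word n-grams, using lowercase whitespace tokens (assumed tokenizer)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(sample, benchmark_item, n=8):
    """Fraction of the benchmark item's n-grams that also appear in the sample."""
    bench = ngrams(benchmark_item, n)
    if not bench:
        # Benchmark items shorter than n tokens cannot be matched this way.
        return 0.0
    return len(bench & ngrams(sample, n)) / len(bench)


def is_contaminated(sample, benchmark_items, n=8, threshold=0.5):
    """Flag a sample whose overlap with any benchmark item reaches the threshold."""
    return any(overlap_ratio(sample, item, n) >= threshold for item in benchmark_items)


def contamination_rate(removed, total):
    """Removed samples as a percentage of all samples."""
    return 100.0 * removed / total


# Made-up benchmark item and samples, for illustration only.
benchmark = ["what is the sum of the first ten positive integers"]
leaked = "Question: what is the sum of the first ten positive integers please"
clean = "write a haiku about electron beams"

print(is_contaminated(leaked, benchmark))  # True
print(is_contaminated(clean, benchmark))   # False
print(round(contamination_rate(1363, 1227291), 4))  # 0.1111
```

Measuring overlap relative to the benchmark item rather than the training sample is one possible reading of the threshold; a real pipeline might normalize differently or use a tokenizer other than whitespace splitting.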
Provider: openeurollm