open-llm-leaderboard-old/details_gpt2

Source: Hugging Face. Updated 2024-03-23. Indexed 2024-06-22.
Download link:
https://hf-mirror.com/datasets/open-llm-leaderboard-old/details_gpt2
Resource description:
--- pretty_name: Evaluation run of gpt2 dataset_summary: "Dataset automatically created during the evaluation run of model\ \ [gpt2](https://huggingface.co/gpt2) on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).\n\ \nThe dataset is composed of 65 configurations, each one corresponding to one of the\ \ evaluated tasks.\n\nThe dataset has been created from 25 run(s). Each run can be\ \ found as a specific split in each configuration, the split being named using the\ \ timestamp of the run. The \"train\" split is always pointing to the latest results.\n\ \nAn additional configuration \"results\" stores all the aggregated results of the\ \ run (and is used to compute and display the aggregated metrics on the [Open LLM\ \ Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)).\n\ \nTo load the details from a run, you can for instance do the following:\n```python\n\ from datasets import load_dataset\ndata = load_dataset(\"open-llm-leaderboard/details_gpt2\"\ ,\n\t\"harness_winogrande_5\",\n\tsplit=\"train\")\n```\n\n## Latest results\n\n\ These are the [latest results from run 2024-03-23T06:18:16.565546](https://huggingface.co/datasets/open-llm-leaderboard/details_gpt2/blob/main/results_2024-03-23T06-18-16.565546.json) (note\ \ that there might be results for other tasks in the repos if successive evals didn't\ \ cover the same tasks.
You find each in the results and the \"latest\" split for\ \ each eval):\n\n```python\n{\n \"all\": {\n \"acc\": 0.25780579051672486,\n\ \ \"acc_stderr\": 0.030658881019520554,\n \"acc_norm\": 0.2586547713391113,\n\ \ \"acc_norm_stderr\": 0.031431381356225356,\n \"mc1\": 0.22766217870257038,\n\ \ \"mc1_stderr\": 0.01467925503211107,\n \"mc2\": 0.4069116400376613,\n\ \ \"mc2_stderr\": 0.014934250122346554\n },\n \"harness|arc:challenge|25\"\ : {\n \"acc\": 0.197098976109215,\n \"acc_stderr\": 0.011625047669880633,\n\ \ \"acc_norm\": 0.22013651877133106,\n \"acc_norm_stderr\": 0.01210812488346097\n\ \ },\n \"harness|hellaswag|10\": {\n \"acc\": 0.29267078271260705,\n\ \ \"acc_stderr\": 0.004540586983229993,\n \"acc_norm\": 0.3152758414658435,\n\ \ \"acc_norm_stderr\": 0.0046367607625228515\n },\n \"harness|hendrycksTest-abstract_algebra|5\"\ : {\n \"acc\": 0.21,\n \"acc_stderr\": 0.040936018074033256,\n \ \ \"acc_norm\": 0.21,\n \"acc_norm_stderr\": 0.040936018074033256\n \ \ },\n \"harness|hendrycksTest-anatomy|5\": {\n \"acc\": 0.22962962962962963,\n\ \ \"acc_stderr\": 0.03633384414073462,\n \"acc_norm\": 0.22962962962962963,\n\ \ \"acc_norm_stderr\": 0.03633384414073462\n },\n \"harness|hendrycksTest-astronomy|5\"\ : {\n \"acc\": 0.16447368421052633,\n \"acc_stderr\": 0.0301675334686327,\n\ \ \"acc_norm\": 0.16447368421052633,\n \"acc_norm_stderr\": 0.0301675334686327\n\ \ },\n \"harness|hendrycksTest-business_ethics|5\": {\n \"acc\": 0.17,\n\ \ \"acc_stderr\": 0.0377525168068637,\n \"acc_norm\": 0.17,\n \ \ \"acc_norm_stderr\": 0.0377525168068637\n },\n \"harness|hendrycksTest-clinical_knowledge|5\"\ : {\n \"acc\": 0.24150943396226415,\n \"acc_stderr\": 0.026341480371118345,\n\ \ \"acc_norm\": 0.24150943396226415,\n \"acc_norm_stderr\": 0.026341480371118345\n\ \ },\n \"harness|hendrycksTest-college_biology|5\": {\n \"acc\": 0.2222222222222222,\n\ \ \"acc_stderr\": 0.03476590104304134,\n \"acc_norm\": 0.2222222222222222,\n\ \ \"acc_norm_stderr\": 
0.03476590104304134\n },\n \"harness|hendrycksTest-college_chemistry|5\"\ : {\n \"acc\": 0.2,\n \"acc_stderr\": 0.04020151261036846,\n \ \ \"acc_norm\": 0.2,\n \"acc_norm_stderr\": 0.04020151261036846\n },\n\ \ \"harness|hendrycksTest-college_computer_science|5\": {\n \"acc\": 0.28,\n\ \ \"acc_stderr\": 0.04512608598542128,\n \"acc_norm\": 0.28,\n \ \ \"acc_norm_stderr\": 0.04512608598542128\n },\n \"harness|hendrycksTest-college_mathematics|5\"\ : {\n \"acc\": 0.3,\n \"acc_stderr\": 0.046056618647183814,\n \ \ \"acc_norm\": 0.3,\n \"acc_norm_stderr\": 0.046056618647183814\n \ \ },\n \"harness|hendrycksTest-college_medicine|5\": {\n \"acc\": 0.24277456647398843,\n\ \ \"acc_stderr\": 0.0326926380614177,\n \"acc_norm\": 0.24277456647398843,\n\ \ \"acc_norm_stderr\": 0.0326926380614177\n },\n \"harness|hendrycksTest-college_physics|5\"\ : {\n \"acc\": 0.2549019607843137,\n \"acc_stderr\": 0.043364327079931785,\n\ \ \"acc_norm\": 0.2549019607843137,\n \"acc_norm_stderr\": 0.043364327079931785\n\ \ },\n \"harness|hendrycksTest-computer_security|5\": {\n \"acc\":\ \ 0.16,\n \"acc_stderr\": 0.03684529491774709,\n \"acc_norm\": 0.16,\n\ \ \"acc_norm_stderr\": 0.03684529491774709\n },\n \"harness|hendrycksTest-conceptual_physics|5\"\ : {\n \"acc\": 0.2723404255319149,\n \"acc_stderr\": 0.029101290698386698,\n\ \ \"acc_norm\": 0.2723404255319149,\n \"acc_norm_stderr\": 0.029101290698386698\n\ \ },\n \"harness|hendrycksTest-econometrics|5\": {\n \"acc\": 0.2631578947368421,\n\ \ \"acc_stderr\": 0.041424397194893624,\n \"acc_norm\": 0.2631578947368421,\n\ \ \"acc_norm_stderr\": 0.041424397194893624\n },\n \"harness|hendrycksTest-electrical_engineering|5\"\ : {\n \"acc\": 0.2413793103448276,\n \"acc_stderr\": 0.03565998174135302,\n\ \ \"acc_norm\": 0.2413793103448276,\n \"acc_norm_stderr\": 0.03565998174135302\n\ \ },\n \"harness|hendrycksTest-elementary_mathematics|5\": {\n \"acc\"\ : 0.25396825396825395,\n \"acc_stderr\": 0.022418042891113942,\n \"\ acc_norm\": 
0.25396825396825395,\n \"acc_norm_stderr\": 0.022418042891113942\n\ \ },\n \"harness|hendrycksTest-formal_logic|5\": {\n \"acc\": 0.14285714285714285,\n\ \ \"acc_stderr\": 0.0312984318574381,\n \"acc_norm\": 0.14285714285714285,\n\ \ \"acc_norm_stderr\": 0.0312984318574381\n },\n \"harness|hendrycksTest-global_facts|5\"\ : {\n \"acc\": 0.15,\n \"acc_stderr\": 0.035887028128263686,\n \ \ \"acc_norm\": 0.15,\n \"acc_norm_stderr\": 0.035887028128263686\n \ \ },\n \"harness|hendrycksTest-high_school_biology|5\": {\n \"acc\"\ : 0.2967741935483871,\n \"acc_stderr\": 0.025988500792411894,\n \"\ acc_norm\": 0.2967741935483871,\n \"acc_norm_stderr\": 0.025988500792411894\n\ \ },\n \"harness|hendrycksTest-high_school_chemistry|5\": {\n \"acc\"\ : 0.270935960591133,\n \"acc_stderr\": 0.03127090713297698,\n \"acc_norm\"\ : 0.270935960591133,\n \"acc_norm_stderr\": 0.03127090713297698\n },\n\ \ \"harness|hendrycksTest-high_school_computer_science|5\": {\n \"acc\"\ : 0.26,\n \"acc_stderr\": 0.04408440022768079,\n \"acc_norm\": 0.26,\n\ \ \"acc_norm_stderr\": 0.04408440022768079\n },\n \"harness|hendrycksTest-high_school_european_history|5\"\ : {\n \"acc\": 0.21818181818181817,\n \"acc_stderr\": 0.03225078108306289,\n\ \ \"acc_norm\": 0.21818181818181817,\n \"acc_norm_stderr\": 0.03225078108306289\n\ \ },\n \"harness|hendrycksTest-high_school_geography|5\": {\n \"acc\"\ : 0.35353535353535354,\n \"acc_stderr\": 0.03406086723547153,\n \"\ acc_norm\": 0.35353535353535354,\n \"acc_norm_stderr\": 0.03406086723547153\n\ \ },\n \"harness|hendrycksTest-high_school_government_and_politics|5\": {\n\ \ \"acc\": 0.36787564766839376,\n \"acc_stderr\": 0.03480175668466036,\n\ \ \"acc_norm\": 0.36787564766839376,\n \"acc_norm_stderr\": 0.03480175668466036\n\ \ },\n \"harness|hendrycksTest-high_school_macroeconomics|5\": {\n \ \ \"acc\": 0.2717948717948718,\n \"acc_stderr\": 0.022556551010132358,\n\ \ \"acc_norm\": 0.2717948717948718,\n \"acc_norm_stderr\": 0.022556551010132358\n\ \ },\n 
\"harness|hendrycksTest-high_school_mathematics|5\": {\n \"\ acc\": 0.26296296296296295,\n \"acc_stderr\": 0.026842057873833706,\n \ \ \"acc_norm\": 0.26296296296296295,\n \"acc_norm_stderr\": 0.026842057873833706\n\ \ },\n \"harness|hendrycksTest-high_school_microeconomics|5\": {\n \ \ \"acc\": 0.28991596638655465,\n \"acc_stderr\": 0.029472485833136098,\n\ \ \"acc_norm\": 0.28991596638655465,\n \"acc_norm_stderr\": 0.029472485833136098\n\ \ },\n \"harness|hendrycksTest-high_school_physics|5\": {\n \"acc\"\ : 0.271523178807947,\n \"acc_stderr\": 0.03631329803969654,\n \"acc_norm\"\ : 0.271523178807947,\n \"acc_norm_stderr\": 0.03631329803969654\n },\n\ \ \"harness|hendrycksTest-high_school_psychology|5\": {\n \"acc\": 0.3486238532110092,\n\ \ \"acc_stderr\": 0.020431254090714328,\n \"acc_norm\": 0.3486238532110092,\n\ \ \"acc_norm_stderr\": 0.020431254090714328\n },\n \"harness|hendrycksTest-high_school_statistics|5\"\ : {\n \"acc\": 0.4722222222222222,\n \"acc_stderr\": 0.0340470532865388,\n\ \ \"acc_norm\": 0.4722222222222222,\n \"acc_norm_stderr\": 0.0340470532865388\n\ \ },\n \"harness|hendrycksTest-high_school_us_history|5\": {\n \"acc\"\ : 0.25,\n \"acc_stderr\": 0.03039153369274154,\n \"acc_norm\": 0.25,\n\ \ \"acc_norm_stderr\": 0.03039153369274154\n },\n \"harness|hendrycksTest-high_school_world_history|5\"\ : {\n \"acc\": 0.24472573839662448,\n \"acc_stderr\": 0.027985699387036416,\n\ \ \"acc_norm\": 0.24472573839662448,\n \"acc_norm_stderr\": 0.027985699387036416\n\ \ },\n \"harness|hendrycksTest-human_aging|5\": {\n \"acc\": 0.2914798206278027,\n\ \ \"acc_stderr\": 0.030500283176545923,\n \"acc_norm\": 0.2914798206278027,\n\ \ \"acc_norm_stderr\": 0.030500283176545923\n },\n \"harness|hendrycksTest-human_sexuality|5\"\ : {\n \"acc\": 0.26717557251908397,\n \"acc_stderr\": 0.038808483010823944,\n\ \ \"acc_norm\": 0.26717557251908397,\n \"acc_norm_stderr\": 0.038808483010823944\n\ \ },\n \"harness|hendrycksTest-international_law|5\": {\n \"acc\":\ \ 
0.32231404958677684,\n \"acc_stderr\": 0.04266416363352168,\n \"\ acc_norm\": 0.32231404958677684,\n \"acc_norm_stderr\": 0.04266416363352168\n\ \ },\n \"harness|hendrycksTest-jurisprudence|5\": {\n \"acc\": 0.21296296296296297,\n\ \ \"acc_stderr\": 0.03957835471980981,\n \"acc_norm\": 0.21296296296296297,\n\ \ \"acc_norm_stderr\": 0.03957835471980981\n },\n \"harness|hendrycksTest-logical_fallacies|5\"\ : {\n \"acc\": 0.26380368098159507,\n \"acc_stderr\": 0.03462419931615623,\n\ \ \"acc_norm\": 0.26380368098159507,\n \"acc_norm_stderr\": 0.03462419931615623\n\ \ },\n \"harness|hendrycksTest-machine_learning|5\": {\n \"acc\": 0.25892857142857145,\n\ \ \"acc_stderr\": 0.041577515398656284,\n \"acc_norm\": 0.25892857142857145,\n\ \ \"acc_norm_stderr\": 0.041577515398656284\n },\n \"harness|hendrycksTest-management|5\"\ : {\n \"acc\": 0.34951456310679613,\n \"acc_stderr\": 0.04721188506097173,\n\ \ \"acc_norm\": 0.34951456310679613,\n \"acc_norm_stderr\": 0.04721188506097173\n\ \ },\n \"harness|hendrycksTest-marketing|5\": {\n \"acc\": 0.1794871794871795,\n\ \ \"acc_stderr\": 0.025140935950335418,\n \"acc_norm\": 0.1794871794871795,\n\ \ \"acc_norm_stderr\": 0.025140935950335418\n },\n \"harness|hendrycksTest-medical_genetics|5\"\ : {\n \"acc\": 0.27,\n \"acc_stderr\": 0.044619604333847394,\n \ \ \"acc_norm\": 0.27,\n \"acc_norm_stderr\": 0.044619604333847394\n \ \ },\n \"harness|hendrycksTest-miscellaneous|5\": {\n \"acc\": 0.21583652618135377,\n\ \ \"acc_stderr\": 0.014711684386139958,\n \"acc_norm\": 0.21583652618135377,\n\ \ \"acc_norm_stderr\": 0.014711684386139958\n },\n \"harness|hendrycksTest-moral_disputes|5\"\ : {\n \"acc\": 0.24277456647398843,\n \"acc_stderr\": 0.0230836585869842,\n\ \ \"acc_norm\": 0.24277456647398843,\n \"acc_norm_stderr\": 0.0230836585869842\n\ \ },\n \"harness|hendrycksTest-moral_scenarios|5\": {\n \"acc\": 0.2424581005586592,\n\ \ \"acc_stderr\": 0.014333522059217889,\n \"acc_norm\": 0.2424581005586592,\n\ \ \"acc_norm_stderr\": 
0.014333522059217889\n },\n \"harness|hendrycksTest-nutrition|5\"\ : {\n \"acc\": 0.21895424836601307,\n \"acc_stderr\": 0.02367908986180772,\n\ \ \"acc_norm\": 0.21895424836601307,\n \"acc_norm_stderr\": 0.02367908986180772\n\ \ },\n \"harness|hendrycksTest-philosophy|5\": {\n \"acc\": 0.24758842443729903,\n\ \ \"acc_stderr\": 0.024513879973621967,\n \"acc_norm\": 0.24758842443729903,\n\ \ \"acc_norm_stderr\": 0.024513879973621967\n },\n \"harness|hendrycksTest-prehistory|5\"\ : {\n \"acc\": 0.22530864197530864,\n \"acc_stderr\": 0.023246202647819746,\n\ \ \"acc_norm\": 0.22530864197530864,\n \"acc_norm_stderr\": 0.023246202647819746\n\ \ },\n \"harness|hendrycksTest-professional_accounting|5\": {\n \"\ acc\": 0.26595744680851063,\n \"acc_stderr\": 0.026358065698880592,\n \ \ \"acc_norm\": 0.26595744680851063,\n \"acc_norm_stderr\": 0.026358065698880592\n\ \ },\n \"harness|hendrycksTest-professional_law|5\": {\n \"acc\": 0.2457627118644068,\n\ \ \"acc_stderr\": 0.010996156635142692,\n \"acc_norm\": 0.2457627118644068,\n\ \ \"acc_norm_stderr\": 0.010996156635142692\n },\n \"harness|hendrycksTest-professional_medicine|5\"\ : {\n \"acc\": 0.44485294117647056,\n \"acc_stderr\": 0.030187532060329376,\n\ \ \"acc_norm\": 0.44485294117647056,\n \"acc_norm_stderr\": 0.030187532060329376\n\ \ },\n \"harness|hendrycksTest-professional_psychology|5\": {\n \"\ acc\": 0.26143790849673204,\n \"acc_stderr\": 0.017776947157528034,\n \ \ \"acc_norm\": 0.26143790849673204,\n \"acc_norm_stderr\": 0.017776947157528034\n\ \ },\n \"harness|hendrycksTest-public_relations|5\": {\n \"acc\": 0.21818181818181817,\n\ \ \"acc_stderr\": 0.03955932861795833,\n \"acc_norm\": 0.21818181818181817,\n\ \ \"acc_norm_stderr\": 0.03955932861795833\n },\n \"harness|hendrycksTest-security_studies|5\"\ : {\n \"acc\": 0.4,\n \"acc_stderr\": 0.031362502409358936,\n \ \ \"acc_norm\": 0.4,\n \"acc_norm_stderr\": 0.031362502409358936\n \ \ },\n \"harness|hendrycksTest-sociology|5\": {\n \"acc\": 
0.22885572139303484,\n\ \ \"acc_stderr\": 0.029705284056772426,\n \"acc_norm\": 0.22885572139303484,\n\ \ \"acc_norm_stderr\": 0.029705284056772426\n },\n \"harness|hendrycksTest-us_foreign_policy|5\"\ : {\n \"acc\": 0.27,\n \"acc_stderr\": 0.04461960433384739,\n \ \ \"acc_norm\": 0.27,\n \"acc_norm_stderr\": 0.04461960433384739\n \ \ },\n \"harness|hendrycksTest-virology|5\": {\n \"acc\": 0.1927710843373494,\n\ \ \"acc_stderr\": 0.030709824050565274,\n \"acc_norm\": 0.1927710843373494,\n\ \ \"acc_norm_stderr\": 0.030709824050565274\n },\n \"harness|hendrycksTest-world_religions|5\"\ : {\n \"acc\": 0.21052631578947367,\n \"acc_stderr\": 0.0312678171466318,\n\ \ \"acc_norm\": 0.21052631578947367,\n \"acc_norm_stderr\": 0.0312678171466318\n\ \ },\n \"harness|truthfulqa:mc|0\": {\n \"mc1\": 0.22766217870257038,\n\ \ \"mc1_stderr\": 0.01467925503211107,\n \"mc2\": 0.4069116400376613,\n\ \ \"mc2_stderr\": 0.014934250122346554\n },\n \"harness|winogrande|5\"\ : {\n \"acc\": 0.5043409629044988,\n \"acc_stderr\": 0.014051956064076887\n\ \ },\n \"harness|gsm8k|5\": {\n \"acc\": 0.006823351023502654,\n \ \ \"acc_stderr\": 0.0022675371022544736\n }\n}\n```" repo_url: https://huggingface.co/gpt2 leaderboard_url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard point_of_contact: clementine@hf.co configs: - config_name: harness_arc_challenge_25 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|arc:challenge|25_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|arc:challenge|25_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|arc:challenge|25_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|arc:challenge|25_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|arc:challenge|25_2024-01-10T14-42-55.873500.parquet' - split: 
2024_01_18T14_12_21.064569 path: - '**/details_harness|arc:challenge|25_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|arc:challenge|25_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|arc:challenge|25_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|arc:challenge|25_2024-03-23T06-18-16.565546.parquet' - config_name: harness_drop_0 data_files: - split: 2023_09_14T13_54_21.687636 path: - '**/details_harness|drop|0_2023-09-14T13-54-21.687636.parquet' - split: 2023_09_15T12_28_23.937147 path: - '**/details_harness|drop|0_2023-09-15T12-28-23.937147.parquet' - split: 2023_09_15T12_47_31.231445 path: - '**/details_harness|drop|0_2023-09-15T12-47-31.231445.parquet' - split: latest path: - '**/details_harness|drop|0_2023-09-15T12-47-31.231445.parquet' - config_name: harness_drop_3 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|drop|3_2023-11-21T18-07-07.067275.parquet' - split: 2023_11_29T12_47_35.686694 path: - '**/details_harness|drop|3_2023-11-29T12-47-35.686694.parquet' - split: 2023_11_29T12_58_42.860611 path: - '**/details_harness|drop|3_2023-11-29T12-58-42.860611.parquet' - split: latest path: - '**/details_harness|drop|3_2023-11-29T12-58-42.860611.parquet' - config_name: harness_gsm8k_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|gsm8k|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_11_29T12_47_35.686694 path: - '**/details_harness|gsm8k|5_2023-11-29T12-47-35.686694.parquet' - split: 2023_11_29T12_58_42.860611 path: - '**/details_harness|gsm8k|5_2023-11-29T12-58-42.860611.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|gsm8k|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|gsm8k|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - 
'**/details_harness|gsm8k|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|gsm8k|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|gsm8k|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|gsm8k|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|gsm8k|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|gsm8k|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hellaswag_10 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hellaswag|10_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hellaswag|10_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hellaswag|10_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hellaswag|10_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hellaswag|10_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hellaswag|10_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hellaswag|10_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hellaswag|10_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hellaswag|10_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-11-21T18-07-07.067275.parquet' - 
'**/details_harness|hendrycksTest-business_ethics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-11-21T18-07-07.067275.parquet' - 
'**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-management|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-11-21T18-07-07.067275.parquet' - 
'**/details_harness|hendrycksTest-philosophy|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-11-21T18-07-07.067275.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-12-16T13-32-55.332102.parquet' - 
'**/details_harness|hendrycksTest-college_medicine|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-12-16T13-32-55.332102.parquet' - 
'**/details_harness|hendrycksTest-high_school_statistics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-management|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-12-16T13-32-55.332102.parquet' - 
'**/details_harness|hendrycksTest-public_relations|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-12-16T13-32-55.332102.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-12-19T14-19-42.718116.parquet' - 
'**/details_harness|hendrycksTest-elementary_mathematics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-12-19T14-19-42.718116.parquet' - 
'**/details_harness|hendrycksTest-jurisprudence|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-management|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-12-19T14-19-42.718116.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 
path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-12-23T15-28-59.872701.parquet' - 
'**/details_harness|hendrycksTest-high_school_european_history|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-management|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-12-23T15-28-59.872701.parquet' - 
'**/details_harness|hendrycksTest-moral_disputes|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-12-23T15-28-59.872701.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2024-01-10T14-42-55.873500.parquet' - 
'**/details_harness|hendrycksTest-college_chemistry|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-01-10T14-42-55.873500.parquet' - 
'**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-international_law|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-management|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-marketing|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2024-01-10T14-42-55.873500.parquet' - 
'**/details_harness|hendrycksTest-professional_accounting|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-sociology|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-virology|5_2024-01-10T14-42-55.873500.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2024-01-18T14-12-21.064569.parquet' - 
'**/details_harness|hendrycksTest-computer_security|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-01-18T14-12-21.064569.parquet' - 
'**/details_harness|hendrycksTest-high_school_world_history|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-international_law|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-management|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-marketing|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2024-01-18T14-12-21.064569.parquet' - 
'**/details_harness|hendrycksTest-sociology|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-virology|5_2024-01-18T14-12-21.064569.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2024-01-22T13-56-20.291666.parquet' - 
'**/details_harness|hendrycksTest-global_facts|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-international_law|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-01-22T13-56-20.291666.parquet' - 
'**/details_harness|hendrycksTest-machine_learning|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-management|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-marketing|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-sociology|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-virology|5_2024-01-22T13-56-20.291666.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2024-03-23T06-18-16.565546.parquet' 
- '**/details_harness|hendrycksTest-astronomy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2024-03-23T06-18-16.565546.parquet' - 
'**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-international_law|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-management|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-marketing|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-03-23T06-18-16.565546.parquet' - 
'**/details_harness|hendrycksTest-nutrition|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-sociology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-virology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2024-03-23T06-18-16.565546.parquet' - 
'**/details_harness|hendrycksTest-college_mathematics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2024-03-23T06-18-16.565546.parquet' - 
'**/details_harness|hendrycksTest-high_school_psychology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-international_law|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-management|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-marketing|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2024-03-23T06-18-16.565546.parquet' - 
'**/details_harness|hendrycksTest-professional_psychology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-sociology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-virology|5_2024-03-23T06-18-16.565546.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_abstract_algebra_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_anatomy_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - 
'**/details_harness|hendrycksTest-anatomy|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-anatomy|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-anatomy|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-anatomy|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-anatomy|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-anatomy|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_astronomy_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-astronomy|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-astronomy|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-astronomy|5_2024-01-22T13-56-20.291666.parquet' - 
split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-astronomy|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-astronomy|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_business_ethics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-business_ethics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-business_ethics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-business_ethics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-business_ethics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-business_ethics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_clinical_knowledge_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - 
'**/details_harness|hendrycksTest-clinical_knowledge|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_college_biology_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-college_biology|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-college_biology|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-college_biology|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-college_biology|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_biology|5_2024-03-23T06-18-16.565546.parquet' - config_name: 
harness_hendrycksTest_college_chemistry_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_chemistry|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_college_computer_science_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - 
'**/details_harness|hendrycksTest-college_computer_science|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_computer_science|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_college_mathematics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_mathematics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_college_medicine_5 data_files: - split: 
2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-college_medicine|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-college_medicine|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-college_medicine|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-college_medicine|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_medicine|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_college_physics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-college_physics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - 
'**/details_harness|hendrycksTest-college_physics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-college_physics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-college_physics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_physics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_computer_security_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-computer_security|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-computer_security|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-computer_security|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-computer_security|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-computer_security|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_conceptual_physics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - 
'**/details_harness|hendrycksTest-conceptual_physics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_econometrics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-econometrics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-econometrics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-econometrics|5_2024-01-22T13-56-20.291666.parquet' - split: 
2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-econometrics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-econometrics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_electrical_engineering_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_elementary_mathematics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - 
'**/details_harness|hendrycksTest-elementary_mathematics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_formal_logic_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-formal_logic|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-formal_logic|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-formal_logic|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - 
'**/details_harness|hendrycksTest-formal_logic|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-formal_logic|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_global_facts_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-global_facts|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-global_facts|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-global_facts|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-global_facts|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-global_facts|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_biology_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - 
'**/details_harness|hendrycksTest-high_school_biology|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_biology|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_chemistry_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-high_school_chemistry|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_computer_science_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_european_history_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-12-19T14-19-42.718116.parquet' - split: 
2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_geography_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - 
'**/details_harness|hendrycksTest-high_school_geography|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_geography|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_government_and_politics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_macroeconomics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - 
'**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_mathematics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-01-18T14-12-21.064569.parquet' - 
split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_microeconomics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_physics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - 
'**/details_harness|hendrycksTest-high_school_physics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_physics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_psychology_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - 
'**/details_harness|hendrycksTest-high_school_psychology|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_statistics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_us_history_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - 
'**/details_harness|hendrycksTest-high_school_us_history|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_high_school_world_history_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-01-18T14-12-21.064569.parquet' - split: 
2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_human_aging_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-human_aging|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-human_aging|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-human_aging|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-human_aging|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-human_aging|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_human_sexuality_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - 
'**/details_harness|hendrycksTest-human_sexuality|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-human_sexuality|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_international_law_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-international_law|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-international_law|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-international_law|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-international_law|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-international_law|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-international_law|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-international_law|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - 
'**/details_harness|hendrycksTest-international_law|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-international_law|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_jurisprudence_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-jurisprudence|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_logical_fallacies_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - 
'**/details_harness|hendrycksTest-logical_fallacies|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_machine_learning_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-machine_learning|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-machine_learning|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-machine_learning|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-machine_learning|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-machine_learning|5_2024-03-23T06-18-16.565546.parquet' - config_name: 
harness_hendrycksTest_management_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-management|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-management|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-management|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-management|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-management|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-management|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-management|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-management|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-management|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_marketing_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-marketing|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-marketing|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-marketing|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-marketing|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-marketing|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-marketing|5_2024-01-18T14-12-21.064569.parquet' - split: 
2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-marketing|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-marketing|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-marketing|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_medical_genetics_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-medical_genetics|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_miscellaneous_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - 
'**/details_harness|hendrycksTest-miscellaneous|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-miscellaneous|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_moral_disputes_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-moral_disputes|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_moral_scenarios_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_nutrition_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - 
'**/details_harness|hendrycksTest-nutrition|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-nutrition|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-nutrition|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-nutrition|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-nutrition|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_philosophy_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-philosophy|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-philosophy|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-philosophy|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-philosophy|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-philosophy|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_prehistory_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - 
'**/details_harness|hendrycksTest-prehistory|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-prehistory|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-prehistory|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-prehistory|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-prehistory|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-prehistory|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_professional_accounting_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - 
'**/details_harness|hendrycksTest-professional_accounting|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_accounting|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_professional_law_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-professional_law|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-professional_law|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-professional_law|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-professional_law|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_law|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_professional_medicine_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - 
'**/details_harness|hendrycksTest-professional_medicine|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_medicine|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_professional_psychology_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: 
- '**/details_harness|hendrycksTest-professional_psychology|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_psychology|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_public_relations_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-public_relations|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-public_relations|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-public_relations|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-public_relations|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-public_relations|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_security_studies_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-security_studies|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-security_studies|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-security_studies|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - 
'**/details_harness|hendrycksTest-security_studies|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-security_studies|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-security_studies|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-security_studies|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-security_studies|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-security_studies|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_sociology_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-sociology|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-sociology|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-sociology|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-sociology|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-sociology|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-sociology|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-sociology|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-sociology|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-sociology|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_us_foreign_policy_5 data_files: - split: 
2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_virology_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-virology|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-virology|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-virology|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-virology|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-virology|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-virology|5_2024-01-18T14-12-21.064569.parquet' - split: 
2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-virology|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-virology|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-virology|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_hendrycksTest_world_religions_5 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|hendrycksTest-world_religions|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|hendrycksTest-world_religions|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|hendrycksTest-world_religions|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|hendrycksTest-world_religions|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|hendrycksTest-world_religions|5_2024-03-23T06-18-16.565546.parquet' - config_name: harness_truthfulqa_mc_0 data_files: - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|truthfulqa:mc|0_2023-11-21T18-07-07.067275.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|truthfulqa:mc|0_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: - '**/details_harness|truthfulqa:mc|0_2023-12-19T14-19-42.718116.parquet' - split: 
2023_12_23T15_28_59.872701 path: - '**/details_harness|truthfulqa:mc|0_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|truthfulqa:mc|0_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|truthfulqa:mc|0_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|truthfulqa:mc|0_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|truthfulqa:mc|0_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|truthfulqa:mc|0_2024-03-23T06-18-16.565546.parquet' - config_name: harness_winogrande_5 data_files: - split: 2023_09_06T15_19_52.414673 path: - '**/details_harness|winogrande|5_2023-09-06T15-19-52.414673.parquet' - split: 2023_09_06T15_22_24.734466 path: - '**/details_harness|winogrande|5_2023-09-06T15-22-24.734466.parquet' - split: 2023_09_06T15_24_04.768979 path: - '**/details_harness|winogrande|5_2023-09-06T15-24-04.768979.parquet' - split: 2023_09_07T12_01_51.839651 path: - '**/details_harness|winogrande|5_2023-09-07T12-01-51.839651.parquet' - split: 2023_09_07T12_04_01.189528 path: - '**/details_harness|winogrande|5_2023-09-07T12-04-01.189528.parquet' - split: 2023_09_07T12_08_17.821371 path: - '**/details_harness|winogrande|5_2023-09-07T12-08-17.821371.parquet' - split: 2023_09_07T12_10_30.286469 path: - '**/details_harness|winogrande|5_2023-09-07T12-10-30.286469.parquet' - split: 2023_11_21T18_07_07.067275 path: - '**/details_harness|winogrande|5_2023-11-21T18-07-07.067275.parquet' - split: 2023_11_29T12_47_35.686694 path: - '**/details_harness|winogrande|5_2023-11-29T12-47-35.686694.parquet' - split: 2023_11_29T12_58_42.860611 path: - '**/details_harness|winogrande|5_2023-11-29T12-58-42.860611.parquet' - split: 2023_12_16T13_32_55.332102 path: - '**/details_harness|winogrande|5_2023-12-16T13-32-55.332102.parquet' - split: 2023_12_19T14_19_42.718116 path: 
- '**/details_harness|winogrande|5_2023-12-19T14-19-42.718116.parquet' - split: 2023_12_23T15_28_59.872701 path: - '**/details_harness|winogrande|5_2023-12-23T15-28-59.872701.parquet' - split: 2024_01_10T14_42_55.873500 path: - '**/details_harness|winogrande|5_2024-01-10T14-42-55.873500.parquet' - split: 2024_01_18T14_12_21.064569 path: - '**/details_harness|winogrande|5_2024-01-18T14-12-21.064569.parquet' - split: 2024_01_22T13_56_20.291666 path: - '**/details_harness|winogrande|5_2024-01-22T13-56-20.291666.parquet' - split: 2024_03_23T06_18_16.565546 path: - '**/details_harness|winogrande|5_2024-03-23T06-18-16.565546.parquet' - split: latest path: - '**/details_harness|winogrande|5_2024-03-23T06-18-16.565546.parquet' - config_name: results data_files: - split: 2023_09_06T12_19_07.283399 path: - results_2023-09-06T12-19-07.283399.parquet - split: 2023_09_06T12_21_24.071294 path: - results_2023-09-06T12-21-24.071294.parquet - split: 2023_09_06T12_24_13.323279 path: - results_2023-09-06T12-24-13.323279.parquet - split: 2023_09_06T13_26_17.619860 path: - results_2023-09-06T13-26-17.619860.parquet - split: 2023_09_06T15_15_44.379880 path: - results_2023-09-06T15-15-44.379880.parquet - split: 2023_09_06T15_19_52.414673 path: - results_2023-09-06T15-19-52.414673.parquet - split: 2023_09_06T15_22_24.734466 path: - results_2023-09-06T15-22-24.734466.parquet - split: 2023_09_06T15_24_04.768979 path: - results_2023-09-06T15-24-04.768979.parquet - split: 2023_09_07T12_01_51.839651 path: - results_2023-09-07T12-01-51.839651.parquet - split: 2023_09_07T12_04_01.189528 path: - results_2023-09-07T12-04-01.189528.parquet - split: 2023_09_07T12_08_17.821371 path: - results_2023-09-07T12-08-17.821371.parquet - split: 2023_09_07T12_10_30.286469 path: - results_2023-09-07T12-10-30.286469.parquet - split: 2023_09_14T13_54_21.687636 path: - results_2023-09-14T13-54-21.687636.parquet - split: 2023_09_15T12_28_23.937147 path: - results_2023-09-15T12-28-23.937147.parquet - split: 
2023_09_15T12_47_31.231445 path: - results_2023-09-15T12-47-31.231445.parquet - split: 2023_11_21T18_07_07.067275 path: - results_2023-11-21T18-07-07.067275.parquet - split: 2023_11_29T12_47_35.686694 path: - results_2023-11-29T12-47-35.686694.parquet - split: 2023_11_29T12_58_42.860611 path: - results_2023-11-29T12-58-42.860611.parquet - split: 2023_12_16T13_32_55.332102 path: - results_2023-12-16T13-32-55.332102.parquet - split: 2023_12_19T14_19_42.718116 path: - results_2023-12-19T14-19-42.718116.parquet - split: 2023_12_23T15_28_59.872701 path: - results_2023-12-23T15-28-59.872701.parquet - split: 2024_01_10T14_42_55.873500 path: - results_2024-01-10T14-42-55.873500.parquet - split: 2024_01_18T14_12_21.064569 path: - results_2024-01-18T14-12-21.064569.parquet - split: 2024_01_22T13_56_20.291666 path: - results_2024-01-22T13-56-20.291666.parquet - split: 2024_03_23T06_18_16.565546 path: - results_2024-03-23T06-18-16.565546.parquet - split: latest path: - results_2024-03-23T06-18-16.565546.parquet
---

# Dataset Card for Evaluation run of gpt2

<!-- Provide a quick summary of the dataset. -->

This dataset was automatically created during the evaluation run of model [gpt2](https://huggingface.co/gpt2) on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

The dataset is composed of 65 configurations, each one corresponding to one of the evaluated tasks.

The dataset has been created from 25 runs. Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split is described as always pointing to the latest results, although the configurations above name that split "latest".

An additional configuration, "results", stores all the aggregated results of the run (and is used to compute and display the aggregated metrics on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)).

To load the details from a run, you can for instance do the following:

```python
from datasets import load_dataset

# Load the per-sample details of the 5-shot Winogrande evaluation.
# The configurations in this card list a "latest" split; try that name if "train" is not found.
data = load_dataset("open-llm-leaderboard/details_gpt2",
	"harness_winogrande_5",
	split="train")
```

## Latest results

These are the [latest results from run 2024-03-23T06:18:16.565546](https://huggingface.co/datasets/open-llm-leaderboard/details_gpt2/blob/main/results_2024-03-23T06-18-16.565546.json) (note that there might be results for other tasks in the repo if successive evals didn't cover the same tasks; you can find each of them in the "results" configuration and in the "latest" split of each eval):

```python
{ "all": { "acc": 0.25780579051672486, "acc_stderr": 0.030658881019520554, "acc_norm": 0.2586547713391113, "acc_norm_stderr": 0.031431381356225356, "mc1": 0.22766217870257038, "mc1_stderr": 0.01467925503211107, "mc2": 0.4069116400376613, "mc2_stderr": 0.014934250122346554 }, "harness|arc:challenge|25": { "acc": 0.197098976109215, "acc_stderr": 0.011625047669880633, "acc_norm": 0.22013651877133106, "acc_norm_stderr": 0.01210812488346097 }, "harness|hellaswag|10": { "acc": 0.29267078271260705, "acc_stderr": 0.004540586983229993, "acc_norm": 0.3152758414658435, "acc_norm_stderr": 0.0046367607625228515 }, "harness|hendrycksTest-abstract_algebra|5": { "acc": 0.21, "acc_stderr": 0.040936018074033256, "acc_norm": 0.21, "acc_norm_stderr": 0.040936018074033256 }, "harness|hendrycksTest-anatomy|5": { "acc": 0.22962962962962963, "acc_stderr": 0.03633384414073462, "acc_norm": 0.22962962962962963, "acc_norm_stderr": 0.03633384414073462 }, "harness|hendrycksTest-astronomy|5": { "acc": 0.16447368421052633, "acc_stderr": 0.0301675334686327, "acc_norm": 0.16447368421052633, "acc_norm_stderr": 0.0301675334686327 }, "harness|hendrycksTest-business_ethics|5": { "acc": 0.17, "acc_stderr": 0.0377525168068637, "acc_norm": 0.17, "acc_norm_stderr": 0.0377525168068637 }, "harness|hendrycksTest-clinical_knowledge|5": { "acc": 0.24150943396226415, "acc_stderr": 0.026341480371118345, "acc_norm":
0.24150943396226415, "acc_norm_stderr": 0.026341480371118345 }, "harness|hendrycksTest-college_biology|5": { "acc": 0.2222222222222222, "acc_stderr": 0.03476590104304134, "acc_norm": 0.2222222222222222, "acc_norm_stderr": 0.03476590104304134 }, "harness|hendrycksTest-college_chemistry|5": { "acc": 0.2, "acc_stderr": 0.04020151261036846, "acc_norm": 0.2, "acc_norm_stderr": 0.04020151261036846 }, "harness|hendrycksTest-college_computer_science|5": { "acc": 0.28, "acc_stderr": 0.04512608598542128, "acc_norm": 0.28, "acc_norm_stderr": 0.04512608598542128 }, "harness|hendrycksTest-college_mathematics|5": { "acc": 0.3, "acc_stderr": 0.046056618647183814, "acc_norm": 0.3, "acc_norm_stderr": 0.046056618647183814 }, "harness|hendrycksTest-college_medicine|5": { "acc": 0.24277456647398843, "acc_stderr": 0.0326926380614177, "acc_norm": 0.24277456647398843, "acc_norm_stderr": 0.0326926380614177 }, "harness|hendrycksTest-college_physics|5": { "acc": 0.2549019607843137, "acc_stderr": 0.043364327079931785, "acc_norm": 0.2549019607843137, "acc_norm_stderr": 0.043364327079931785 }, "harness|hendrycksTest-computer_security|5": { "acc": 0.16, "acc_stderr": 0.03684529491774709, "acc_norm": 0.16, "acc_norm_stderr": 0.03684529491774709 }, "harness|hendrycksTest-conceptual_physics|5": { "acc": 0.2723404255319149, "acc_stderr": 0.029101290698386698, "acc_norm": 0.2723404255319149, "acc_norm_stderr": 0.029101290698386698 }, "harness|hendrycksTest-econometrics|5": { "acc": 0.2631578947368421, "acc_stderr": 0.041424397194893624, "acc_norm": 0.2631578947368421, "acc_norm_stderr": 0.041424397194893624 }, "harness|hendrycksTest-electrical_engineering|5": { "acc": 0.2413793103448276, "acc_stderr": 0.03565998174135302, "acc_norm": 0.2413793103448276, "acc_norm_stderr": 0.03565998174135302 }, "harness|hendrycksTest-elementary_mathematics|5": { "acc": 0.25396825396825395, "acc_stderr": 0.022418042891113942, "acc_norm": 0.25396825396825395, "acc_norm_stderr": 0.022418042891113942 }, 
"harness|hendrycksTest-formal_logic|5": { "acc": 0.14285714285714285, "acc_stderr": 0.0312984318574381, "acc_norm": 0.14285714285714285, "acc_norm_stderr": 0.0312984318574381 }, "harness|hendrycksTest-global_facts|5": { "acc": 0.15, "acc_stderr": 0.035887028128263686, "acc_norm": 0.15, "acc_norm_stderr": 0.035887028128263686 }, "harness|hendrycksTest-high_school_biology|5": { "acc": 0.2967741935483871, "acc_stderr": 0.025988500792411894, "acc_norm": 0.2967741935483871, "acc_norm_stderr": 0.025988500792411894 }, "harness|hendrycksTest-high_school_chemistry|5": { "acc": 0.270935960591133, "acc_stderr": 0.03127090713297698, "acc_norm": 0.270935960591133, "acc_norm_stderr": 0.03127090713297698 }, "harness|hendrycksTest-high_school_computer_science|5": { "acc": 0.26, "acc_stderr": 0.04408440022768079, "acc_norm": 0.26, "acc_norm_stderr": 0.04408440022768079 }, "harness|hendrycksTest-high_school_european_history|5": { "acc": 0.21818181818181817, "acc_stderr": 0.03225078108306289, "acc_norm": 0.21818181818181817, "acc_norm_stderr": 0.03225078108306289 }, "harness|hendrycksTest-high_school_geography|5": { "acc": 0.35353535353535354, "acc_stderr": 0.03406086723547153, "acc_norm": 0.35353535353535354, "acc_norm_stderr": 0.03406086723547153 }, "harness|hendrycksTest-high_school_government_and_politics|5": { "acc": 0.36787564766839376, "acc_stderr": 0.03480175668466036, "acc_norm": 0.36787564766839376, "acc_norm_stderr": 0.03480175668466036 }, "harness|hendrycksTest-high_school_macroeconomics|5": { "acc": 0.2717948717948718, "acc_stderr": 0.022556551010132358, "acc_norm": 0.2717948717948718, "acc_norm_stderr": 0.022556551010132358 }, "harness|hendrycksTest-high_school_mathematics|5": { "acc": 0.26296296296296295, "acc_stderr": 0.026842057873833706, "acc_norm": 0.26296296296296295, "acc_norm_stderr": 0.026842057873833706 }, "harness|hendrycksTest-high_school_microeconomics|5": { "acc": 0.28991596638655465, "acc_stderr": 0.029472485833136098, "acc_norm": 0.28991596638655465, 
"acc_norm_stderr": 0.029472485833136098 }, "harness|hendrycksTest-high_school_physics|5": { "acc": 0.271523178807947, "acc_stderr": 0.03631329803969654, "acc_norm": 0.271523178807947, "acc_norm_stderr": 0.03631329803969654 }, "harness|hendrycksTest-high_school_psychology|5": { "acc": 0.3486238532110092, "acc_stderr": 0.020431254090714328, "acc_norm": 0.3486238532110092, "acc_norm_stderr": 0.020431254090714328 }, "harness|hendrycksTest-high_school_statistics|5": { "acc": 0.4722222222222222, "acc_stderr": 0.0340470532865388, "acc_norm": 0.4722222222222222, "acc_norm_stderr": 0.0340470532865388 }, "harness|hendrycksTest-high_school_us_history|5": { "acc": 0.25, "acc_stderr": 0.03039153369274154, "acc_norm": 0.25, "acc_norm_stderr": 0.03039153369274154 }, "harness|hendrycksTest-high_school_world_history|5": { "acc": 0.24472573839662448, "acc_stderr": 0.027985699387036416, "acc_norm": 0.24472573839662448, "acc_norm_stderr": 0.027985699387036416 }, "harness|hendrycksTest-human_aging|5": { "acc": 0.2914798206278027, "acc_stderr": 0.030500283176545923, "acc_norm": 0.2914798206278027, "acc_norm_stderr": 0.030500283176545923 }, "harness|hendrycksTest-human_sexuality|5": { "acc": 0.26717557251908397, "acc_stderr": 0.038808483010823944, "acc_norm": 0.26717557251908397, "acc_norm_stderr": 0.038808483010823944 }, "harness|hendrycksTest-international_law|5": { "acc": 0.32231404958677684, "acc_stderr": 0.04266416363352168, "acc_norm": 0.32231404958677684, "acc_norm_stderr": 0.04266416363352168 }, "harness|hendrycksTest-jurisprudence|5": { "acc": 0.21296296296296297, "acc_stderr": 0.03957835471980981, "acc_norm": 0.21296296296296297, "acc_norm_stderr": 0.03957835471980981 }, "harness|hendrycksTest-logical_fallacies|5": { "acc": 0.26380368098159507, "acc_stderr": 0.03462419931615623, "acc_norm": 0.26380368098159507, "acc_norm_stderr": 0.03462419931615623 }, "harness|hendrycksTest-machine_learning|5": { "acc": 0.25892857142857145, "acc_stderr": 0.041577515398656284, "acc_norm": 
0.25892857142857145, "acc_norm_stderr": 0.041577515398656284 }, "harness|hendrycksTest-management|5": { "acc": 0.34951456310679613, "acc_stderr": 0.04721188506097173, "acc_norm": 0.34951456310679613, "acc_norm_stderr": 0.04721188506097173 }, "harness|hendrycksTest-marketing|5": { "acc": 0.1794871794871795, "acc_stderr": 0.025140935950335418, "acc_norm": 0.1794871794871795, "acc_norm_stderr": 0.025140935950335418 }, "harness|hendrycksTest-medical_genetics|5": { "acc": 0.27, "acc_stderr": 0.044619604333847394, "acc_norm": 0.27, "acc_norm_stderr": 0.044619604333847394 }, "harness|hendrycksTest-miscellaneous|5": { "acc": 0.21583652618135377, "acc_stderr": 0.014711684386139958, "acc_norm": 0.21583652618135377, "acc_norm_stderr": 0.014711684386139958 }, "harness|hendrycksTest-moral_disputes|5": { "acc": 0.24277456647398843, "acc_stderr": 0.0230836585869842, "acc_norm": 0.24277456647398843, "acc_norm_stderr": 0.0230836585869842 }, "harness|hendrycksTest-moral_scenarios|5": { "acc": 0.2424581005586592, "acc_stderr": 0.014333522059217889, "acc_norm": 0.2424581005586592, "acc_norm_stderr": 0.014333522059217889 }, "harness|hendrycksTest-nutrition|5": { "acc": 0.21895424836601307, "acc_stderr": 0.02367908986180772, "acc_norm": 0.21895424836601307, "acc_norm_stderr": 0.02367908986180772 }, "harness|hendrycksTest-philosophy|5": { "acc": 0.24758842443729903, "acc_stderr": 0.024513879973621967, "acc_norm": 0.24758842443729903, "acc_norm_stderr": 0.024513879973621967 }, "harness|hendrycksTest-prehistory|5": { "acc": 0.22530864197530864, "acc_stderr": 0.023246202647819746, "acc_norm": 0.22530864197530864, "acc_norm_stderr": 0.023246202647819746 }, "harness|hendrycksTest-professional_accounting|5": { "acc": 0.26595744680851063, "acc_stderr": 0.026358065698880592, "acc_norm": 0.26595744680851063, "acc_norm_stderr": 0.026358065698880592 }, "harness|hendrycksTest-professional_law|5": { "acc": 0.2457627118644068, "acc_stderr": 0.010996156635142692, "acc_norm": 0.2457627118644068, 
"acc_norm_stderr": 0.010996156635142692 }, "harness|hendrycksTest-professional_medicine|5": { "acc": 0.44485294117647056, "acc_stderr": 0.030187532060329376, "acc_norm": 0.44485294117647056, "acc_norm_stderr": 0.030187532060329376 }, "harness|hendrycksTest-professional_psychology|5": { "acc": 0.26143790849673204, "acc_stderr": 0.017776947157528034, "acc_norm": 0.26143790849673204, "acc_norm_stderr": 0.017776947157528034 }, "harness|hendrycksTest-public_relations|5": { "acc": 0.21818181818181817, "acc_stderr": 0.03955932861795833, "acc_norm": 0.21818181818181817, "acc_norm_stderr": 0.03955932861795833 }, "harness|hendrycksTest-security_studies|5": { "acc": 0.4, "acc_stderr": 0.031362502409358936, "acc_norm": 0.4, "acc_norm_stderr": 0.031362502409358936 }, "harness|hendrycksTest-sociology|5": { "acc": 0.22885572139303484, "acc_stderr": 0.029705284056772426, "acc_norm": 0.22885572139303484, "acc_norm_stderr": 0.029705284056772426 }, "harness|hendrycksTest-us_foreign_policy|5": { "acc": 0.27, "acc_stderr": 0.04461960433384739, "acc_norm": 0.27, "acc_norm_stderr": 0.04461960433384739 }, "harness|hendrycksTest-virology|5": { "acc": 0.1927710843373494, "acc_stderr": 0.030709824050565274, "acc_norm": 0.1927710843373494, "acc_norm_stderr": 0.030709824050565274 }, "harness|hendrycksTest-world_religions|5": { "acc": 0.21052631578947367, "acc_stderr": 0.0312678171466318, "acc_norm": 0.21052631578947367, "acc_norm_stderr": 0.0312678171466318 }, "harness|truthfulqa:mc|0": { "mc1": 0.22766217870257038, "mc1_stderr": 0.01467925503211107, "mc2": 0.4069116400376613, "mc2_stderr": 0.014934250122346554 }, "harness|winogrande|5": { "acc": 0.5043409629044988, "acc_stderr": 0.014051956064076887 }, "harness|gsm8k|5": { "acc": 0.006823351023502654, "acc_stderr": 0.0022675371022544736 } } ``` ## Dataset Details ### Dataset Description <!-- Provide a longer summary of what this dataset is. 
-->

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]

### Dataset Sources [optional]

<!-- Provide the basic links for the dataset. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

<!-- Address questions around how the dataset is intended to be used. -->

### Direct Use

<!-- This section describes suitable use cases for the dataset. -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the dataset will not work well for. -->

[More Information Needed]

## Dataset Structure

<!-- This section provides a description of the dataset fields, and additional information about the dataset structure such as criteria used to create the splits, relationships between data points, etc. -->

[More Information Needed]

## Dataset Creation

### Curation Rationale

<!-- Motivation for the creation of this dataset. -->

[More Information Needed]

### Source Data

<!-- This section describes the source data (e.g. news text and headlines, social media posts, translated sentences, ...). -->

#### Data Collection and Processing

<!-- This section describes the data collection and processing process such as data selection criteria, filtering and normalization methods, tools and libraries used, etc. -->

[More Information Needed]

#### Who are the source data producers?

<!-- This section describes the people or systems who originally created the data. It should also include self-reported demographic or identity information for the source data creators if this information is available. -->

[More Information Needed]

### Annotations [optional]

<!-- If the dataset contains annotations which are not part of the initial data collection, use this section to describe them. -->

#### Annotation process

<!-- This section describes the annotation process such as annotation tools used in the process, the amount of data annotated, annotation guidelines provided to the annotators, interannotator statistics, annotation validation, etc. -->

[More Information Needed]

#### Who are the annotators?

<!-- This section describes the people or systems who created the annotations. -->

[More Information Needed]

#### Personal and Sensitive Information

<!-- State whether the dataset contains data that might be considered personal, sensitive, or private (e.g., data that reveals addresses, uniquely identifiable names or aliases, racial or ethnic origins, sexual orientations, religious beliefs, political opinions, financial or health data, etc.). If efforts were made to anonymize the data, describe the anonymization process. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.

## Citation [optional]

<!-- If there is a paper or blog post introducing the dataset, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the dataset or dataset card. -->

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]
Provider:
open-llm-leaderboard-old
Summary of the original information

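Dataset overview

Dataset name

  • Name: Evaluation run of gpt2

Dataset composition

  • Number of configurations: 65, each corresponding to one evaluated task.
  • Number of runs: the dataset was created from 25 runs. Each run appears as a specific split in each configuration, with the split named after the run's timestamp.
  • Train split: the "train" split always points to the latest results (the configuration listing names this split "latest").
  • Results configuration: an additional configuration, "results", stores the aggregated results of all runs and is used to compute and display the aggregated metrics on the Open LLM Leaderboard.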
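
The naming scheme described above can be sketched in a few lines of Python. The helper names (`config_name`, `split_name`, `results_file`, `details_glob`) are hypothetical and not part of any library; the character substitutions are inferred only from the configuration listing in this card (for example, the task key `harness|truthfulqa:mc|0` appears as the configuration `harness_truthfulqa_mc_0`), so treat this as an illustration rather than the leaderboard's actual implementation.

```python
import re


def config_name(task_key: str) -> str:
    """Map a results key such as 'harness|truthfulqa:mc|0' to its configuration
    name by replacing '|', ':' and '-' with underscores (inferred from this card)."""
    return re.sub(r"[|:-]", "_", task_key)


def split_name(run_timestamp: str) -> str:
    """Map a run timestamp such as '2024-03-23T06:18:16.565546' to the split
    name used in the card, where '-' and ':' become underscores."""
    return re.sub(r"[-:]", "_", run_timestamp)


def results_file(run_timestamp: str) -> str:
    """File name of the aggregated results for a run; ':' becomes '-'."""
    return f"results_{run_timestamp.replace(':', '-')}.parquet"


def details_glob(task_key: str, run_timestamp: str) -> str:
    """Glob pattern of the per-task details file for a run, as listed in the card."""
    return f"**/details_{task_key}_{run_timestamp.replace(':', '-')}.parquet"


if __name__ == "__main__":
    run = "2024-03-23T06:18:16.565546"  # latest run named in this card
    print(config_name("harness|winogrande|5"))
    print(split_name(run))
    print(results_file(run))
    print(details_glob("harness|winogrande|5", run))
```

Keeping the mapping in small pure functions makes it easy to check against the listing before passing the resulting names to a loader.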

Data loading example

    from datasets import load_dataset
    data = load_dataset("open-llm-leaderboard/details_gpt2", "harness_winogrande_5", split="train")

Latest results

  • Latest results timestamp: 2024-03-23T06:18:16.565546
  • Results example (truncated in the source):

    {
    "all": { "acc": 0.25780579051672486, "acc_stderr": 0.030658881019520554, "acc_norm": 0.2586547713391113, "acc_norm_stderr": 0.031431381356225356, "mc1": 0.22766217870257038, "mc1_stderr": 0.01467925503211107, "mc2": 0.4069116400376613, "mc2_stderr": 0.014934250122346554 },
    "harness|arc:challenge|25": { "acc": 0.197098976109215, "acc_stderr": 0.011625047669880633, "acc_norm": 0.22013651877133106, "acc_norm_stderr": 0.01210812488346097 },
    "harness|hellaswag|10": { "acc": 0.29267078271260705, "acc_stderr": 0.004540586983229993, "acc_norm": 0.3152758414658435, "acc_norm_stderr": 0.0046367607625228515 },
    "harness|hendrycksTest-abstract_algebra|5": { "acc": 0.21, "acc_stderr": 0.040936018074033256, "acc_norm": 0.21, "acc_norm_stderr": 0.040936018074033256 },
    "harness|hendrycksTest-anatomy|5": { "acc": 0.22962962962962963, "acc_stderr": 0.03633384414073462, "acc_norm": 0.22962962962962963, "acc_norm_stderr": 0.03633384414073462 },
    "harness|hendrycksTest-astronomy|5": { "acc": 0.16447368421052633, "acc_stderr": 0.0301675334686327, "acc_norm": 0.16447368421052633, "acc_norm_stderr": 0.0301675334686327 },
    "harness|hendrycksTest-business_ethics|5": { "acc": 0.17, "acc_stderr": 0.0377525168068637, "acc_norm": 0.17, "acc_norm_stderr": 0.0377525168068637 },
    "harness|hendrycksTest-clinical_knowledge|5": { "acc": 0.24150943396226415, "acc_stderr": 0.026341480371118345, "acc_norm": 0.24150943396226415, "acc_norm_stderr": 0.026341480371118345 },
    "harness|hendrycksTest-college_biology|5": { "acc": 0.2222222222222222, "acc_stderr": 0.03476590104304134, "acc_norm": 0.2222222222222222, "acc_norm_stderr": 0.03476590104304134 },
    "harness|hendrycksTest-college_chemistry|5": { "acc": 0.2, "acc_stderr": 0.04020151261036846, "acc_norm": 0.2, "acc_norm_stderr": 0.04020151261036846 },
    "harness|hendrycksTest-college_computer_science|5": { "acc": 0.28, "acc_stderr": 0.04512608598542128, "acc_norm": 0.28, "acc_norm_stderr": 0.04512608598542128 },
    "harness|hendrycksTest-college_mathematics|5": { "acc": 0.3, "acc_stderr": 0.046056618647183814, "acc_norm": 0.3, "acc_norm_stderr": 0.046056618647183814 },
    "harness|hendrycksTest-college_medicine|5": { "acc": 0.24277456647398843, "acc_stderr": 0.0326926380614177, "acc_norm": 0.24277456647398843, "acc_norm_stderr": 0.0326926380614177 },
    "harness|hendrycksTest-college_physics|5": { "acc": 0.2549019607843137, "acc_stderr": 0.043364327079931785, "acc_norm": 0.2549019607843137, "acc_norm_stderr": 0.043364327079931785 },
    "harness|hendrycksTest-computer_security|5": { "acc": 0.16, "acc_stderr": 0.03684529491774709, "acc_norm": 0.16, "acc_norm_stderr": 0.03684529491774709 },
    "harness|hendrycksTest-conceptual_physics|5": { "acc": 0.2723404255319149, "acc_stderr": 0.029101290698386698, "acc_norm": 0.2723404255319149, "acc_norm_stderr": 0.029101290698386698 },
    "harness|hendrycksTest-econometrics|5": { "acc": 0.2631578947368421, "acc_stderr": 0.041424397194893624, "acc_norm": 0.2631578947368421, "acc_norm_stderr": 0.041424397194893624 },
    "harness|hendrycksTest-electrical_engineering|5": { "acc": 0.2413793103448276, "acc_stderr": 0.03565998174135302, "acc_norm": 0.2413793103448276, "acc_norm_stderr": 0.03565998174135302 },
    "harness|hendrycksTest-elementary_mathematics|5": { "acc": 0.25396825396825395, "acc_stderr": 0.022418042891113942, "acc_norm": 0.25396825396825395, "acc_norm_stderr": 0.022418042891113942 },
    "harness|hendrycksTest-formal_logic|5": { "acc": 0.14285714285714285, "acc_stderr": 0.0312984318574381, "acc_norm": 0.14285714285714285, "acc_norm_stderr": 0.0312984318574381 },
    "harness|hendrycksTest-global_facts|5": { "acc": 0.15, "acc_stderr": 0.035887028128263686, "acc_norm": 0.15, "acc_norm_stderr": 0.035887028128263686 },
    "harness|hendrycksTest-high_school_biology|5": { "acc": 0.2967741935483871, "acc_stderr": 0.025988500792411894, "acc_norm": 0.2967741935483871, "acc_norm_stderr": 0.025988500792411894 },
    "harness|hendrycksTest-high_school_chemistry|5": { "acc": 0.270935960591133, "acc_stderr": 0.03127090713297698, "acc_norm": 0.270935960591133, "acc_norm_stderr": 0.03127090713297698 },
    "harness|hendrycksTest-high_school_computer_science|5": { "acc": 0.26, "acc_stderr": 0.04408440022768079, "acc_norm": 0.26, "acc_norm_stderr": 0.04408440022768079 },
    "harness|hendrycksTest-high_school_european_history|5": { "acc": 0.21818181818181817, "acc_stderr": 0.03225078108306289, "acc_norm": 0.21818181818181817, "acc_norm_stderr": 0.03225078108306289 },
    "harness|hendrycksTest-high_school_geography|5": { "acc": 0.35353535353535354, "acc_stderr": 0.03406086723547153, "acc_norm": 0.35353535353535354, "acc_norm_stderr": 0.03406086723547153 },
    "harness|hendrycksTest-high_school_government_and_politics|5": { "acc": 0.36787564766839376, "acc_stderr": 0.03480175668466036, "acc_norm": 0.36787564766839376, "acc_norm_stderr": 0.03480175668466036 },
    "harness|hendrycksTest-high_school_macroeconomics|5": { "acc": 0.2717948717948718, "acc_stderr": 0.022556551010132358, "acc_norm": 0.2717948717948718, "acc_norm_stderr": 0.022556551010132358 },
    "harness|hendrycksTest-high_school_mathematics|5": { "acc": 0.26296296296296295, "acc_stderr": 0.026842057873833706, "acc_norm": 0.26296296296296295, "acc_norm_stderr": 0.026842057873833706 },
    "harness|hendrycksTest-high_school_microeconomics|5": { "acc": 0.28991596638655465, "acc_stderr": 0.029472485833136098, "acc_norm": 0.28991596638655465,
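
Each task in the results pairs an `acc` value with an `acc_stderr`. One way to read such a pair is as a rough confidence interval. The sketch below is an illustration only: the normal approximation with z = 1.96 is an assumption of this example and is not stated by the dataset card, the function name `confidence_interval` is hypothetical, and the two input entries are copied from the results listed above.

```python
from typing import Dict, Tuple

# Two entries copied from the results above (acc and acc_stderr only).
RESULTS: Dict[str, Dict[str, float]] = {
    "harness|arc:challenge|25": {"acc": 0.197098976109215, "acc_stderr": 0.011625047669880633},
    "harness|hellaswag|10": {"acc": 0.29267078271260705, "acc_stderr": 0.004540586983229993},
}

Z_95 = 1.96  # assumption: two-sided 95% quantile of the standard normal distribution


def confidence_interval(acc: float, stderr: float, z: float = Z_95) -> Tuple[float, float]:
    """Return (low, high) = acc -/+ z * stderr, clipped to the valid range [0, 1]."""
    half_width = z * stderr
    return max(0.0, acc - half_width), min(1.0, acc + half_width)


# Interval for every task in the excerpt.
INTERVALS = {
    task: confidence_interval(metrics["acc"], metrics["acc_stderr"])
    for task, metrics in RESULTS.items()
}

if __name__ == "__main__":
    for task, (low, high) in INTERVALS.items():
        print(f"{task}: acc in [{low:.4f}, {high:.4f}]")
```

The same function can be applied to the `acc_norm` and `acc_norm_stderr` pairs; whether a normal approximation is appropriate for a given task depends on its sample size, which the card does not report.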