
open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B

Hugging Face · Updated 2023-09-19 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B
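The dataset card for this repository names each run's split after the run timestamp and stores the per-task details in parquet files stamped with the same timestamp. A minimal sketch of that naming scheme follows, inferred only from the split names and file patterns listed in the card. The helper names `split_name`, `parquet_pattern`, and `load_latest_details` are hypothetical, and the loader assumes the `datasets` library is installed and the Hub is reachable.

```python
def split_name(run_timestamp: str) -> str:
    """Turn a run timestamp into the split name used in the card's configs.

    Assumption: '-' and ':' both become '_', as in
    2023-09-19T02:16:50.789886 -> 2023_09_19T02_16_50.789886.
    """
    return run_timestamp.replace("-", "_").replace(":", "_")


def parquet_pattern(task: str, run_timestamp: str) -> str:
    """Build the glob pattern for one task's details file.

    Assumption: only ':' in the timestamp becomes '-' in file names,
    and the task key (e.g. 'harness|arc:challenge|25') is used verbatim.
    """
    file_stamp = run_timestamp.replace(":", "-")
    return f"**/details_{task}_{file_stamp}.parquet"


def load_latest_details(config_name: str = "harness_arc_challenge_25"):
    """Load the 'latest' split of one configuration (needs network access)."""
    # Imported lazily so the pure helpers above work without `datasets`.
    from datasets import load_dataset

    return load_dataset(
        "open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B",
        config_name,
        split="latest",
    )


if __name__ == "__main__":
    run = "2023-09-19T02:16:50.789886"
    print(split_name(run))
    print(parquet_pattern("harness|arc:challenge|25", run))
```

The `latest` split and the `harness_arc_challenge_25` configuration are taken from the card's `configs` list; other configurations may expose different splits.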
Resource overview:
--- pretty_name: Evaluation run of AIDC-ai-business/Marcoroni-70B dataset_summary: "Dataset automatically created during the evaluation run of model\ \ [AIDC-ai-business/Marcoroni-70B](https://huggingface.co/AIDC-ai-business/Marcoroni-70B)\ \ on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).\n\ \nThe dataset is composed of 61 configurations, each one corresponding to one of the\ \ evaluated tasks.\n\nThe dataset has been created from 4 runs. Each run can be\ \ found as a specific split in each configuration, the split being named using the\ \ timestamp of the run. The \"train\" split is always pointing to the latest results.\n\ \nAn additional configuration \"results\" stores all the aggregated results of the\ \ run (and is used to compute and display the aggregated metrics on the [Open LLM\ \ Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)).\n\ \nTo load the details from a run, you can, for instance, do the following:\n```python\n\ from datasets import load_dataset\ndata = load_dataset(\"open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B\"\ ,\n\t\"harness_truthfulqa_mc_0\",\n\tsplit=\"train\")\n```\n\n## Latest results\n\ \nThese are the [latest results from run 2023-09-19T02:16:50.789886](https://huggingface.co/datasets/open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B/blob/main/results_2023-09-19T02-16-50.789886.json) (note\ \ that there might be results for other tasks in the repo if successive evals didn't\ \ cover the same tasks. 
You find each in the results and the \"latest\" split for\ \ each eval):\n\n```python\n{\n \"all\": {\n \"acc\": 0.23992448312110085,\n\ \ \"acc_stderr\": 0.031078389352549952,\n \"acc_norm\": 0.24054395860756556,\n\ \ \"acc_norm_stderr\": 0.03108725267744147,\n \"mc1\": 1.0,\n \ \ \"mc1_stderr\": 0.0,\n \"mc2\": NaN,\n \"mc2_stderr\": NaN\n\ \ },\n \"harness|arc:challenge|25\": {\n \"acc\": 0.24744027303754265,\n\ \ \"acc_stderr\": 0.012610352663292673,\n \"acc_norm\": 0.2790102389078498,\n\ \ \"acc_norm_stderr\": 0.013106784883601346\n },\n \"harness|hellaswag|10\"\ : {\n \"acc\": 0.2621987651862179,\n \"acc_stderr\": 0.004389312748012152,\n\ \ \"acc_norm\": 0.2671778530173272,\n \"acc_norm_stderr\": 0.004415816696303084\n\ \ },\n \"harness|hendrycksTest-abstract_algebra|5\": {\n \"acc\": 0.19,\n\ \ \"acc_stderr\": 0.03942772444036624,\n \"acc_norm\": 0.19,\n \ \ \"acc_norm_stderr\": 0.03942772444036624\n },\n \"harness|hendrycksTest-anatomy|5\"\ : {\n \"acc\": 0.18518518518518517,\n \"acc_stderr\": 0.0335567721631314,\n\ \ \"acc_norm\": 0.18518518518518517,\n \"acc_norm_stderr\": 0.0335567721631314\n\ \ },\n \"harness|hendrycksTest-astronomy|5\": {\n \"acc\": 0.21710526315789475,\n\ \ \"acc_stderr\": 0.033550453048829226,\n \"acc_norm\": 0.21710526315789475,\n\ \ \"acc_norm_stderr\": 0.033550453048829226\n },\n \"harness|hendrycksTest-business_ethics|5\"\ : {\n \"acc\": 0.29,\n \"acc_stderr\": 0.04560480215720684,\n \ \ \"acc_norm\": 0.29,\n \"acc_norm_stderr\": 0.04560480215720684\n \ \ },\n \"harness|hendrycksTest-clinical_knowledge|5\": {\n \"acc\": 0.20754716981132076,\n\ \ \"acc_stderr\": 0.02495991802891127,\n \"acc_norm\": 0.20754716981132076,\n\ \ \"acc_norm_stderr\": 0.02495991802891127\n },\n \"harness|hendrycksTest-college_biology|5\"\ : {\n \"acc\": 0.2638888888888889,\n \"acc_stderr\": 0.03685651095897532,\n\ \ \"acc_norm\": 0.2638888888888889,\n \"acc_norm_stderr\": 0.03685651095897532\n\ \ },\n \"harness|hendrycksTest-college_chemistry|5\": {\n 
\"acc\":\ \ 0.19,\n \"acc_stderr\": 0.039427724440366234,\n \"acc_norm\": 0.19,\n\ \ \"acc_norm_stderr\": 0.039427724440366234\n },\n \"harness|hendrycksTest-college_computer_science|5\"\ : {\n \"acc\": 0.29,\n \"acc_stderr\": 0.045604802157206845,\n \ \ \"acc_norm\": 0.29,\n \"acc_norm_stderr\": 0.045604802157206845\n \ \ },\n \"harness|hendrycksTest-college_mathematics|5\": {\n \"acc\"\ : 0.22,\n \"acc_stderr\": 0.04163331998932269,\n \"acc_norm\": 0.22,\n\ \ \"acc_norm_stderr\": 0.04163331998932269\n },\n \"harness|hendrycksTest-college_medicine|5\"\ : {\n \"acc\": 0.18497109826589594,\n \"acc_stderr\": 0.029605623981771204,\n\ \ \"acc_norm\": 0.18497109826589594,\n \"acc_norm_stderr\": 0.029605623981771204\n\ \ },\n \"harness|hendrycksTest-college_physics|5\": {\n \"acc\": 0.22549019607843138,\n\ \ \"acc_stderr\": 0.041583075330832865,\n \"acc_norm\": 0.22549019607843138,\n\ \ \"acc_norm_stderr\": 0.041583075330832865\n },\n \"harness|hendrycksTest-computer_security|5\"\ : {\n \"acc\": 0.25,\n \"acc_stderr\": 0.04351941398892446,\n \ \ \"acc_norm\": 0.25,\n \"acc_norm_stderr\": 0.04351941398892446\n \ \ },\n \"harness|hendrycksTest-conceptual_physics|5\": {\n \"acc\": 0.2723404255319149,\n\ \ \"acc_stderr\": 0.029101290698386705,\n \"acc_norm\": 0.2723404255319149,\n\ \ \"acc_norm_stderr\": 0.029101290698386705\n },\n \"harness|hendrycksTest-econometrics|5\"\ : {\n \"acc\": 0.2807017543859649,\n \"acc_stderr\": 0.042270544512322,\n\ \ \"acc_norm\": 0.2807017543859649,\n \"acc_norm_stderr\": 0.042270544512322\n\ \ },\n \"harness|hendrycksTest-electrical_engineering|5\": {\n \"acc\"\ : 0.25517241379310346,\n \"acc_stderr\": 0.03632984052707842,\n \"\ acc_norm\": 0.25517241379310346,\n \"acc_norm_stderr\": 0.03632984052707842\n\ \ },\n \"harness|hendrycksTest-elementary_mathematics|5\": {\n \"acc\"\ : 0.24603174603174602,\n \"acc_stderr\": 0.022182037202948365,\n \"\ acc_norm\": 0.24603174603174602,\n \"acc_norm_stderr\": 0.022182037202948365\n\ \ },\n 
\"harness|hendrycksTest-formal_logic|5\": {\n \"acc\": 0.2777777777777778,\n\ \ \"acc_stderr\": 0.04006168083848876,\n \"acc_norm\": 0.2777777777777778,\n\ \ \"acc_norm_stderr\": 0.04006168083848876\n },\n \"harness|hendrycksTest-global_facts|5\"\ : {\n \"acc\": 0.21,\n \"acc_stderr\": 0.040936018074033256,\n \ \ \"acc_norm\": 0.21,\n \"acc_norm_stderr\": 0.040936018074033256\n \ \ },\n \"harness|hendrycksTest-high_school_biology|5\": {\n \"acc\"\ : 0.2,\n \"acc_stderr\": 0.022755204959542932,\n \"acc_norm\": 0.2,\n\ \ \"acc_norm_stderr\": 0.022755204959542932\n },\n \"harness|hendrycksTest-high_school_chemistry|5\"\ : {\n \"acc\": 0.22167487684729065,\n \"acc_stderr\": 0.029225575892489607,\n\ \ \"acc_norm\": 0.22167487684729065,\n \"acc_norm_stderr\": 0.029225575892489607\n\ \ },\n \"harness|hendrycksTest-high_school_computer_science|5\": {\n \ \ \"acc\": 0.21,\n \"acc_stderr\": 0.04093601807403326,\n \"acc_norm\"\ : 0.21,\n \"acc_norm_stderr\": 0.04093601807403326\n },\n \"harness|hendrycksTest-high_school_european_history|5\"\ : {\n \"acc\": 0.2545454545454545,\n \"acc_stderr\": 0.0340150671524904,\n\ \ \"acc_norm\": 0.2545454545454545,\n \"acc_norm_stderr\": 0.0340150671524904\n\ \ },\n \"harness|hendrycksTest-high_school_geography|5\": {\n \"acc\"\ : 0.20707070707070707,\n \"acc_stderr\": 0.02886977846026705,\n \"\ acc_norm\": 0.20707070707070707,\n \"acc_norm_stderr\": 0.02886977846026705\n\ \ },\n \"harness|hendrycksTest-high_school_government_and_politics|5\": {\n\ \ \"acc\": 0.2849740932642487,\n \"acc_stderr\": 0.03257714077709661,\n\ \ \"acc_norm\": 0.2849740932642487,\n \"acc_norm_stderr\": 0.03257714077709661\n\ \ },\n \"harness|hendrycksTest-high_school_macroeconomics|5\": {\n \ \ \"acc\": 0.23333333333333334,\n \"acc_stderr\": 0.021444547301560486,\n\ \ \"acc_norm\": 0.23333333333333334,\n \"acc_norm_stderr\": 0.021444547301560486\n\ \ },\n \"harness|hendrycksTest-high_school_mathematics|5\": {\n \"\ acc\": 0.22592592592592592,\n \"acc_stderr\": 
0.025497532639609542,\n \ \ \"acc_norm\": 0.22592592592592592,\n \"acc_norm_stderr\": 0.025497532639609542\n\ \ },\n \"harness|hendrycksTest-high_school_microeconomics|5\": {\n \ \ \"acc\": 0.19747899159663865,\n \"acc_stderr\": 0.025859164122051467,\n\ \ \"acc_norm\": 0.19747899159663865,\n \"acc_norm_stderr\": 0.025859164122051467\n\ \ },\n \"harness|hendrycksTest-high_school_physics|5\": {\n \"acc\"\ : 0.23841059602649006,\n \"acc_stderr\": 0.0347918557259966,\n \"\ acc_norm\": 0.23841059602649006,\n \"acc_norm_stderr\": 0.0347918557259966\n\ \ },\n \"harness|hendrycksTest-high_school_psychology|5\": {\n \"acc\"\ : 0.21651376146788992,\n \"acc_stderr\": 0.017658710594443145,\n \"\ acc_norm\": 0.21651376146788992,\n \"acc_norm_stderr\": 0.017658710594443145\n\ \ },\n \"harness|hendrycksTest-high_school_statistics|5\": {\n \"acc\"\ : 0.1712962962962963,\n \"acc_stderr\": 0.025695341643824685,\n \"\ acc_norm\": 0.1712962962962963,\n \"acc_norm_stderr\": 0.025695341643824685\n\ \ },\n \"harness|hendrycksTest-high_school_us_history|5\": {\n \"acc\"\ : 0.25,\n \"acc_stderr\": 0.03039153369274154,\n \"acc_norm\": 0.25,\n\ \ \"acc_norm_stderr\": 0.03039153369274154\n },\n \"harness|hendrycksTest-high_school_world_history|5\"\ : {\n \"acc\": 0.25738396624472576,\n \"acc_stderr\": 0.028458820991460302,\n\ \ \"acc_norm\": 0.25738396624472576,\n \"acc_norm_stderr\": 0.028458820991460302\n\ \ },\n \"harness|hendrycksTest-human_aging|5\": {\n \"acc\": 0.2914798206278027,\n\ \ \"acc_stderr\": 0.030500283176545902,\n \"acc_norm\": 0.2914798206278027,\n\ \ \"acc_norm_stderr\": 0.030500283176545902\n },\n \"harness|hendrycksTest-human_sexuality|5\"\ : {\n \"acc\": 0.24427480916030533,\n \"acc_stderr\": 0.037683359597287434,\n\ \ \"acc_norm\": 0.24427480916030533,\n \"acc_norm_stderr\": 0.037683359597287434\n\ \ },\n \"harness|hendrycksTest-international_law|5\": {\n \"acc\":\ \ 0.2809917355371901,\n \"acc_stderr\": 0.04103203830514511,\n \"\ acc_norm\": 0.2809917355371901,\n 
\"acc_norm_stderr\": 0.04103203830514511\n\ \ },\n \"harness|hendrycksTest-jurisprudence|5\": {\n \"acc\": 0.25925925925925924,\n\ \ \"acc_stderr\": 0.042365112580946336,\n \"acc_norm\": 0.25925925925925924,\n\ \ \"acc_norm_stderr\": 0.042365112580946336\n },\n \"harness|hendrycksTest-logical_fallacies|5\"\ : {\n \"acc\": 0.22699386503067484,\n \"acc_stderr\": 0.032910995786157686,\n\ \ \"acc_norm\": 0.22699386503067484,\n \"acc_norm_stderr\": 0.032910995786157686\n\ \ },\n \"harness|hendrycksTest-machine_learning|5\": {\n \"acc\": 0.2857142857142857,\n\ \ \"acc_stderr\": 0.04287858751340456,\n \"acc_norm\": 0.2857142857142857,\n\ \ \"acc_norm_stderr\": 0.04287858751340456\n },\n \"harness|hendrycksTest-management|5\"\ : {\n \"acc\": 0.1941747572815534,\n \"acc_stderr\": 0.03916667762822584,\n\ \ \"acc_norm\": 0.1941747572815534,\n \"acc_norm_stderr\": 0.03916667762822584\n\ \ },\n \"harness|hendrycksTest-marketing|5\": {\n \"acc\": 0.25213675213675213,\n\ \ \"acc_stderr\": 0.02844796547623101,\n \"acc_norm\": 0.25213675213675213,\n\ \ \"acc_norm_stderr\": 0.02844796547623101\n },\n \"harness|hendrycksTest-medical_genetics|5\"\ : {\n \"acc\": 0.27,\n \"acc_stderr\": 0.0446196043338474,\n \ \ \"acc_norm\": 0.27,\n \"acc_norm_stderr\": 0.0446196043338474\n },\n\ \ \"harness|hendrycksTest-miscellaneous|5\": {\n \"acc\": 0.26181353767560667,\n\ \ \"acc_stderr\": 0.015720838678445266,\n \"acc_norm\": 0.26181353767560667,\n\ \ \"acc_norm_stderr\": 0.015720838678445266\n },\n \"harness|hendrycksTest-moral_disputes|5\"\ : {\n \"acc\": 0.24566473988439305,\n \"acc_stderr\": 0.02317629820399201,\n\ \ \"acc_norm\": 0.24566473988439305,\n \"acc_norm_stderr\": 0.02317629820399201\n\ \ },\n \"harness|hendrycksTest-moral_scenarios|5\": {\n \"acc\": 0.25251396648044694,\n\ \ \"acc_stderr\": 0.014530330201468645,\n \"acc_norm\": 0.25251396648044694,\n\ \ \"acc_norm_stderr\": 0.014530330201468645\n },\n \"harness|hendrycksTest-nutrition|5\"\ : {\n \"acc\": 0.2647058823529412,\n 
\"acc_stderr\": 0.025261691219729487,\n\ \ \"acc_norm\": 0.2647058823529412,\n \"acc_norm_stderr\": 0.025261691219729487\n\ \ },\n \"harness|hendrycksTest-philosophy|5\": {\n \"acc\": 0.2508038585209003,\n\ \ \"acc_stderr\": 0.024619771956697165,\n \"acc_norm\": 0.2508038585209003,\n\ \ \"acc_norm_stderr\": 0.024619771956697165\n },\n \"harness|hendrycksTest-prehistory|5\"\ : {\n \"acc\": 0.22530864197530864,\n \"acc_stderr\": 0.02324620264781975,\n\ \ \"acc_norm\": 0.22530864197530864,\n \"acc_norm_stderr\": 0.02324620264781975\n\ \ },\n \"harness|hendrycksTest-professional_accounting|5\": {\n \"\ acc\": 0.2765957446808511,\n \"acc_stderr\": 0.026684564340461004,\n \ \ \"acc_norm\": 0.2765957446808511,\n \"acc_norm_stderr\": 0.026684564340461004\n\ \ },\n \"harness|hendrycksTest-professional_law|5\": {\n \"acc\": 0.25358539765319427,\n\ \ \"acc_stderr\": 0.011111715336101136,\n \"acc_norm\": 0.25358539765319427,\n\ \ \"acc_norm_stderr\": 0.011111715336101136\n },\n \"harness|hendrycksTest-professional_medicine|5\"\ : {\n \"acc\": 0.18382352941176472,\n \"acc_stderr\": 0.02352924218519311,\n\ \ \"acc_norm\": 0.18382352941176472,\n \"acc_norm_stderr\": 0.02352924218519311\n\ \ },\n \"harness|hendrycksTest-professional_psychology|5\": {\n \"\ acc\": 0.24673202614379086,\n \"acc_stderr\": 0.017440820367402493,\n \ \ \"acc_norm\": 0.24673202614379086,\n \"acc_norm_stderr\": 0.017440820367402493\n\ \ },\n \"harness|hendrycksTest-public_relations|5\": {\n \"acc\": 0.19090909090909092,\n\ \ \"acc_stderr\": 0.03764425585984927,\n \"acc_norm\": 0.19090909090909092,\n\ \ \"acc_norm_stderr\": 0.03764425585984927\n },\n \"harness|hendrycksTest-security_studies|5\"\ : {\n \"acc\": 0.18775510204081633,\n \"acc_stderr\": 0.02500025603954621,\n\ \ \"acc_norm\": 0.18775510204081633,\n \"acc_norm_stderr\": 0.02500025603954621\n\ \ },\n \"harness|hendrycksTest-sociology|5\": {\n \"acc\": 0.23383084577114427,\n\ \ \"acc_stderr\": 0.029929415408348384,\n \"acc_norm\": 
0.23383084577114427,\n\ \ \"acc_norm_stderr\": 0.029929415408348384\n },\n \"harness|hendrycksTest-us_foreign_policy|5\"\ : {\n \"acc\": 0.32,\n \"acc_stderr\": 0.046882617226215034,\n \ \ \"acc_norm\": 0.32,\n \"acc_norm_stderr\": 0.046882617226215034\n \ \ },\n \"harness|hendrycksTest-virology|5\": {\n \"acc\": 0.25301204819277107,\n\ \ \"acc_stderr\": 0.03384429155233134,\n \"acc_norm\": 0.25301204819277107,\n\ \ \"acc_norm_stderr\": 0.03384429155233134\n },\n \"harness|hendrycksTest-world_religions|5\"\ : {\n \"acc\": 0.26900584795321636,\n \"acc_stderr\": 0.0340105262010409,\n\ \ \"acc_norm\": 0.26900584795321636,\n \"acc_norm_stderr\": 0.0340105262010409\n\ \ },\n \"harness|truthfulqa:mc|0\": {\n \"mc1\": 1.0,\n \"mc1_stderr\"\ : 0.0,\n \"mc2\": NaN,\n \"mc2_stderr\": NaN\n }\n}\n```" repo_url: https://huggingface.co/AIDC-ai-business/Marcoroni-70B leaderboard_url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard point_of_contact: clementine@hf.co configs: - config_name: harness_arc_challenge_25 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|arc:challenge|25_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|arc:challenge|25_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|arc:challenge|25_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|arc:challenge|25_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|arc:challenge|25_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hellaswag_10 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hellaswag|10_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hellaswag|10_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hellaswag|10_2023-09-19T01-46-19.012527.parquet' - 
split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hellaswag|10_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hellaswag|10_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-09-14T06-34-33.473104.parquet' - 
'**/details_harness|hendrycksTest-high_school_biology|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-14T06-34-33.473104.parquet' - 
'**/details_harness|hendrycksTest-management|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-09-14T06-34-33.473104.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-09-14T19-48-28.878729.parquet' - 
'**/details_harness|hendrycksTest-business_ethics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-14T19-48-28.878729.parquet' - 
'**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-management|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-09-14T19-48-28.878729.parquet' - 
'**/details_harness|hendrycksTest-philosophy|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-09-14T19-48-28.878729.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-19T01-46-19.012527.parquet' - 
'**/details_harness|hendrycksTest-college_medicine|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-19T01-46-19.012527.parquet' - 
'**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-management|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-19T01-46-19.012527.parquet' - 
'**/details_harness|hendrycksTest-public_relations|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-09-19T01-46-19.012527.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-19T02-16-50.789886.parquet' - 
'**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-09-19T02-16-50.789886.parquet' - 
'**/details_harness|hendrycksTest-jurisprudence|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-management|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-anatomy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-astronomy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_biology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-college_physics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-computer_security|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-econometrics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-global_facts|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-19T02-16-50.789886.parquet' - 
'**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-human_aging|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-international_law|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-management|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-marketing|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-19T02-16-50.789886.parquet' - 
'**/details_harness|hendrycksTest-moral_disputes|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-nutrition|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-philosophy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-prehistory|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_law|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-public_relations|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-security_studies|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-sociology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-virology|5_2023-09-19T02-16-50.789886.parquet' - '**/details_harness|hendrycksTest-world_religions|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_abstract_algebra_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-abstract_algebra|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_anatomy_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-anatomy|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-anatomy|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_astronomy_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-astronomy|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-astronomy|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_business_ethics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - 
'**/details_harness|hendrycksTest-business_ethics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-business_ethics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_clinical_knowledge_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-clinical_knowledge|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_college_biology_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-college_biology|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_biology|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_college_chemistry_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-14T19-48-28.878729.parquet' - split: 
2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_chemistry|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_college_computer_science_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_computer_science|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_college_mathematics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_mathematics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_college_medicine_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - 
'**/details_harness|hendrycksTest-college_medicine|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_medicine|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_college_physics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-college_physics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-college_physics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_computer_security_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-computer_security|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-computer_security|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_conceptual_physics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-conceptual_physics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_econometrics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-econometrics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-econometrics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_electrical_engineering_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-19T01-46-19.012527.parquet' - 
split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-electrical_engineering|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_elementary_mathematics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-elementary_mathematics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_formal_logic_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-formal_logic|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_global_facts_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - 
'**/details_harness|hendrycksTest-global_facts|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-global_facts|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-global_facts|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_biology_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_biology|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_chemistry_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_chemistry|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_computer_science_5 
data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_computer_science|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_european_history_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_european_history|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_geography_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-19T01-46-19.012527.parquet' - split: 
2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_geography|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_government_and_politics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_government_and_politics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_macroeconomics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_macroeconomics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_mathematics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - 
'**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_mathematics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_microeconomics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_microeconomics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_physics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - 
'**/details_harness|hendrycksTest-high_school_physics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_physics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_psychology_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_psychology|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_statistics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_statistics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_us_history_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - 
'**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_us_history|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_high_school_world_history_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-high_school_world_history|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_human_aging_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-human_aging|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-human_aging|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_human_sexuality_5 
data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-human_sexuality|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_international_law_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-international_law|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-international_law|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-international_law|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-international_law|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-international_law|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_jurisprudence_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-jurisprudence|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-jurisprudence|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_logical_fallacies_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-logical_fallacies|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_machine_learning_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-machine_learning|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_management_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-management|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-management|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-management|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 
path: - '**/details_harness|hendrycksTest-management|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-management|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_marketing_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-marketing|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-marketing|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-marketing|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-marketing|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-marketing|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_medical_genetics_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-medical_genetics|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_miscellaneous_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - 
'**/details_harness|hendrycksTest-miscellaneous|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-miscellaneous|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_moral_disputes_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-moral_disputes|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_moral_scenarios_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-moral_scenarios|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_nutrition_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - 
'**/details_harness|hendrycksTest-nutrition|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-nutrition|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-nutrition|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_philosophy_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-philosophy|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-philosophy|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_prehistory_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-prehistory|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-prehistory|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_professional_accounting_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - 
'**/details_harness|hendrycksTest-professional_accounting|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_accounting|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_professional_law_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-professional_law|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_law|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_professional_medicine_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-professional_medicine|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - 
'**/details_harness|hendrycksTest-professional_medicine|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_professional_psychology_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-professional_psychology|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_public_relations_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-public_relations|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-public_relations|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_security_studies_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-security_studies|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-security_studies|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - 
'**/details_harness|hendrycksTest-security_studies|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-security_studies|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-security_studies|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_sociology_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-sociology|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-sociology|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-sociology|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-sociology|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-sociology|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_us_foreign_policy_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-us_foreign_policy|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_virology_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-virology|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - 
'**/details_harness|hendrycksTest-virology|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-virology|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-virology|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-virology|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_hendrycksTest_world_religions_5 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|hendrycksTest-world_religions|5_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|hendrycksTest-world_religions|5_2023-09-19T02-16-50.789886.parquet' - config_name: harness_truthfulqa_mc_0 data_files: - split: 2023_09_14T06_34_33.473104 path: - '**/details_harness|truthfulqa:mc|0_2023-09-14T06-34-33.473104.parquet' - split: 2023_09_14T19_48_28.878729 path: - '**/details_harness|truthfulqa:mc|0_2023-09-14T19-48-28.878729.parquet' - split: 2023_09_19T01_46_19.012527 path: - '**/details_harness|truthfulqa:mc|0_2023-09-19T01-46-19.012527.parquet' - split: 2023_09_19T02_16_50.789886 path: - '**/details_harness|truthfulqa:mc|0_2023-09-19T02-16-50.789886.parquet' - split: latest path: - '**/details_harness|truthfulqa:mc|0_2023-09-19T02-16-50.789886.parquet' - config_name: results data_files: - split: 2023_09_14T06_34_33.473104 path: - results_2023-09-14T06-34-33.473104.parquet - split: 2023_09_14T19_48_28.878729 path: - results_2023-09-14T19-48-28.878729.parquet - split: 2023_09_19T01_46_19.012527 
path: - results_2023-09-19T01-46-19.012527.parquet - split: 2023_09_19T02_16_50.789886 path: - results_2023-09-19T02-16-50.789886.parquet - split: latest path: - results_2023-09-19T02-16-50.789886.parquet ---

# Dataset Card for Evaluation run of AIDC-ai-business/Marcoroni-70B

## Dataset Description

- **Homepage:**
- **Repository:** https://huggingface.co/AIDC-ai-business/Marcoroni-70B
- **Paper:**
- **Leaderboard:** https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
- **Point of Contact:** clementine@hf.co

### Dataset Summary

Dataset automatically created during the evaluation run of model [AIDC-ai-business/Marcoroni-70B](https://huggingface.co/AIDC-ai-business/Marcoroni-70B) on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

The dataset is composed of 61 configurations, each one corresponding to one of the evaluated tasks.

The dataset has been created from 4 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "latest" split always points to the most recent results.

An additional configuration "results" stores all the aggregated results of the run (and is used to compute and display the aggregated metrics on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)).

To load the details from a run, you can for instance do the following:

```python
from datasets import load_dataset

# Load the per-sample details of the most recent run for one task.
data = load_dataset(
    "open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B",
    "harness_truthfulqa_mc_0",
    split="latest",
)
```

## Latest results

These are the [latest results from run 2023-09-19T02:16:50.789886](https://huggingface.co/datasets/open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B/blob/main/results_2023-09-19T02-16-50.789886.json) (note that there might be results for other tasks in the repos if successive evals didn't cover the same tasks.
You find each in the results and the "latest" split for each eval): ```python { "all": { "acc": 0.23992448312110085, "acc_stderr": 0.031078389352549952, "acc_norm": 0.24054395860756556, "acc_norm_stderr": 0.03108725267744147, "mc1": 1.0, "mc1_stderr": 0.0, "mc2": NaN, "mc2_stderr": NaN }, "harness|arc:challenge|25": { "acc": 0.24744027303754265, "acc_stderr": 0.012610352663292673, "acc_norm": 0.2790102389078498, "acc_norm_stderr": 0.013106784883601346 }, "harness|hellaswag|10": { "acc": 0.2621987651862179, "acc_stderr": 0.004389312748012152, "acc_norm": 0.2671778530173272, "acc_norm_stderr": 0.004415816696303084 }, "harness|hendrycksTest-abstract_algebra|5": { "acc": 0.19, "acc_stderr": 0.03942772444036624, "acc_norm": 0.19, "acc_norm_stderr": 0.03942772444036624 }, "harness|hendrycksTest-anatomy|5": { "acc": 0.18518518518518517, "acc_stderr": 0.0335567721631314, "acc_norm": 0.18518518518518517, "acc_norm_stderr": 0.0335567721631314 }, "harness|hendrycksTest-astronomy|5": { "acc": 0.21710526315789475, "acc_stderr": 0.033550453048829226, "acc_norm": 0.21710526315789475, "acc_norm_stderr": 0.033550453048829226 }, "harness|hendrycksTest-business_ethics|5": { "acc": 0.29, "acc_stderr": 0.04560480215720684, "acc_norm": 0.29, "acc_norm_stderr": 0.04560480215720684 }, "harness|hendrycksTest-clinical_knowledge|5": { "acc": 0.20754716981132076, "acc_stderr": 0.02495991802891127, "acc_norm": 0.20754716981132076, "acc_norm_stderr": 0.02495991802891127 }, "harness|hendrycksTest-college_biology|5": { "acc": 0.2638888888888889, "acc_stderr": 0.03685651095897532, "acc_norm": 0.2638888888888889, "acc_norm_stderr": 0.03685651095897532 }, "harness|hendrycksTest-college_chemistry|5": { "acc": 0.19, "acc_stderr": 0.039427724440366234, "acc_norm": 0.19, "acc_norm_stderr": 0.039427724440366234 }, "harness|hendrycksTest-college_computer_science|5": { "acc": 0.29, "acc_stderr": 0.045604802157206845, "acc_norm": 0.29, "acc_norm_stderr": 0.045604802157206845 }, 
"harness|hendrycksTest-college_mathematics|5": { "acc": 0.22, "acc_stderr": 0.04163331998932269, "acc_norm": 0.22, "acc_norm_stderr": 0.04163331998932269 }, "harness|hendrycksTest-college_medicine|5": { "acc": 0.18497109826589594, "acc_stderr": 0.029605623981771204, "acc_norm": 0.18497109826589594, "acc_norm_stderr": 0.029605623981771204 }, "harness|hendrycksTest-college_physics|5": { "acc": 0.22549019607843138, "acc_stderr": 0.041583075330832865, "acc_norm": 0.22549019607843138, "acc_norm_stderr": 0.041583075330832865 }, "harness|hendrycksTest-computer_security|5": { "acc": 0.25, "acc_stderr": 0.04351941398892446, "acc_norm": 0.25, "acc_norm_stderr": 0.04351941398892446 }, "harness|hendrycksTest-conceptual_physics|5": { "acc": 0.2723404255319149, "acc_stderr": 0.029101290698386705, "acc_norm": 0.2723404255319149, "acc_norm_stderr": 0.029101290698386705 }, "harness|hendrycksTest-econometrics|5": { "acc": 0.2807017543859649, "acc_stderr": 0.042270544512322, "acc_norm": 0.2807017543859649, "acc_norm_stderr": 0.042270544512322 }, "harness|hendrycksTest-electrical_engineering|5": { "acc": 0.25517241379310346, "acc_stderr": 0.03632984052707842, "acc_norm": 0.25517241379310346, "acc_norm_stderr": 0.03632984052707842 }, "harness|hendrycksTest-elementary_mathematics|5": { "acc": 0.24603174603174602, "acc_stderr": 0.022182037202948365, "acc_norm": 0.24603174603174602, "acc_norm_stderr": 0.022182037202948365 }, "harness|hendrycksTest-formal_logic|5": { "acc": 0.2777777777777778, "acc_stderr": 0.04006168083848876, "acc_norm": 0.2777777777777778, "acc_norm_stderr": 0.04006168083848876 }, "harness|hendrycksTest-global_facts|5": { "acc": 0.21, "acc_stderr": 0.040936018074033256, "acc_norm": 0.21, "acc_norm_stderr": 0.040936018074033256 }, "harness|hendrycksTest-high_school_biology|5": { "acc": 0.2, "acc_stderr": 0.022755204959542932, "acc_norm": 0.2, "acc_norm_stderr": 0.022755204959542932 }, "harness|hendrycksTest-high_school_chemistry|5": { "acc": 0.22167487684729065, 
"acc_stderr": 0.029225575892489607, "acc_norm": 0.22167487684729065, "acc_norm_stderr": 0.029225575892489607 }, "harness|hendrycksTest-high_school_computer_science|5": { "acc": 0.21, "acc_stderr": 0.04093601807403326, "acc_norm": 0.21, "acc_norm_stderr": 0.04093601807403326 }, "harness|hendrycksTest-high_school_european_history|5": { "acc": 0.2545454545454545, "acc_stderr": 0.0340150671524904, "acc_norm": 0.2545454545454545, "acc_norm_stderr": 0.0340150671524904 }, "harness|hendrycksTest-high_school_geography|5": { "acc": 0.20707070707070707, "acc_stderr": 0.02886977846026705, "acc_norm": 0.20707070707070707, "acc_norm_stderr": 0.02886977846026705 }, "harness|hendrycksTest-high_school_government_and_politics|5": { "acc": 0.2849740932642487, "acc_stderr": 0.03257714077709661, "acc_norm": 0.2849740932642487, "acc_norm_stderr": 0.03257714077709661 }, "harness|hendrycksTest-high_school_macroeconomics|5": { "acc": 0.23333333333333334, "acc_stderr": 0.021444547301560486, "acc_norm": 0.23333333333333334, "acc_norm_stderr": 0.021444547301560486 }, "harness|hendrycksTest-high_school_mathematics|5": { "acc": 0.22592592592592592, "acc_stderr": 0.025497532639609542, "acc_norm": 0.22592592592592592, "acc_norm_stderr": 0.025497532639609542 }, "harness|hendrycksTest-high_school_microeconomics|5": { "acc": 0.19747899159663865, "acc_stderr": 0.025859164122051467, "acc_norm": 0.19747899159663865, "acc_norm_stderr": 0.025859164122051467 }, "harness|hendrycksTest-high_school_physics|5": { "acc": 0.23841059602649006, "acc_stderr": 0.0347918557259966, "acc_norm": 0.23841059602649006, "acc_norm_stderr": 0.0347918557259966 }, "harness|hendrycksTest-high_school_psychology|5": { "acc": 0.21651376146788992, "acc_stderr": 0.017658710594443145, "acc_norm": 0.21651376146788992, "acc_norm_stderr": 0.017658710594443145 }, "harness|hendrycksTest-high_school_statistics|5": { "acc": 0.1712962962962963, "acc_stderr": 0.025695341643824685, "acc_norm": 0.1712962962962963, "acc_norm_stderr": 
0.025695341643824685 }, "harness|hendrycksTest-high_school_us_history|5": { "acc": 0.25, "acc_stderr": 0.03039153369274154, "acc_norm": 0.25, "acc_norm_stderr": 0.03039153369274154 }, "harness|hendrycksTest-high_school_world_history|5": { "acc": 0.25738396624472576, "acc_stderr": 0.028458820991460302, "acc_norm": 0.25738396624472576, "acc_norm_stderr": 0.028458820991460302 }, "harness|hendrycksTest-human_aging|5": { "acc": 0.2914798206278027, "acc_stderr": 0.030500283176545902, "acc_norm": 0.2914798206278027, "acc_norm_stderr": 0.030500283176545902 }, "harness|hendrycksTest-human_sexuality|5": { "acc": 0.24427480916030533, "acc_stderr": 0.037683359597287434, "acc_norm": 0.24427480916030533, "acc_norm_stderr": 0.037683359597287434 }, "harness|hendrycksTest-international_law|5": { "acc": 0.2809917355371901, "acc_stderr": 0.04103203830514511, "acc_norm": 0.2809917355371901, "acc_norm_stderr": 0.04103203830514511 }, "harness|hendrycksTest-jurisprudence|5": { "acc": 0.25925925925925924, "acc_stderr": 0.042365112580946336, "acc_norm": 0.25925925925925924, "acc_norm_stderr": 0.042365112580946336 }, "harness|hendrycksTest-logical_fallacies|5": { "acc": 0.22699386503067484, "acc_stderr": 0.032910995786157686, "acc_norm": 0.22699386503067484, "acc_norm_stderr": 0.032910995786157686 }, "harness|hendrycksTest-machine_learning|5": { "acc": 0.2857142857142857, "acc_stderr": 0.04287858751340456, "acc_norm": 0.2857142857142857, "acc_norm_stderr": 0.04287858751340456 }, "harness|hendrycksTest-management|5": { "acc": 0.1941747572815534, "acc_stderr": 0.03916667762822584, "acc_norm": 0.1941747572815534, "acc_norm_stderr": 0.03916667762822584 }, "harness|hendrycksTest-marketing|5": { "acc": 0.25213675213675213, "acc_stderr": 0.02844796547623101, "acc_norm": 0.25213675213675213, "acc_norm_stderr": 0.02844796547623101 }, "harness|hendrycksTest-medical_genetics|5": { "acc": 0.27, "acc_stderr": 0.0446196043338474, "acc_norm": 0.27, "acc_norm_stderr": 0.0446196043338474 }, 
"harness|hendrycksTest-miscellaneous|5": { "acc": 0.26181353767560667, "acc_stderr": 0.015720838678445266, "acc_norm": 0.26181353767560667, "acc_norm_stderr": 0.015720838678445266 }, "harness|hendrycksTest-moral_disputes|5": { "acc": 0.24566473988439305, "acc_stderr": 0.02317629820399201, "acc_norm": 0.24566473988439305, "acc_norm_stderr": 0.02317629820399201 }, "harness|hendrycksTest-moral_scenarios|5": { "acc": 0.25251396648044694, "acc_stderr": 0.014530330201468645, "acc_norm": 0.25251396648044694, "acc_norm_stderr": 0.014530330201468645 }, "harness|hendrycksTest-nutrition|5": { "acc": 0.2647058823529412, "acc_stderr": 0.025261691219729487, "acc_norm": 0.2647058823529412, "acc_norm_stderr": 0.025261691219729487 }, "harness|hendrycksTest-philosophy|5": { "acc": 0.2508038585209003, "acc_stderr": 0.024619771956697165, "acc_norm": 0.2508038585209003, "acc_norm_stderr": 0.024619771956697165 }, "harness|hendrycksTest-prehistory|5": { "acc": 0.22530864197530864, "acc_stderr": 0.02324620264781975, "acc_norm": 0.22530864197530864, "acc_norm_stderr": 0.02324620264781975 }, "harness|hendrycksTest-professional_accounting|5": { "acc": 0.2765957446808511, "acc_stderr": 0.026684564340461004, "acc_norm": 0.2765957446808511, "acc_norm_stderr": 0.026684564340461004 }, "harness|hendrycksTest-professional_law|5": { "acc": 0.25358539765319427, "acc_stderr": 0.011111715336101136, "acc_norm": 0.25358539765319427, "acc_norm_stderr": 0.011111715336101136 }, "harness|hendrycksTest-professional_medicine|5": { "acc": 0.18382352941176472, "acc_stderr": 0.02352924218519311, "acc_norm": 0.18382352941176472, "acc_norm_stderr": 0.02352924218519311 }, "harness|hendrycksTest-professional_psychology|5": { "acc": 0.24673202614379086, "acc_stderr": 0.017440820367402493, "acc_norm": 0.24673202614379086, "acc_norm_stderr": 0.017440820367402493 }, "harness|hendrycksTest-public_relations|5": { "acc": 0.19090909090909092, "acc_stderr": 0.03764425585984927, "acc_norm": 0.19090909090909092, 
"acc_norm_stderr": 0.03764425585984927 }, "harness|hendrycksTest-security_studies|5": { "acc": 0.18775510204081633, "acc_stderr": 0.02500025603954621, "acc_norm": 0.18775510204081633, "acc_norm_stderr": 0.02500025603954621 }, "harness|hendrycksTest-sociology|5": { "acc": 0.23383084577114427, "acc_stderr": 0.029929415408348384, "acc_norm": 0.23383084577114427, "acc_norm_stderr": 0.029929415408348384 }, "harness|hendrycksTest-us_foreign_policy|5": { "acc": 0.32, "acc_stderr": 0.046882617226215034, "acc_norm": 0.32, "acc_norm_stderr": 0.046882617226215034 }, "harness|hendrycksTest-virology|5": { "acc": 0.25301204819277107, "acc_stderr": 0.03384429155233134, "acc_norm": 0.25301204819277107, "acc_norm_stderr": 0.03384429155233134 }, "harness|hendrycksTest-world_religions|5": { "acc": 0.26900584795321636, "acc_stderr": 0.0340105262010409, "acc_norm": 0.26900584795321636, "acc_norm_stderr": 0.0340105262010409 }, "harness|truthfulqa:mc|0": { "mc1": 1.0, "mc1_stderr": 0.0, "mc2": NaN, "mc2_stderr": NaN } }
```

### Supported Tasks and Leaderboards

[More Information Needed]

### Languages

[More Information Needed]

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

[More Information Needed]

### Data Splits

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[More Information Needed]

### Citation Information

[More Information Needed]

### Contributions

[More Information Needed]
Provided by:
open-llm-leaderboard
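Summary of Original Information

Dataset Overview

Dataset Introduction

This dataset was created automatically during the evaluation run of the model AIDC-ai-business/Marcoroni-70B for the Open LLM Leaderboard.

Dataset Structure

  • Number of configurations: 61, each corresponding to one evaluated task.
  • Number of runs: the dataset was created from 4 runs. Each run appears as a separate split in every configuration, and the split is named after the run's timestamp.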
  • Latest results: the "latest" split (the name used in the configuration listing) always points to the most recent results.
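  • Aggregated results: an additional configuration, "results", stores the aggregated results of all runs and is used to compute and display the aggregated metrics on the Open LLM Leaderboard.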
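
The naming scheme behind these configurations and splits can be sketched in a few lines of Python. This is a minimal sketch inferred from the configuration names, split names, and file names listed on this page, not the leaderboard's own code; the helper names `config_name`, `split_name`, and `results_file` are hypothetical.

```python
import re


def config_name(task_key: str) -> str:
    """Map a results key such as 'harness|truthfulqa:mc|0' to a config name.

    Assumption: every character that is not a letter, digit, or underscore
    is replaced by an underscore, which matches the names listed on this page.
    """
    return re.sub(r"[^0-9A-Za-z_]", "_", task_key)


def split_name(run_timestamp: str) -> str:
    """Map a run timestamp to the split name used for that run."""
    # '-' and ':' become '_'; the 'T' and the fractional seconds are kept.
    return run_timestamp.replace("-", "_").replace(":", "_")


def results_file(run_timestamp: str) -> str:
    """Map a run timestamp to the aggregated-results parquet file name."""
    # File names keep the dashes of the date and use '-' instead of ':'.
    return "results_" + run_timestamp.replace(":", "-") + ".parquet"


if __name__ == "__main__":
    run = "2023-09-19T02:16:50.789886"
    print(config_name("harness|hendrycksTest-high_school_physics|5"))
    print(split_name(run))
    print(results_file(run))
```

Deriving the names this way makes it possible to go from a key in the results listing to the matching configuration and split without searching the full configuration list by hand.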

Data Loading Example

    from datasets import load_dataset
    data = load_dataset("open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B", "harness_truthfulqa_mc_0", split="latest")

Latest results

The latest results come from run 2023-09-19T02:16:50.789886. The aggregate ("all") metrics are:

    "all": { "acc": 0.23992448312110085, "acc_stderr": 0.031078389352549952, "acc_norm": 0.24054395860756556, "acc_norm_stderr": 0.03108725267744147, "mc1": 1.0, "mc1_stderr": 0.0, "mc2": NaN, "mc2_stderr": NaN }

The per-task values repeat the results block in the dataset card above.
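
Results shaped like the block in the dataset card can be post-processed with the Python standard library. The sketch below is an illustration, not the leaderboard's own aggregation code: it parses five per-task entries copied from the card (stderr fields omitted for brevity) plus the TruthfulQA entry, then averages one metric over tasks sharing a name prefix while skipping NaN values. The helper name `mean_metric` is hypothetical.

```python
import json
import math

# Five per-task entries copied from the results in the dataset card
# (stderr fields omitted), plus the TruthfulQA entry, which contains NaN.
raw = """
{
  "harness|hendrycksTest-security_studies|5": {"acc": 0.18775510204081633},
  "harness|hendrycksTest-sociology|5": {"acc": 0.23383084577114427},
  "harness|hendrycksTest-us_foreign_policy|5": {"acc": 0.32},
  "harness|hendrycksTest-virology|5": {"acc": 0.25301204819277107},
  "harness|hendrycksTest-world_religions|5": {"acc": 0.26900584795321636},
  "harness|truthfulqa:mc|0": {"mc1": 1.0, "mc2": NaN}
}
"""

# Python's json module accepts the bare NaN token by default.
results = json.loads(raw)


def mean_metric(results, prefix, metric):
    """Average `metric` over tasks whose name starts with `prefix`, ignoring NaN."""
    values = [
        scores[metric]
        for task, scores in results.items()
        if task.startswith(prefix)
        and metric in scores
        and not math.isnan(scores[metric])
    ]
    # With no usable values, report NaN instead of dividing by zero.
    return sum(values) / len(values) if values else float("nan")


hendrycks_acc = mean_metric(results, "harness|hendrycksTest-", "acc")
truthfulqa_mc2 = mean_metric(results, "harness|truthfulqa", "mc2")
print(f"mean acc over the five hendrycksTest tasks: {hendrycks_acc:.4f}")
print(f"mean mc2 over TruthfulQA: {truthfulqa_mc2}")
```

Filtering NaN before averaging matters here because the reported `mc2` value is NaN, and a single NaN would otherwise propagate into any mean that includes it.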
