OALL/details_failspy__Meta-Llama-3-70B-Instruct-abliterated-v3.5
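Dataset Overview
This dataset was created automatically during the evaluation of the model failspy/Meta-Llama-3-70B-Instruct-abliterated-v3.5. It contains 136 configurations, each corresponding to one evaluation task.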
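Dataset Structure
- The dataset was created from 1 run. Each run appears as a specific split in each configuration, and the split is named after the run's timestamp.
- The "train" split always points to the latest results.
- An additional configuration, "results", stores the aggregated results of all runs.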
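The split layout described above can be sketched as follows. This is a minimal sketch, not the card's own tooling: it assumes that split names encode the run timestamp as `YYYY_MM_DDTHH_MM_SS.ffffff` (the exact naming format is not stated here), and `latest_split` and `load_aggregated_results` are hypothetical helper names introduced for illustration.

```python
from datetime import datetime

REPO_ID = "OALL/details_failspy__Meta-Llama-3-70B-Instruct-abliterated-v3.5"

# Assumption: timestamp-named splits look like "2024_06_07T21_51_38.744497".
SPLIT_TIME_FORMAT = "%Y_%m_%dT%H_%M_%S.%f"


def latest_split(split_names):
    """Return the timestamp-named split with the most recent run time."""
    timestamped = []
    for name in split_names:
        try:
            parsed = datetime.strptime(name, SPLIT_TIME_FORMAT)
        except ValueError:
            continue  # not a timestamp, e.g. the "train" alias
        timestamped.append((parsed, name))
    if not timestamped:
        raise ValueError("no timestamp-named split found")
    return max(timestamped)[1]


def load_aggregated_results():
    """Load the "results" configuration (requires network access)."""
    # Imported lazily so the helper above works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, "results", split="train")
```

Because the "train" split already points to the latest results, `latest_split` is only useful when a specific run has to be named explicitly, for example after further runs have been added.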
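Data Loading Example
```python
from datasets import load_dataset

# Load the details of one evaluation task; "train" points to the latest run.
data = load_dataset(
    "OALL/details_failspy__Meta-Llama-3-70B-Instruct-abliterated-v3.5",
    "lighteval_xstory_cloze_ar_0",
    split="train",
)
```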
Latest Results
These are the latest results, from the run of 2024-06-07 at 21:51:38.744497:
```python
{
  "all": { "acc_norm": 0.48929954889201877, "acc_norm_stderr": 0.03820931199184454, "acc": 0.6618133686300464, "acc_stderr": 0.012174678796437402 },
  "community|acva:Algeria|0": { "acc_norm": 0.5282051282051282, "acc_norm_stderr": 0.035840746749208334 },
  "community|acva:Ancient_Egypt|0": { "acc_norm": 0.11428571428571428, "acc_norm_stderr": 0.01795469260698176 },
  "community|acva:Arab_Empire|0": { "acc_norm": 0.30943396226415093, "acc_norm_stderr": 0.028450154794118627 },
  "community|acva:Arabic_Architecture|0": { "acc_norm": 0.48205128205128206, "acc_norm_stderr": 0.0358747709877383 },
  "community|acva:Arabic_Art|0": { "acc_norm": 0.3384615384615385, "acc_norm_stderr": 0.03397280032734094 },
  "community|acva:Arabic_Astronomy|0": { "acc_norm": 0.47692307692307695, "acc_norm_stderr": 0.0358596530894741 },
  "community|acva:Arabic_Calligraphy|0": { "acc_norm": 0.5254901960784314, "acc_norm_stderr": 0.031331994785831645 },
  "community|acva:Arabic_Ceremony|0": { "acc_norm": 0.5243243243243243, "acc_norm_stderr": 0.0368168445060319 },
  "community|acva:Arabic_Clothing|0": { "acc_norm": 0.5128205128205128, "acc_norm_stderr": 0.03588610523192215 },
  "community|acva:Arabic_Culture|0": { "acc_norm": 0.31794871794871793, "acc_norm_stderr": 0.03343383454355787 },
  "community|acva:Arabic_Food|0": { "acc_norm": 0.441025641025641, "acc_norm_stderr": 0.0356473293185358 },
  "community|acva:Arabic_Funeral|0": { "acc_norm": 0.4, "acc_norm_stderr": 0.050529115263991134 },
  "community|acva:Arabic_Geography|0": { "acc_norm": 0.6068965517241379, "acc_norm_stderr": 0.040703290137070705 },
  "community|acva:Arabic_History|0": { "acc_norm": 0.3333333333333333, "acc_norm_stderr": 0.03384487217112063 },
  "community|acva:Arabic_Language_Origin|0": { "acc_norm": 0.5473684210526316, "acc_norm_stderr": 0.051339113773544845 },
  "community|acva:Arabic_Literature|0": { "acc_norm": 0.4689655172413793, "acc_norm_stderr": 0.04158632762097828 },
  "community|acva:Arabic_Math|0": { "acc_norm": 0.30256410256410254, "acc_norm_stderr": 0.03298070870085618 },
  "community|acva:Arabic_Medicine|0": { "acc_norm": 0.5103448275862069, "acc_norm_stderr": 0.04165774775728763 },
  "community|acva:Arabic_Music|0": { "acc_norm": 0.2805755395683453, "acc_norm_stderr": 0.03824529014900685 },
  "community|acva:Arabic_Ornament|0": { "acc_norm": 0.47692307692307695, "acc_norm_stderr": 0.0358596530894741 },
  "community|acva:Arabic_Philosophy|0": { "acc_norm": 0.5793103448275863, "acc_norm_stderr": 0.0411391498118926 },
  "community|acva:Arabic_Physics_and_Chemistry|0": { "acc_norm": 0.6666666666666666, "acc_norm_stderr": 0.03384487217112064 },
  "community|acva:Arabic_Wedding|0": { "acc_norm": 0.4358974358974359, "acc_norm_stderr": 0.035601666623466345 },
  "community|acva:Bahrain|0": { "acc_norm": 0.3111111111111111, "acc_norm_stderr": 0.06979205927323111 },
  "community|acva:Comoros|0": { "acc_norm": 0.4222222222222222, "acc_norm_stderr": 0.07446027270295805 },
  "community|acva:Egypt_modern|0": { "acc_norm": 0.35789473684210527, "acc_norm_stderr": 0.04944436957628253 },
  "community|acva:InfluenceFromAncientEgypt|0": { "acc_norm": 0.6102564102564103, "acc_norm_stderr": 0.035014247762563705 },
  "community|acva:InfluenceFromByzantium|0": { "acc_norm": 0.7172413793103448, "acc_norm_stderr": 0.03752833958003337 },
  "community|acva:InfluenceFromChina|0": { "acc_norm": 0.26666666666666666, "acc_norm_stderr": 0.0317493043641267 },
  "community|acva:InfluenceFromGreece|0": { "acc_norm": 0.6410256410256411, "acc_norm_stderr": 0.03444042881521377 },
  "community|acva:InfluenceFromIslam|0": { "acc_norm": 0.5103448275862069, "acc_norm_stderr": 0.04165774775728763 },
  "community|acva:InfluenceFromPersia|0": { "acc_norm": 0.7028571428571428, "acc_norm_stderr": 0.03464507889884372 },
  "community|acva:InfluenceFromRome|0": { "acc_norm": 0.5743589743589743, "acc_norm_stderr": 0.03549871080367708 },
  "community|acva:Iraq|0": { "acc_norm": 0.5411764705882353, "acc_norm_stderr": 0.0543691634273002 },
  "community|acva:Islam_Education|0": { "acc_norm": 0.5025641025641026, "acc_norm_stderr": 0.035897435897435895 },
  "community|acva:Islam_branches_and_schools|0": { "acc_norm": 0.4742857142857143, "acc_norm_stderr": 0.037854741690433576 },
  "community|acva:Islamic_law_system|0": { "acc_norm": 0.5948717948717949, "acc_norm_stderr": 0.03524577495610961 },
  "community|acva:Jordan|0": { "acc_norm": 0.4222222222222222, "acc_norm_stderr": 0.07446027270295806 },
  "community|acva:Kuwait|0": { "acc_norm": 0.3333333333333333, "acc_norm_stderr": 0.07106690545187014 },
  "community|acva:Lebanon|0": { "acc_norm": 0.24444444444444444, "acc_norm_stderr": 0.06478835438717001 },
  "community|acva:Libya|0": { "acc_norm": 0.4666666666666667, "acc_norm_stderr": 0.0752101433090355 },
  "community|acva:Mauritania|0": { "acc_norm": 0.4222222222222222, "acc_norm_stderr": 0.07446027270295805 },
  "community|acva:Mesopotamia_civilization|0": { "acc_norm": 0.535483870967742, "acc_norm_stderr": 0.040189558547478466 },
  "community|acva:Morocco|0": { "acc_norm": 0.24444444444444444, "acc_norm_stderr": 0.06478835438717 },
  "community|acva:Oman|0": { "acc_norm": 0.26666666666666666, "acc_norm_stderr": 0.06666666666666665 },
  "community|acva:Palestine|0": { "acc_norm": 0.3176470588235294, "acc_norm_stderr": 0.05079691179733582 },
  "community|acva:Qatar|0": { "acc_norm": 0.4222222222222222, "acc_norm_stderr": 0.07446027270295806 },
  "community|acva:Saudi_Arabia|0": { "acc_norm": 0.35384615384615387, "acc_norm_stderr": 0.03433004254147036 },
  "community|acva:Somalia|0": { "acc_norm": 0.4, "acc_norm_stderr": 0.07385489458759965 },
  "community|acva:Sudan|0": { "acc_norm": 0.422
```

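A results dictionary of this shape can be summarized with a few lines of standard-library code. The sketch below is illustrative only: it uses four entries copied from the results above, `macro_acc_norm` and `weakest_task` are hypothetical helper names, and the unweighted mean it computes is an assumption about a convenient summary, not necessarily how the "all" entry was derived.

```python
from statistics import mean

# Four entries copied from the results above; the full dictionary has the same shape.
results = {
    "all": {"acc_norm": 0.48929954889201877, "acc_norm_stderr": 0.03820931199184454},
    "community|acva:Algeria|0": {"acc_norm": 0.5282051282051282, "acc_norm_stderr": 0.035840746749208334},
    "community|acva:Ancient_Egypt|0": {"acc_norm": 0.11428571428571428, "acc_norm_stderr": 0.01795469260698176},
    "community|acva:Arab_Empire|0": {"acc_norm": 0.30943396226415093, "acc_norm_stderr": 0.028450154794118627},
}


def per_task_scores(results, metric="acc_norm"):
    """Map each task name to its metric value, skipping the "all" aggregate."""
    return {
        task: metrics[metric]
        for task, metrics in results.items()
        if task != "all" and metric in metrics
    }


def macro_acc_norm(results):
    """Unweighted mean of acc_norm over the individual tasks."""
    return mean(per_task_scores(results).values())


def weakest_task(results):
    """Name of the task with the lowest acc_norm."""
    scores = per_task_scores(results)
    return min(scores, key=scores.get)


if __name__ == "__main__":
    print(f"macro acc_norm over sample: {macro_acc_norm(results):.4f}")
    print(f"weakest task in sample: {weakest_task(results)}")
```

Skipping the "all" key keeps the aggregate from being counted as if it were a task of its own.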


