
oobabooga/preset-arena

Hugging Face, updated 2023-06-23, indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/oobabooga/preset-arena
Resource description:
---
license: cc-by-4.0
---

# Preset Arena dataset

## Description

* **dataset.json**: contains pairs of completions generated with different presets for the same prompts. The chat prompts were constructed based on [SODA](https://huggingface.co/datasets/allenai/soda), whereas the instruct prompts were extracted from [WizardLM_evol_instruct_70k](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_70k).
* **votes.json**: the votes given by users. Each vote contains two fields: the row number, and either "left" or "right". For instance, ["instruct", 2982, "left"] corresponds to data["instruct"][2982], where the user chose left (preset1). The alternative would be right, corresponding to preset2. The indexing starts at 0 (like Python).
* **presets.zip**: the preset definitions. They are applied on top of the default below.
* **elo-score-ranking.csv**: an elo score ranking generated from the data.

## Top voters

1) Phosay: 186 votes
2) mindrage: 170 votes
3) xytarez: 153 votes
4) jllllll: 146 votes
5) acrastt: 131 votes
6) Nancy: 112 votes
7) oobabooga: 97 votes
8) jackork: 78 votes
9) Moootpoint: 77 votes
10) Aohai: 62 votes
11) samfundev: 53 votes
12) Frank Liu: 52 votes
13) marianbasti: 42 votes
14) altoiddealer: 41 votes
15) NoProtocol: 40 votes
16) hyunahri: 37 votes
17) alto: 35 votes
18) Kane Hudson: 35 votes
19) satothedude: 30 votes
20) hu: 30 votes

Honorary mentions: Alear, Vadimluck, Cereal Velocity, Rimants Sakins, Tostino, Soup, Nix, Calem, YearZero, Drilldo, The_AI_Fan, Lylepaul78, Cypherfox, jcru, meditans, Thunder tree, Miller, MAIdragora, test, Mystifistisk, KOTOB, DerKruste, Rylan Taylor, eunone, Matilde Ametrine, ooodi, axutio, Pyrater, DR, ALEX, volrath50, imakesound, byttle, Ragora, Phillip Lin, BlackDragonBE, underlines, ragnaruss, psychoworsed, jbluew, eiery, WolframRavenwolf, Seri, Seppl, Minh, Joe Biden (Real), Hero, thelustriva, laobao, beno, TheVolkLisa, ElectronSpiderwort, Chromix, Cebtenzzre, cherubble, The Prism, SunCardinal, Root, Ratieu, Fuingo, Fire, Dolyfin, jzinno, gourdo, giesse, WalterMcMelon, Durnehviir, David_337, Dacxel, Charles Goddard, zhou biden, semilucidtrip, ratana, lounger., jarnMod, cack, Yuuru, YSM, Squirrelly, Rockferd, Phil, Pathos, Nick292929, Michael Fraser, Lucifer, Jason Earnest Coker, 1980Dragon, wecardo, universewithtin, kusoge, grummxvx, codynhanpham, abrisene, Tuna, PretzelVector, zyugyzarc, smythreens, o, ninecats, mystic_wiz, morphles, ilu, elperson, cyanf, c0sogi, Winter, Whoever, PlatinaCoder, Manuel Materazzo, HayDoru, Graham Reed, FlyingBanana391, Dark, rerri, rat, jojo, heZamelliac, haha, bunny, belladore.ai, andy, WadRex, Vokturz, Tivi, Tehehe, Streak, Rikikav, Panchovix, MissHentai, Latent, Incomple_, Biogoly, BalTac, Axodus, Andvig, xcoolcoinx, shinkarom, sectix, nikronic, ioujn, hong, gf, cl, bumda, alain40, Xad, Wolokin, Stefan, Romonoss, PresetWin!, Pawit, Nightcall, Muba, Matheus, Mash, Koray, Gerald, Finx, Draco25240, Bart, smashmaster0045, sfdf, pvm, nanowell, hi, eloitor, camronbergh, XD, Vfrap, Timmy, Som, Rain, Mior, Krisu, Hhm, Gabrieldelyon, Fellowship, Daniq, CyberTimon, Brian, ApparentlyAnAI, A, 11

## Default parameters

```python
generate_params = {
    'do_sample': True,
    'temperature': 1,
    'top_p': 1,
    'typical_p': 1,
    'epsilon_cutoff': 0,
    'eta_cutoff': 0,
    'tfs': 1,
    'top_a': 0,
    'repetition_penalty': 1,
    'encoder_repetition_penalty': 1,
    'top_k': 0,
    'num_beams': 1,
    'penalty_alpha': 0,
    'min_length': 0,
    'length_penalty': 1,
    'no_repeat_ngram_size': 0,
    'early_stopping': False,
    'mirostat_mode': 0,
    'mirostat_tau': 5.0,
    'mirostat_eta': 0.1,
}
```

## Models

These models were used for the completions:

* Instruct prompts: Vicuna 13b v1.1 (GPTQ, 4-bit, 128g).
* Chat prompts: LLaMA 13b (GPTQ, 4-bit, 128g).
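
The vote format in the description (`["instruct", 2982, "left"]` pointing at `data["instruct"][2982]`) can be resolved back to a winning and losing preset roughly as follows. This is only a sketch: the README does not document the fields inside each row of dataset.json, so the `preset1` and `preset2` keys, the preset names, and the `resolve_vote` helper are all hypothetical, and a tiny in-memory stand-in replaces the real file.

```python
# Toy stand-in for dataset.json. The per-row keys "preset1"/"preset2" and the
# preset names are hypothetical; the real row layout is not documented here.
data = {
    "instruct": [
        {"preset1": "Preset-A", "preset2": "Preset-B"},
        {"preset1": "Preset-C", "preset2": "Preset-A"},
    ]
}


def resolve_vote(data, vote):
    """Return (winning_preset, losing_preset) for a single vote."""
    category, row, side = vote  # shaped like ["instruct", 2982, "left"]
    if side not in ("left", "right"):
        raise ValueError(f"unexpected side: {side!r}")
    entry = data[category][row]  # row numbers are 0-indexed
    if side == "left":  # left corresponds to preset1
        return entry["preset1"], entry["preset2"]
    return entry["preset2"], entry["preset1"]  # right corresponds to preset2


votes = [["instruct", 0, "left"], ["instruct", 1, "right"]]
results = [resolve_vote(data, v) for v in votes]
```

With the real files, `data` and `votes` would presumably come from `json.load` on dataset.json and votes.json, with the key names adjusted to whatever the rows actually contain.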
Provider:
oobabooga
Summary of original information

Preset Arena dataset

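Dataset structure

  • dataset.json: pairs of completions generated with different presets for the same prompts, covering chat prompts built on SODA and instruct prompts extracted from WizardLM_evol_instruct_70k.
  • votes.json: the user votes. Each vote contains the row number and either "left" or "right". For example, ["instruct", 2982, "left"] corresponds to data["instruct"][2982], where the user chose left (preset1).
  • presets.zip: the preset definitions, applied on top of the default parameters.
  • elo-score-ranking.csv: an Elo score ranking generated from the data.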
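
To illustrate how pairwise votes can be turned into an Elo-style ranking, here is a minimal sketch of the standard Elo update. The page does not say how elo-score-ranking.csv was actually computed, so the K-factor of 32, the starting rating of 1000, the preset names, and the helper functions below are all assumptions chosen for illustration.

```python
def expected_score(r_a, r_b):
    """Probability that a player rated r_a beats one rated r_b (standard Elo)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0, initial=1000.0):
    """Apply one vote in place; k and initial are assumed values."""
    r_w = ratings.get(winner, initial)
    r_l = ratings.get(loser, initial)
    gain = k * (1.0 - expected_score(r_w, r_l))
    ratings[winner] = r_w + gain  # winner gains what the loser gives up
    ratings[loser] = r_l - gain
    return ratings


def rank(ratings):
    """Return (preset, rating) pairs sorted from highest to lowest rating."""
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical (winner, loser) outcomes, e.g. derived from votes.json.
outcomes = [("preset_a", "preset_b"), ("preset_a", "preset_c"), ("preset_c", "preset_b")]
ratings = {}
for winner, loser in outcomes:
    update_elo(ratings, winner, loser)
ranking = rank(ratings)
```

Sequential Elo depends on the order in which votes are processed, so a ranking produced this way may differ from the published CSV even with the same votes.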

Default parameters

generate_params = {
    'do_sample': True,
    'temperature': 1,
    'top_p': 1,
    'typical_p': 1,
    'epsilon_cutoff': 0,
    'eta_cutoff': 0,
    'tfs': 1,
    'top_a': 0,
    'repetition_penalty': 1,
    'encoder_repetition_penalty': 1,
    'top_k': 0,
    'num_beams': 1,
    'penalty_alpha': 0,
    'min_length': 0,
    'length_penalty': 1,
    'no_repeat_ngram_size': 0,
    'early_stopping': False,
    'mirostat_mode': 0,
    'mirostat_tau': 5.0,
    'mirostat_eta': 0.1,
}
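
Since the presets are described as being applied on top of these defaults, one plausible reading is a plain dictionary override, sketched below. The file format inside presets.zip is not described on this page, so the example assumes a preset has already been parsed into a dict; the `apply_preset` helper and the preset values are hypothetical.

```python
# Default parameters as listed above.
generate_params = {
    'do_sample': True, 'temperature': 1, 'top_p': 1, 'typical_p': 1,
    'epsilon_cutoff': 0, 'eta_cutoff': 0, 'tfs': 1, 'top_a': 0,
    'repetition_penalty': 1, 'encoder_repetition_penalty': 1, 'top_k': 0,
    'num_beams': 1, 'penalty_alpha': 0, 'min_length': 0, 'length_penalty': 1,
    'no_repeat_ngram_size': 0, 'early_stopping': False,
    'mirostat_mode': 0, 'mirostat_tau': 5.0, 'mirostat_eta': 0.1,
}


def apply_preset(defaults, preset):
    """Return a new dict in which preset values override the defaults."""
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(preset)
    return merged


# Hypothetical preset, assumed to be already parsed from a file in presets.zip.
example_preset = {'temperature': 0.7, 'top_p': 0.9, 'repetition_penalty': 1.15}
params = apply_preset(generate_params, example_preset)
```

Keys the preset does not mention (for example `top_k`) keep their default values, which is what "applied on top of the default" suggests.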

Models used

  • Instruct prompts: Vicuna 13b v1.1 (GPTQ, 4-bit, 128g).
  • Chat prompts: LLaMA 13b (GPTQ, 4-bit, 128g).