oobabooga/preset-arena
Source: Hugging Face. Updated 2023-06-23; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/oobabooga/preset-arena
Resource overview:
---
license: cc-by-4.0
---
# Preset Arena dataset
## Description
* **dataset.json**: contains pairs of completions generated with different presets for the same prompts. The chat prompts were constructed based on [SODA](https://huggingface.co/datasets/allenai/soda), whereas the instruct prompts were extracted from [WizardLM_evol_instruct_70k](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_70k).
* **votes.json**: the votes given by users. Each vote contains three fields: the prompt type (such as "instruct"), the row number, and either "left" or "right". For instance, `["instruct", 2982, "left"]` corresponds to `data["instruct"][2982]`, where the user chose left (preset1). The alternative would be "right", corresponding to preset2. Row indexing starts at 0, as in Python.
* **presets.zip**: the preset definitions. They are applied on top of the default parameters listed below.
* **elo-score-ranking.csv**: an Elo score ranking generated from the data.
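
To make the vote format concrete, here is a minimal sketch of resolving votes against dataset rows and turning them into Elo-style ratings. It rests on several assumptions that the description above does not confirm: that each row stores the two preset names under the keys `preset1` and `preset2`, that the chat rows live under a `"chat"` key, and that a standard Elo update with `k=32` and a starting rating of 1000 is acceptable. The procedure actually used to produce `elo-score-ranking.csv` is not documented here and may differ. The sample data is made up for illustration.

```python
from collections import defaultdict


def resolve_vote(data, vote):
    """Return (winner, loser) preset names for one vote.

    Assumes each row has "preset1" (left) and "preset2" (right) keys;
    these key names are an assumption, not confirmed by the dataset card.
    """
    category, row_index, side = vote
    row = data[category][row_index]  # row indices are 0-based
    if side == "left":
        return row["preset1"], row["preset2"]
    if side == "right":
        return row["preset2"], row["preset1"]
    raise ValueError(f"unexpected vote side: {side!r}")


def elo_ratings(data, votes, k=32.0, start=1000.0):
    """Apply a standard Elo update per vote (k and start are assumed values)."""
    ratings = defaultdict(lambda: start)
    for vote in votes:
        winner, loser = resolve_vote(data, vote)
        # Probability that the winner was expected to win before this vote.
        expected_win = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        delta = k * (1.0 - expected_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Made-up sample in the assumed layout of dataset.json and votes.json.
sample_data = {
    "instruct": [{"preset1": "preset-a", "preset2": "preset-b"}],
    "chat": [{"preset1": "preset-b", "preset2": "preset-c"}],
}
sample_votes = [["instruct", 0, "left"], ["chat", 0, "right"]]

ratings = elo_ratings(sample_data, sample_votes)
for name, score in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.1f}")
```

With the real files, `sample_data` and `sample_votes` would be replaced by the parsed contents of `dataset.json` and `votes.json` (for example via `json.load`), after checking the actual row keys.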
## Top voters
1) Phosay: 186 votes
2) mindrage: 170 votes
3) xytarez: 153 votes
4) jllllll: 146 votes
5) acrastt: 131 votes
6) Nancy: 112 votes
7) oobabooga: 97 votes
8) jackork: 78 votes
9) Moootpoint: 77 votes
10) Aohai: 62 votes
11) samfundev: 53 votes
12) Frank Liu: 52 votes
13) marianbasti: 42 votes
14) altoiddealer: 41 votes
15) NoProtocol: 40 votes
16) hyunahri: 37 votes
17) alto: 35 votes
18) Kane Hudson: 35 votes
19) satothedude: 30 votes
20) hu: 30 votes
Honorary mentions: Alear, Vadimluck, Cereal Velocity, Rimants Sakins, Tostino, Soup, Nix, Calem, YearZero, Drilldo, The_AI_Fan, Lylepaul78, Cypherfox, jcru, meditans, Thunder tree, Miller, MAIdragora, test, Mystifistisk, KOTOB, DerKruste, Rylan Taylor, eunone, Matilde Ametrine, ooodi, axutio, Pyrater, DR, ALEX, volrath50, imakesound, byttle, Ragora, Phillip Lin, BlackDragonBE, underlines, ragnaruss, psychoworsed, jbluew, eiery, WolframRavenwolf, Seri, Seppl, Minh, Joe Biden (Real), Hero, thelustriva, laobao, beno, TheVolkLisa, ElectronSpiderwort, Chromix, Cebtenzzre, cherubble, The Prism, SunCardinal, Root, Ratieu, Fuingo, Fire, Dolyfin, jzinno, gourdo, giesse, WalterMcMelon, Durnehviir, David_337, Dacxel, Charles Goddard, zhou biden, semilucidtrip, ratana, lounger., jarnMod, cack, Yuuru, YSM, Squirrelly, Rockferd, Phil, Pathos, Nick292929, Michael Fraser, Lucifer, Jason Earnest Coker, 1980Dragon, wecardo, universewithtin, kusoge, grummxvx, codynhanpham, abrisene, Tuna, PretzelVector, zyugyzarc, smythreens, o, ninecats, mystic_wiz, morphles, ilu, elperson, cyanf, c0sogi, Winter, Whoever, PlatinaCoder, Manuel Materazzo, HayDoru, Graham Reed, FlyingBanana391, Dark, rerri, rat, jojo, heZamelliac, haha, bunny, belladore.ai, andy, WadRex, Vokturz, Tivi, Tehehe, Streak, Rikikav, Panchovix, MissHentai, Latent, Incomple_, Biogoly, BalTac, Axodus, Andvig, xcoolcoinx, shinkarom, sectix, nikronic, ioujn, hong, gf, cl, bumda, alain40, Xad, Wolokin, Stefan, Romonoss, PresetWin!, Pawit, Nightcall, Muba, Matheus, Mash, Koray, Gerald, Finx, Draco25240, Bart, smashmaster0045, sfdf, pvm, nanowell , hi, eloitor, camronbergh, XD, Vfrap, Timmy, Som, Rain, Mior, Krisu, Hhm, Gabrieldelyon, Fellowship, Daniq, CyberTimon, Brian, ApparentlyAnAI, A, 11
## Default parameters
```python
generate_params = {
'do_sample': True,
'temperature': 1,
'top_p': 1,
'typical_p': 1,
'epsilon_cutoff': 0,
'eta_cutoff': 0,
'tfs': 1,
'top_a': 0,
'repetition_penalty': 1,
'encoder_repetition_penalty': 1,
'top_k': 0,
'num_beams': 1,
'penalty_alpha': 0,
'min_length': 0,
'length_penalty': 1,
'no_repeat_ngram_size': 0,
'early_stopping': False,
'mirostat_mode': 0,
'mirostat_tau': 5.0,
'mirostat_eta': 0.1,
}
```
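
As a rough illustration of what "applied on top of the default" can mean, the sketch below overlays a preset on these defaults with a plain dictionary merge, so any parameter the preset omits keeps its default value. The helper name `apply_preset` and the values in `example_preset` are hypothetical; they are not taken from `presets.zip`, and the file format of the presets is not assumed here.

```python
# Defaults copied from the block above.
DEFAULT_PARAMS = {
    'do_sample': True,
    'temperature': 1,
    'top_p': 1,
    'typical_p': 1,
    'epsilon_cutoff': 0,
    'eta_cutoff': 0,
    'tfs': 1,
    'top_a': 0,
    'repetition_penalty': 1,
    'encoder_repetition_penalty': 1,
    'top_k': 0,
    'num_beams': 1,
    'penalty_alpha': 0,
    'min_length': 0,
    'length_penalty': 1,
    'no_repeat_ngram_size': 0,
    'early_stopping': False,
    'mirostat_mode': 0,
    'mirostat_tau': 5.0,
    'mirostat_eta': 0.1,
}


def apply_preset(defaults, preset):
    """Return a new dict in which preset values override the defaults."""
    return {**defaults, **preset}


# Hypothetical preset values, for illustration only.
example_preset = {'temperature': 0.7, 'top_p': 0.9, 'top_k': 40}

params = apply_preset(DEFAULT_PARAMS, example_preset)
print(params['temperature'], params['top_p'], params['top_k'], params['typical_p'])
```

The merge returns a new dictionary and leaves `DEFAULT_PARAMS` unchanged, so the same defaults can be reused for every preset.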
## Models
These models were used for the completions:
* Instruct prompts: Vicuna 13b v1.1 (GPTQ, 4-bit, 128g).
* Chat prompts: LLaMA 13b (GPTQ, 4-bit, 128g).
Provider:
oobabooga