
tytodd/50k-v2

Hugging Face | Updated 2026-04-24 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/tytodd/50k-v2
Resource description:
---
dataset_info:
- config_name: aes2_essay_scoring
  features:
  - {name: full_text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 3141336, num_examples: 1470}
  - {name: val, num_bytes: 311168, num_examples: 145}
  download_size: 1878198
  dataset_size: 3452504
- config_name: arc_challenge
  features:
  - {name: question, dtype: string}
  - {name: choices, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 118975, num_examples: 400}
  download_size: 66689
  dataset_size: 118975
- config_name: argument_quality_ranking
  features:
  - {name: topic, dtype: string}
  - {name: argument, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 81441, num_examples: 400}
  download_size: 29710
  dataset_size: 81441
- config_name: bbh_snarks
  features:
  - {name: question, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 42029, num_examples: 178}
  download_size: 17563
  dataset_size: 42029
- config_name: civil_comments
  features:
  - {name: comment, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 517182, num_examples: 1470}
  - {name: val, num_bytes: 50421, num_examples: 145}
  download_size: 340222
  dataset_size: 567603
- config_name: code_judge_bench
  features:
  - {name: problem, dtype: string}
  - {name: code_A, dtype: string}
  - {name: code_B, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 4342498, num_examples: 400}
  download_size: 1838386
  dataset_size: 4342498
- config_name: colbert_humor_detection
  features:
  - {name: text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 175525, num_examples: 1470}
  - {name: val, num_bytes: 17011, num_examples: 145}
  download_size: 100553
  dataset_size: 192536
- config_name: gpqa_diamond
  features:
  - {name: question, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 130052, num_examples: 198}
  download_size: 72069
  dataset_size: 130052
- config_name: halueval_summarization
  features:
  - {name: document, dtype: string}
  - {name: summary, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 1451195, num_examples: 400}
  download_size: 895936
  dataset_size: 1451195
- config_name: hh_rlhf
  features:
  - {name: question, dtype: string}
  - {name: response_A, dtype: string}
  - {name: response_B, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 1280952, num_examples: 1470}
  - {name: val, num_bytes: 118216, num_examples: 145}
  download_size: 804365
  dataset_size: 1399168
- config_name: judge_bench
  features:
  - {name: question, dtype: string}
  - {name: response_A, dtype: string}
  - {name: response_B, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 1839316, num_examples: 350}
  download_size: 828316
  dataset_size: 1839316
- config_name: lex_glue_case_hold
  features:
  - {name: context, dtype: string}
  - {name: option_a, dtype: string}
  - {name: option_b, dtype: string}
  - {name: option_c, dtype: string}
  - {name: option_d, dtype: string}
  - {name: option_e, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 2491337, num_examples: 1470}
  - {name: val, num_bytes: 249886, num_examples: 145}
  download_size: 1562749
  dataset_size: 2741223
- config_name: lex_glue_scotus
  features:
  - {name: opinion_text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 40202497, num_examples: 1470}
  - {name: val, num_bytes: 8814658, num_examples: 145}
  download_size: 25592805
  dataset_size: 49017155
- config_name: medical_abstracts
  features:
  - {name: medical_abstract, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 1899354, num_examples: 1470}
  - {name: val, num_bytes: 181733, num_examples: 145}
  download_size: 1094389
  dataset_size: 2081087
- config_name: mfrc
  features:
  - {name: text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 396651, num_examples: 1470}
  - {name: val, num_bytes: 39637, num_examples: 145}
  download_size: 116987
  dataset_size: 436288
- config_name: mmlu
  features:
  - {name: question, dtype: string}
  - {name: choices, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 108220, num_examples: 400}
  download_size: 55397
  dataset_size: 108220
- config_name: mmlu_pro
  features:
  - {name: question, dtype: string}
  - {name: choices, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 192519, num_examples: 400}
  download_size: 106386
  dataset_size: 192519
- config_name: musr_murder_mysteries
  features:
  - {name: question, dtype: string}
  - {name: choices, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 1401638, num_examples: 250}
  download_size: 807850
  dataset_size: 1401638
- config_name: musr_object_placements
  features:
  - {name: question, dtype: string}
  - {name: choices, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 1316095, num_examples: 256}
  download_size: 295447
  dataset_size: 1316095
- config_name: musr_team_allocation
  features:
  - {name: question, dtype: string}
  - {name: choices, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 868814, num_examples: 250}
  download_size: 507115
  dataset_size: 868814
- config_name: or_bench_80k
  features:
  - {name: prompt, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 247241, num_examples: 1470}
  - {name: val, num_bytes: 24561, num_examples: 145}
  download_size: 120606
  dataset_size: 271802
- config_name: or_bench_hard_1k
  features:
  - {name: prompt, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 172762, num_examples: 1055}
  - {name: val, num_bytes: 21701, num_examples: 145}
  download_size: 84335
  dataset_size: 194463
- config_name: or_bench_toxic
  features:
  - {name: prompt, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 56855, num_examples: 400}
  download_size: 25691
  dataset_size: 56855
- config_name: projudgebench
  features:
  - {name: question, dtype: string}
  - {name: correct_answer, dtype: string}
  - {name: steps, list: string}
  - {name: step_to_evaluate, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: bool}
  splits:
  - {name: train, num_bytes: 4855444, num_examples: 1470}
  - {name: val, num_bytes: 553505, num_examples: 145}
  download_size: 4381635
  dataset_size: 5408949
- config_name: reward_bench_2
  features:
  - {name: prompt, dtype: string}
  - {name: response_A, dtype: string}
  - {name: response_B, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 4981860, num_examples: 1470}
  - {name: val, num_bytes: 801043, num_examples: 145}
  download_size: 3184741
  dataset_size: 5782903
- config_name: rod101_essay_scoring
  features:
  - {name: text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: ood, num_bytes: 208033, num_examples: 102}
  download_size: 122077
  dataset_size: 208033
- config_name: seekbench
  features:
  - {name: question, dtype: string}
  - {name: previous_traces, dtype: string}
  - {name: current_trace, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 1643070, num_examples: 446}
  - {name: val, num_bytes: 508207, num_examples: 145}
  download_size: 492799
  dataset_size: 2151277
- config_name: sem_eval_2010_task_8
  features:
  - {name: sentence, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 264671, num_examples: 1470}
  - {name: val, num_bytes: 26332, num_examples: 145}
  download_size: 146940
  dataset_size: 291003
- config_name: smollm_corpus
  features:
  - {name: text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 5621908, num_examples: 1470}
  - {name: val, num_bytes: 564523, num_examples: 145}
  download_size: 3736465
  dataset_size: 6186431
- config_name: snli
  features:
  - {name: premise, dtype: string}
  - {name: hypothesis, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 217970, num_examples: 1470}
  - {name: val, num_bytes: 22106, num_examples: 145}
  download_size: 72196
  dataset_size: 240076
- config_name: toxigen_data
  features:
  - {name: text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 198955, num_examples: 1470}
  - {name: val, num_bytes: 19243, num_examples: 145}
  download_size: 110961
  dataset_size: 218198
- config_name: tweet_eval_emotion
  features:
  - {name: tweet, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 204025, num_examples: 1470}
  - {name: val, num_bytes: 19645, num_examples: 145}
  download_size: 127265
  dataset_size: 223670
- config_name: tweet_eval_hate
  features:
  - {name: tweet, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 244111, num_examples: 1470}
  - {name: val, num_bytes: 26982, num_examples: 145}
  download_size: 158562
  dataset_size: 271093
- config_name: tweet_eval_irony
  features:
  - {name: tweet, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 181569, num_examples: 1470}
  - {name: val, num_bytes: 17420, num_examples: 145}
  download_size: 114008
  dataset_size: 198989
- config_name: tweet_eval_offensive
  features:
  - {name: tweet, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 259410, num_examples: 1470}
  - {name: val, num_bytes: 26047, num_examples: 145}
  download_size: 146334
  dataset_size: 285457
- config_name: tweet_eval_sentiment
  features:
  - {name: tweet, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 228285, num_examples: 1470}
  - {name: val, num_bytes: 22462, num_examples: 145}
  download_size: 143674
  dataset_size: 250747
- config_name: tweet_eval_stance_abortion
  features:
  - {name: tweet, dtype: string}
  - {name: topic, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 100174, num_examples: 587}
  - {name: val, num_bytes: 11001, num_examples: 66}
  download_size: 55755
  dataset_size: 111175
- config_name: tweet_eval_stance_atheism
  features:
  - {name: tweet, dtype: string}
  - {name: topic, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 78617, num_examples: 461}
  - {name: val, num_bytes: 8856, num_examples: 52}
  download_size: 46935
  dataset_size: 87473
- config_name: tweet_eval_stance_climate
  features:
  - {name: tweet, dtype: string}
  - {name: topic, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 60604, num_examples: 355}
  - {name: val, num_bytes: 6978, num_examples: 40}
  download_size: 37042
  dataset_size: 67582
- config_name: tweet_eval_stance_feminist
  features:
  - {name: tweet, dtype: string}
  - {name: topic, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 102517, num_examples: 597}
  - {name: val, num_bytes: 11428, num_examples: 67}
  download_size: 57723
  dataset_size: 113945
- config_name: tweet_eval_stance_hillary
  features:
  - {name: tweet, dtype: string}
  - {name: topic, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 106614, num_examples: 620}
  - {name: val, num_bytes: 11446, num_examples: 69}
  download_size: 55511
  dataset_size: 118060
- config_name: ultrafeedback
  features:
  - {name: prompt, dtype: string}
  - {name: response, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 3018971, num_examples: 1470}
  - {name: val, num_bytes: 284029, num_examples: 145}
  download_size: 1791903
  dataset_size: 3303000
- config_name: yelp
  features:
  - {name: text, dtype: string}
  - {name: row_id, dtype: string}
  - {name: ground_truth, dtype: string}
  splits:
  - {name: train, num_bytes: 1172682, num_examples: 1470}
  - {name: val, num_bytes: 113205, num_examples: 145}
  download_size: 789126
  dataset_size: 1285887
configs:
- config_name: aes2_essay_scoring
  data_files:
  - {split: train, path: aes2_essay_scoring/train-*}
  - {split: val, path: aes2_essay_scoring/val-*}
- config_name: arc_challenge
  data_files:
  - {split: ood, path: arc_challenge/ood-*}
- config_name: argument_quality_ranking
  data_files:
  - {split: ood, path: argument_quality_ranking/ood-*}
- config_name: bbh_snarks
  data_files:
  - {split: ood, path: bbh_snarks/ood-*}
- config_name: civil_comments
  data_files:
  - {split: train, path: civil_comments/train-*}
  - {split: val, path: civil_comments/val-*}
- config_name: code_judge_bench
  data_files:
  - {split: ood, path: code_judge_bench/ood-*}
- config_name: colbert_humor_detection
  data_files:
  - {split: train, path: colbert_humor_detection/train-*}
  - {split: val, path: colbert_humor_detection/val-*}
- config_name: gpqa_diamond
  data_files:
  - {split: ood, path: gpqa_diamond/ood-*}
- config_name: halueval_summarization
  data_files:
  - {split: ood, path: halueval_summarization/ood-*}
- config_name: hh_rlhf
  data_files:
  - {split: train, path: hh_rlhf/train-*}
  - {split: val, path: hh_rlhf/val-*}
- config_name: judge_bench
  data_files:
  - {split: ood, path: judge_bench/ood-*}
- config_name: lex_glue_case_hold
  data_files:
  - {split: train, path: lex_glue_case_hold/train-*}
  - {split: val, path: lex_glue_case_hold/val-*}
- config_name: lex_glue_scotus
  data_files:
  - {split: train, path: lex_glue_scotus/train-*}
  - {split: val, path: lex_glue_scotus/val-*}
- config_name: medical_abstracts
  data_files:
  - {split: train, path: medical_abstracts/train-*}
  - {split: val, path: medical_abstracts/val-*}
- config_name: mfrc
  data_files:
  - {split: train, path: mfrc/train-*}
  - {split: val, path: mfrc/val-*}
- config_name: mmlu
  data_files:
  - {split: ood, path: mmlu/ood-*}
- config_name: mmlu_pro
  data_files:
  - {split: ood, path: mmlu_pro/ood-*}
- config_name: musr_murder_mysteries
  data_files:
  - {split: ood, path: musr_murder_mysteries/ood-*}
- config_name: musr_object_placements
  data_files:
  - {split: ood, path: musr_object_placements/ood-*}
- config_name: musr_team_allocation
  data_files:
  - {split: ood, path: musr_team_allocation/ood-*}
- config_name: or_bench_80k
  data_files:
  - {split: train, path: or_bench_80k/train-*}
  - {split: val, path: or_bench_80k/val-*}
- config_name: or_bench_hard_1k
  data_files:
  - {split: train, path: or_bench_hard_1k/train-*}
  - {split: val, path: or_bench_hard_1k/val-*}
- config_name: or_bench_toxic
  data_files:
  - {split: ood, path: or_bench_toxic/ood-*}
- config_name: projudgebench
  data_files:
  - {split: train, path: projudgebench/train-*}
  - {split: val, path: projudgebench/val-*}
- config_name: reward_bench_2
  data_files:
  - {split: train, path: reward_bench_2/train-*}
  - {split: val, path: reward_bench_2/val-*}
- config_name: rod101_essay_scoring
  data_files:
  - {split: ood, path: rod101_essay_scoring/ood-*}
- config_name: seekbench
  data_files:
  - {split: train, path: seekbench/train-*}
  - {split: val, path: seekbench/val-*}
- config_name: sem_eval_2010_task_8
  data_files:
  - {split: train, path: sem_eval_2010_task_8/train-*}
  - {split: val, path: sem_eval_2010_task_8/val-*}
- config_name: smollm_corpus
  data_files:
  - {split: train, path: smollm_corpus/train-*}
  - {split: val, path: smollm_corpus/val-*}
- config_name: snli
  data_files:
  - {split: train, path: snli/train-*}
  - {split: val, path: snli/val-*}
- config_name: toxigen_data
  data_files:
  - {split: train, path: toxigen_data/train-*}
  - {split: val, path: toxigen_data/val-*}
- config_name: tweet_eval_emotion
  data_files:
  - {split: train, path: tweet_eval_emotion/train-*}
  - {split: val, path: tweet_eval_emotion/val-*}
- config_name: tweet_eval_hate
  data_files:
  - {split: train, path: tweet_eval_hate/train-*}
  - {split: val, path: tweet_eval_hate/val-*}
- config_name: tweet_eval_irony
  data_files:
  - {split: train, path: tweet_eval_irony/train-*}
  - {split: val, path: tweet_eval_irony/val-*}
- config_name: tweet_eval_offensive
  data_files:
  - {split: train, path: tweet_eval_offensive/train-*}
  - {split: val, path: tweet_eval_offensive/val-*}
- config_name: tweet_eval_sentiment
  data_files:
  - {split: train, path: tweet_eval_sentiment/train-*}
  - {split: val, path: tweet_eval_sentiment/val-*}
- config_name: tweet_eval_stance_abortion
  data_files:
  - {split: train, path: tweet_eval_stance_abortion/train-*}
  - {split: val, path: tweet_eval_stance_abortion/val-*}
- config_name: tweet_eval_stance_atheism
  data_files:
  - {split: train, path: tweet_eval_stance_atheism/train-*}
  - {split: val, path: tweet_eval_stance_atheism/val-*}
- config_name: tweet_eval_stance_climate
  data_files:
  - {split: train, path: tweet_eval_stance_climate/train-*}
  - {split: val, path: tweet_eval_stance_climate/val-*}
- config_name: tweet_eval_stance_feminist
  data_files:
  - {split: train, path: tweet_eval_stance_feminist/train-*}
  - {split: val, path: tweet_eval_stance_feminist/val-*}
- config_name: tweet_eval_stance_hillary
  data_files:
  - {split: train, path: tweet_eval_stance_hillary/train-*}
  - {split: val, path: tweet_eval_stance_hillary/val-*}
- config_name: ultrafeedback
  data_files:
  - {split: train, path: ultrafeedback/train-*}
  - {split: val, path: ultrafeedback/val-*}
- config_name: yelp
  data_files:
  - {split: train, path: yelp/train-*}
  - {split: val, path: yelp/val-*}
---
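The card metadata above declares each config's splits (`train`/`val` or `ood`) and a data-file path of the form `<config_name>/<split>-*`. A minimal sketch of how one config might be loaded with the Hugging Face `datasets` library follows. The names `REPO_ID`, `SPLITS`, `data_file_pattern`, and `load_config` are illustrative and not part of the dataset; only three configs are copied into `SPLITS`, and the download step assumes network access and an installed `datasets` package.

```python
"""Sketch: addressing one config/split of tytodd/50k-v2 (helper names are hypothetical)."""

REPO_ID = "tytodd/50k-v2"

# Split names per config, copied from the card metadata for three example configs.
SPLITS = {
    "snli": ("train", "val"),
    "mmlu": ("ood",),
    "projudgebench": ("train", "val"),
}


def data_file_pattern(config: str, split: str) -> str:
    """Return the path glob the card declares for a config/split pair."""
    if config not in SPLITS:
        raise ValueError(f"config {config!r} is not listed in this sketch")
    if split not in SPLITS[config]:
        raise ValueError(
            f"{config!r} has no split {split!r}; available: {SPLITS[config]}"
        )
    return f"{config}/{split}-*"


def load_config(config: str, split: str):
    """Download one split of one config (requires network and `pip install datasets`)."""
    # Imported lazily so the offline helper above works without the library.
    from datasets import load_dataset

    data_file_pattern(config, split)  # validate the names before downloading
    return load_dataset(REPO_ID, config, split=split)


if __name__ == "__main__":
    print(data_file_pattern("snli", "train"))
    print(data_file_pattern("mmlu", "ood"))
```

Calling `load_config("snli", "train")` would then return the rows of that split; per the card, each row carries a `row_id` and a `ground_truth` field alongside the task-specific columns.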
Provider:
tytodd