
iapp/MMMU-Thai

Source: Hugging Face · Updated 2024-10-08 · Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/iapp/MMMU-Thai
Description:
--- language: - th license: - apache-2.0 - other size_categories: - 10K<n<100K task_categories: - question-answering - visual-question-answering - multiple-choice pretty_name: mmmu thai dataset_info: - config_name: Accounting features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 262599.0 num_examples: 5 - name: validation num_bytes: 1598285.0 num_examples: 30 - name: test num_bytes: 22135625.0 num_examples: 380 download_size: 37363379 dataset_size: 23996509.0 - config_name: Agriculture features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 22082656.0 num_examples: 5 - name: validation num_bytes: 119217558.0 num_examples: 30 - name: test num_bytes: 993664077.0 num_examples: 287 download_size: 1158036990 dataset_size: 1134964291.0 - config_name: Architecture_and_Engineering features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 
dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 137750.0 num_examples: 5 - name: validation num_bytes: 721378.0 num_examples: 30 - name: test num_bytes: 16054607.0 num_examples: 551 download_size: 48763955 dataset_size: 16913735.0 - config_name: Art features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 6241184.0 num_examples: 5 - name: validation num_bytes: 29934534.0 num_examples: 30 - name: test num_bytes: 237801390.0 num_examples: 231 download_size: 585798641 dataset_size: 273977108.0 - config_name: Art_Theory features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 7435106.0 num_examples: 5 - name: validation num_bytes: 33481558.0 num_examples: 30 - name: test num_bytes: 553174647.0 num_examples: 429 download_size: 930525695 dataset_size: 594091311.0 - config_name: Basic_Medical_Science features: - name: id dtype: 
string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 814310.0 num_examples: 5 - name: validation num_bytes: 4125930.0 num_examples: 30 - name: test num_bytes: 48125891.0 num_examples: 326 download_size: 84666454 dataset_size: 53066131.0 - config_name: Biology features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 574342.0 num_examples: 5 - name: validation num_bytes: 8491863.0 num_examples: 30 - name: test num_bytes: 132966151.0 num_examples: 345 download_size: 410242502 dataset_size: 142032356.0 - config_name: Chemistry features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 262397.0 
num_examples: 5 - name: validation num_bytes: 1518573.0 num_examples: 30 - name: test num_bytes: 37219529.0 num_examples: 603 download_size: 108345562 dataset_size: 39000499.0 - config_name: Clinical_Medicine features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 1467945.0 num_examples: 5 - name: validation num_bytes: 10882484.0 num_examples: 30 - name: test num_bytes: 98201863.0 num_examples: 325 download_size: 160611488 dataset_size: 110552292.0 - config_name: Computer_Science features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 440523.0 num_examples: 5 - name: validation num_bytes: 2072018.0 num_examples: 30 - name: test num_bytes: 32047381.0 num_examples: 371 download_size: 55640991 dataset_size: 34559922.0 - config_name: Design features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image 
- name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 2259873.0 num_examples: 5 - name: validation num_bytes: 17923120.0 num_examples: 30 - name: test num_bytes: 77676331.0 num_examples: 169 download_size: 142866617 dataset_size: 97859324.0 - config_name: Diagnostics_and_Laboratory_Medicine features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 2056117.0 num_examples: 5 - name: validation num_bytes: 37106233.0 num_examples: 30 - name: test num_bytes: 157003069.0 num_examples: 162 download_size: 603957093 dataset_size: 196165419.0 - config_name: Economics features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 171434.0 num_examples: 5 - name: validation num_bytes: 1487048.0 num_examples: 30 - name: test num_bytes: 11852300.0 num_examples: 267 download_size: 20777635 dataset_size: 13510782.0 - config_name: Electronics features: - name: id dtype: string - name: 
question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 123632.0 num_examples: 5 - name: validation num_bytes: 641377.0 num_examples: 30 - name: test num_bytes: 5717686.0 num_examples: 256 download_size: 11602832 dataset_size: 6482695.0 - config_name: Energy_and_Power features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 105006.0 num_examples: 5 - name: validation num_bytes: 1641935.0 num_examples: 30 - name: test num_bytes: 14748428.0 num_examples: 432 download_size: 35246567 dataset_size: 16495369.0 - config_name: Finance features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 296124.0 num_examples: 5 - name: 
validation num_bytes: 1071060.0 num_examples: 30 - name: test num_bytes: 12065803.0 num_examples: 355 download_size: 29551521 dataset_size: 13432987.0 - config_name: Geography features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 1494060.0 num_examples: 5 - name: validation num_bytes: 6671316.0 num_examples: 30 - name: test num_bytes: 137218400.0 num_examples: 565 download_size: 374766631 dataset_size: 145383776.0 - config_name: History features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 1444231.0 num_examples: 5 - name: validation num_bytes: 8819857.0 num_examples: 30 - name: test num_bytes: 115228815.0 num_examples: 278 download_size: 232549641 dataset_size: 125492903.0 - config_name: Literature features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - 
name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 2451201.0 num_examples: 5 - name: validation num_bytes: 14241046.0 num_examples: 30 - name: test num_bytes: 50301541.0 num_examples: 112 download_size: 132145895 dataset_size: 66993788.0 - config_name: Manage features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 449514.0 num_examples: 5 - name: validation num_bytes: 3277436.0 num_examples: 30 - name: test num_bytes: 29963963.0 num_examples: 245 download_size: 51186888 dataset_size: 33690913.0 - config_name: Marketing features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 116960.0 num_examples: 5 - name: validation num_bytes: 1472981.0 num_examples: 30 - name: test num_bytes: 7732976.0 num_examples: 181 download_size: 13146078 dataset_size: 9322917.0 - config_name: Materials features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation 
dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 239632.0 num_examples: 5 - name: validation num_bytes: 2305223.0 num_examples: 30 - name: test num_bytes: 25256854.0 num_examples: 458 download_size: 105773156 dataset_size: 27801709.0 - config_name: Math features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 175839.0 num_examples: 5 - name: validation num_bytes: 1444496.0 num_examples: 30 - name: test num_bytes: 27701845.0 num_examples: 505 download_size: 174098418 dataset_size: 29322180.0 - config_name: Mechanical_Engineering features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 152542.0 num_examples: 5 - name: validation num_bytes: 874988.0 num_examples: 30 - name: test 
num_bytes: 15093746.0 num_examples: 429 download_size: 30450114 dataset_size: 16121276.0 - config_name: Music features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 1417615.0 num_examples: 5 - name: validation num_bytes: 9359372.0 num_examples: 30 - name: test num_bytes: 134096770.0 num_examples: 334 download_size: 174725052 dataset_size: 144873757.0 - config_name: Pharmacy features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 207924.0 num_examples: 5 - name: validation num_bytes: 1656342.0 num_examples: 30 - name: test num_bytes: 31866248.0 num_examples: 430 download_size: 62721263 dataset_size: 33730514.0 - config_name: Physics features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: 
topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 233734.0 num_examples: 5 - name: validation num_bytes: 1114130.0 num_examples: 30 - name: test num_bytes: 15905705.0 num_examples: 408 download_size: 35238571 dataset_size: 17253569.0 - config_name: Psychology features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 600864.0 num_examples: 5 - name: validation num_bytes: 4403886.0 num_examples: 30 - name: test num_bytes: 53813915.0 num_examples: 305 download_size: 102466671 dataset_size: 58818665.0 - config_name: Public_Health features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 234781.0 num_examples: 5 - name: validation num_bytes: 1508761.0 num_examples: 30 - name: test num_bytes: 32150088.0 num_examples: 509 download_size: 48231609 dataset_size: 33893630.0 - config_name: Sociology features: - name: id dtype: string - name: question dtype: string - name: options dtype: string - name: explanation dtype: string - name: image_1 dtype: image - name: image_2 
dtype: image - name: image_3 dtype: image - name: image_4 dtype: image - name: image_5 dtype: image - name: image_6 dtype: image - name: image_7 dtype: image - name: img_type dtype: string - name: answer dtype: string - name: topic_difficulty dtype: string - name: question_type dtype: string - name: subfield dtype: string splits: - name: dev num_bytes: 3769220.0 num_examples: 5 - name: validation num_bytes: 18455336.0 num_examples: 30 - name: test num_bytes: 144301123.0 num_examples: 252 download_size: 310313826 dataset_size: 166525679.0 configs: - config_name: Accounting data_files: - split: dev path: Accounting/dev-* - split: validation path: Accounting/validation-* - split: test path: Accounting/test-* - config_name: Agriculture data_files: - split: dev path: Agriculture/dev-* - split: validation path: Agriculture/validation-* - split: test path: Agriculture/test-* - config_name: Architecture_and_Engineering data_files: - split: dev path: Architecture_and_Engineering/dev-* - split: validation path: Architecture_and_Engineering/validation-* - split: test path: Architecture_and_Engineering/test-* - config_name: Art data_files: - split: dev path: Art/dev-* - split: validation path: Art/validation-* - split: test path: Art/test-* - config_name: Art_Theory data_files: - split: dev path: Art_Theory/dev-* - split: validation path: Art_Theory/validation-* - split: test path: Art_Theory/test-* - config_name: Basic_Medical_Science data_files: - split: dev path: Basic_Medical_Science/dev-* - split: validation path: Basic_Medical_Science/validation-* - split: test path: Basic_Medical_Science/test-* - config_name: Biology data_files: - split: dev path: Biology/dev-* - split: validation path: Biology/validation-* - split: test path: Biology/test-* - config_name: Chemistry data_files: - split: dev path: Chemistry/dev-* - split: validation path: Chemistry/validation-* - split: test path: Chemistry/test-* - config_name: Clinical_Medicine data_files: - split: dev path: 
Clinical_Medicine/dev-* - split: validation path: Clinical_Medicine/validation-* - split: test path: Clinical_Medicine/test-* - config_name: Computer_Science data_files: - split: dev path: Computer_Science/dev-* - split: validation path: Computer_Science/validation-* - split: test path: Computer_Science/test-* - config_name: Design data_files: - split: dev path: Design/dev-* - split: validation path: Design/validation-* - split: test path: Design/test-* - config_name: Diagnostics_and_Laboratory_Medicine data_files: - split: dev path: Diagnostics_and_Laboratory_Medicine/dev-* - split: validation path: Diagnostics_and_Laboratory_Medicine/validation-* - split: test path: Diagnostics_and_Laboratory_Medicine/test-* - config_name: Economics data_files: - split: dev path: Economics/dev-* - split: validation path: Economics/validation-* - split: test path: Economics/test-* - config_name: Electronics data_files: - split: dev path: Electronics/dev-* - split: validation path: Electronics/validation-* - split: test path: Electronics/test-* - config_name: Energy_and_Power data_files: - split: dev path: Energy_and_Power/dev-* - split: validation path: Energy_and_Power/validation-* - split: test path: Energy_and_Power/test-* - config_name: Finance data_files: - split: dev path: Finance/dev-* - split: validation path: Finance/validation-* - split: test path: Finance/test-* - config_name: Geography data_files: - split: dev path: Geography/dev-* - split: validation path: Geography/validation-* - split: test path: Geography/test-* - config_name: History data_files: - split: dev path: History/dev-* - split: validation path: History/validation-* - split: test path: History/test-* - config_name: Literature data_files: - split: dev path: Literature/dev-* - split: validation path: Literature/validation-* - split: test path: Literature/test-* - config_name: Manage data_files: - split: dev path: Manage/dev-* - split: validation path: Manage/validation-* - split: test path: Manage/test-* - 
config_name: Marketing data_files: - split: dev path: Marketing/dev-* - split: validation path: Marketing/validation-* - split: test path: Marketing/test-* - config_name: Materials data_files: - split: dev path: Materials/dev-* - split: validation path: Materials/validation-* - split: test path: Materials/test-* - config_name: Math data_files: - split: dev path: Math/dev-* - split: validation path: Math/validation-* - split: test path: Math/test-* - config_name: Mechanical_Engineering data_files: - split: dev path: Mechanical_Engineering/dev-* - split: validation path: Mechanical_Engineering/validation-* - split: test path: Mechanical_Engineering/test-* - config_name: Music data_files: - split: dev path: Music/dev-* - split: validation path: Music/validation-* - split: test path: Music/test-* - config_name: Pharmacy data_files: - split: dev path: Pharmacy/dev-* - split: validation path: Pharmacy/validation-* - split: test path: Pharmacy/test-* - config_name: Physics data_files: - split: dev path: Physics/dev-* - split: validation path: Physics/validation-* - split: test path: Physics/test-* - config_name: Psychology data_files: - split: dev path: Psychology/dev-* - split: validation path: Psychology/validation-* - split: test path: Psychology/test-* - config_name: Public_Health data_files: - split: dev path: Public_Health/dev-* - split: validation path: Public_Health/validation-* - split: test path: Public_Health/test-* - config_name: Sociology data_files: - split: dev path: Sociology/dev-* - split: validation path: Sociology/validation-* - split: test path: Sociology/test-* tags: - biology - medical - finance - chemistry - music - art - art_theory - design - music - business - accounting - economics - finance - manage - marketing - health - medicine - basic_medical_science - clinical - pharmacy - public_health - humanities - social_science - history - literature - sociology - psychology - science - biology - chemistry - geography - math - physics - engineering - 
agriculture - architecture - computer_science - electronics - energy_and_power - materials - mechanical_engineering ---

# MMMU Thai (MMMU Benchmark Translated to Thai)

MMMU Thai is a dataset for evaluating multimodal models on massive multi-discipline tasks requiring college-level knowledge and deliberate reasoning. It is a Thai translation of MMMU (A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI).

## Dataset Details

MMMU Thai consists of roughly 11,500 meticulously collected multimodal questions (11,550 across the three splits listed under Dataset Structure) from college exams, quizzes, and textbooks, covering six core disciplines:

1. Art & Design
2. Business
3. Science
4. Health & Medicine
5. Humanities & Social Science
6. Tech & Engineering

These questions span 30 subjects and 183 subfields and include 30 highly heterogeneous image types, such as charts, diagrams, maps, tables, music sheets, and chemical structures.

## Dataset Structure

MMMU Thai includes:

- Development set: 150 samples
- Validation set: 900 samples
- Test set: 10,500 questions (without answers)

The development set is used for few-shot/in-context learning, and the validation set is used for debugging models, selecting hyperparameters, or quick evaluations. The answers and explanations for the test set questions are withheld.

## How We Built This Dataset

The `question`, `options`, and `explanation` columns were automatically translated to Thai using [openthaigpt1.5-72b](https://huggingface.co/openthaigpt/openthaigpt1.5-72b-instruct), with human review to ensure consistency between options and answers. The result was then improved using Qwen.

## LICENSE

This dataset is dual-licensed under Apache License 2.0 and the Qwen LICENSE Agreement. The original MMMU dataset is licensed under Apache 2.0, while the improvements made using Qwen-derived models are subject to the Qwen LICENSE Agreement.

## Maintainer

Kobkrit Viriyayudhakorn (kobkrit@iapp.co.th)

## References

Original dataset: [MMMU Dataset](https://huggingface.co/datasets/MMMU/MMMU)
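
## Usage Sketch

The card lists one config per subject (for example `Accounting`), each with `dev`, `validation`, and `test` splits, and an `options` column stored as a string. The following is a minimal sketch of loading one subject and turning a row into a multiple-choice prompt; it is not an official loader. It assumes the Hugging Face `datasets` library is installed and that `options` holds a stringified Python list, as in the upstream MMMU dataset. The helper names `parse_options`, `format_prompt`, and `load_subject` are hypothetical and introduced only for illustration.

```python
import ast
from typing import List

# Split sizes taken from the dataset card metadata: 30 subject configs,
# each with 5 dev and 30 validation examples; the test split totals 10,500.
NUM_SUBJECTS = 30
DEV_TOTAL = NUM_SUBJECTS * 5          # 150
VALIDATION_TOTAL = NUM_SUBJECTS * 30  # 900
TEST_TOTAL = 10_500


def parse_options(raw: str) -> List[str]:
    """Parse the `options` column into a list of strings.

    ASSUMPTION: the column is a stringified Python list such as
    "['a', 'b']", mirroring upstream MMMU. Adjust if the format differs.
    """
    return [str(option) for option in ast.literal_eval(raw)]


def format_prompt(question: str, options: List[str]) -> str:
    """Render a question and its options as a lettered multiple-choice prompt."""
    letters = "ABCDEFGHIJ"
    lines = [question]
    lines += [f"({letters[i]}) {option}" for i, option in enumerate(options)]
    return "\n".join(lines)


def load_subject(subject: str = "Accounting", split: str = "validation"):
    """Load one subject config; needs `pip install datasets` and network access."""
    from datasets import load_dataset  # imported lazily so the helpers work offline

    return load_dataset("iapp/MMMU-Thai", subject, split=split)


if __name__ == "__main__":
    rows = load_subject("Accounting", "validation")
    first = rows[0]
    print(format_prompt(first["question"], parse_options(first["options"])))
```

Importing `datasets` inside `load_subject` keeps the parsing and formatting helpers usable without the library or a network connection. Rows may also carry up to seven images (`image_1` through `image_7`), which a real evaluation harness would need to pass to the model alongside the prompt.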
Provider:
iapp