
happy8825/MMLongBench_var6_deterministic

Hugging Face | Updated 2025-12-14 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/happy8825/MMLongBench_var6_deterministic
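For readers who want to pull the data programmatically, here is a minimal sketch using the Hugging Face `datasets` library. It assumes that library is installed and that the repository is publicly readable. The helper name `load_train_split` is hypothetical; the `train` split name comes from the dataset card's `configs` entry.

```python
# Repository id as shown in the listing title.
REPO_ID = "happy8825/MMLongBench_var6_deterministic"


def load_train_split():
    """Download and return the `train` split (requires network access)."""
    # Imported lazily so this module can be loaded without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train")
```

Calling `load_train_split()` should return a `Dataset` whose columns match the `features` list in the card, for example `question`, `answer`, `score`, and `evidence_pages`.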
Description:
<!-- SIMPLEDOC_AUTO_SUMMARIES_START -->

### MMLongBench – 2025-12-14 20:07 UTC

```
Average accuracy: 48.79% (1072 samples with scores)
Subset metrics by evidence source:
Pure-text (Plain-text): samples=302, accuracy=48.01%
Figure: samples=299, accuracy=37.79%
Table: samples=217, accuracy=41.94%
Chart: samples=175, accuracy=43.43%
Generalized-text (Layout): samples=119, accuracy=33.61%
Subset metrics by evidence pages length:
no_pages: samples=226, accuracy=65.93%
single_page: samples=489, accuracy=53.78%
multiple_pages: samples=357, accuracy=31.09%
Done: Results saved to /hub_data2/seohyun/outputs/var6_deepeyes_multiimage_deterministic/simpledoc_eval/MMLongBench/eval_results.jsonl
Results source: /hub_data2/seohyun/outputs/var6_deepeyes_multiimage_deterministic/results.json
```

---

### MMLongBench – 2025-12-14 11:00 UTC

```
Average accuracy: 60.00% (10 samples with scores)
Subset metrics by evidence source:
Chart: samples=5, accuracy=60.00%
Pure-text (Plain-text): samples=2, accuracy=0.00%
Generalized-text (Layout): samples=1, accuracy=0.00%
Table: samples=1, accuracy=100.00%
Subset metrics by evidence pages length:
no_pages: samples=2, accuracy=100.00%
single_page: samples=4, accuracy=75.00%
multiple_pages: samples=4, accuracy=25.00%
Done: Results saved to /hub_data2/seohyun/outputs/var6_deepeyes_multiimage_deterministic/simpledoc_eval/MMLongBench/eval_results.jsonl
Results source: /hub_data2/seohyun/outputs/var6_deepeyes_multiimage_deterministic/results.json
```

<!-- SIMPLEDOC_AUTO_SUMMARIES_END -->

---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: relevant_pages
    list: int64
  - name: evidence_pages
    list: int64
  - name: score
    dtype: int64
  - name: doc_id
    dtype: string
  - name: doc_type
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: evidence_sources
    list: string
  - name: final_answer
    dtype: string
  - name: turn1_colqwen_query
    dtype: 'null'
  - name: turn1_colqwen_retrieval_results
    dtype: 'null'
  - name: turn1_llm_query_input
    dtype: 'null'
  - name: turn1_llm_retrieval_results
    struct:
    - name: document_summary
      dtype: string
    - name: relevant_pages
      list: int64
  - name: turn1_llm_raw_output
    dtype: string
  - name: turn1_memory_out
    dtype: string
  - name: turn2_memory_in
    dtype: string
  - name: turn2_vlm_prompt_input
    dtype: string
  - name: turn2_vlm_raw_output
    dtype: string
  - name: turn2_final_answer
    dtype: string
  - name: turn2_response_type
    dtype: string
  - name: turn2_updated_question
    dtype: 'null'
  - name: turn2_notes
    dtype: 'null'
  - name: turn2_vlm_turn1_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image10_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image11_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image12_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image13_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image14_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image15_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image16_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image17_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image18_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image19_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image20_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image21_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image22_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image23_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image24_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image25_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image26_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image27_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image28_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image29_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image3_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image4_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image5_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image6_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image7_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image8_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image9_ref
    dtype: string
  - name: turn2_vlm_turn1_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn1_prompt
    dtype: string
  - name: turn2_vlm_turn1_raw_output
    dtype: string
  - name: turn2_vlm_turn1_zoom_box
    list: float64
  - name: turn2_vlm_turn1_zoom_page_index
    dtype: int64
  - name: turn2_vlm_turn2_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image10_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image11_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image12_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image13_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image14_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image15_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image16_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image17_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image18_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image19_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image20_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image21_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image22_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image23_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image24_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image3_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image4_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image5_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image6_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image7_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image8_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image9_ref
    dtype: string
  - name: turn2_vlm_turn2_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn2_prompt
    dtype: string
  - name: turn2_vlm_turn2_raw_output
    dtype: string
  - name: turn2_vlm_turn2_zoom_box
    list: float64
  - name: turn2_vlm_turn2_zoom_page_index
    dtype: int64
  - name: turn2_vlm_turn3_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image10_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image11_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image12_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image13_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image3_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image4_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image5_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image6_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image7_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image8_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image9_ref
    dtype: string
  - name: turn2_vlm_turn3_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn3_prompt
    dtype: string
  - name: turn2_vlm_turn3_raw_output
    dtype: string
  - name: turn2_vlm_turn3_zoom_box
    list: float64
  - name: turn2_vlm_turn3_zoom_page_index
    dtype: int64
  - name: turn2_vlm_turn4_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn4_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn4_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn4_input_image3_ref
    dtype: string
  - name: turn2_vlm_turn4_input_image4_ref
    dtype: string
  - name: turn2_vlm_turn4_input_image5_ref
    dtype: string
  - name: turn2_vlm_turn4_input_image6_ref
    dtype: string
  - name: turn2_vlm_turn4_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn4_prompt
    dtype: string
  - name: turn2_vlm_turn4_raw_output
    dtype: string
  splits:
  - name: train
    num_bytes: 19372252
    num_examples: 1073
  download_size: 4302220
  dataset_size: 19372252
---

---
dataset_info:
  features:
  - name: relevant_pages
    list: int64
  - name: evidence_pages
    list: int64
  - name: score
    dtype: int64
  - name: doc_id
    dtype: string
  - name: doc_type
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: evidence_sources
    list: string
  - name: final_answer
    dtype: string
  - name: turn1_colqwen_query
    dtype: 'null'
  - name: turn1_colqwen_retrieval_results
    dtype: 'null'
  - name: turn1_llm_query_input
    dtype: 'null'
  - name: turn1_llm_retrieval_results
    struct:
    - name: document_summary
      dtype: string
    - name: relevant_pages
      list: int64
  - name: turn1_llm_raw_output
    dtype: string
  - name: turn1_memory_out
    dtype: string
  - name: turn2_memory_in
    dtype: string
  - name: turn2_vlm_prompt_input
    dtype: string
  - name: turn2_vlm_raw_output
    dtype: string
  - name: turn2_final_answer
    dtype: string
  - name: turn2_response_type
    dtype: string
  - name: turn2_updated_question
    dtype: 'null'
  - name: turn2_notes
    dtype: 'null'
  - name: turn2_vlm_turn1_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image3_ref
    dtype: string
  - name: turn2_vlm_turn1_input_image4_ref
    dtype: string
  - name: turn2_vlm_turn1_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn1_prompt
    dtype: string
  - name: turn2_vlm_turn1_raw_output
    dtype: string
  - name: turn2_vlm_turn1_zoom_box
    list: float64
  - name: turn2_vlm_turn1_zoom_page_index
    dtype: int64
  - name: turn2_vlm_turn2_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn2_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn2_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn2_prompt
    dtype: string
  - name: turn2_vlm_turn2_raw_output
    dtype: string
  - name: turn2_vlm_turn2_zoom_box
    list: float64
  - name: turn2_vlm_turn2_zoom_page_index
    dtype: int64
  - name: turn2_vlm_turn3_input_image0_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image1_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image2_ref
    dtype: string
  - name: turn2_vlm_turn3_input_image3_ref
    dtype: string
  - name: turn2_vlm_turn3_input_messages
    list:
    - name: content
      list:
      - name: ref
        dtype: string
      - name: text
        dtype: string
      - name: type
        dtype: string
    - name: role
      dtype: string
  - name: turn2_vlm_turn3_prompt
    dtype: string
  - name: turn2_vlm_turn3_raw_output
    dtype: string
  splits:
  - name: train
    num_bytes: 199629
    num_examples: 10
  download_size: 114463
  dataset_size: 199629
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
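The per-subset accuracies reported in the summaries can in principle be recomputed from the `score`, `evidence_sources`, and `evidence_pages` columns. The sketch below is not the SimpleDoc evaluation script. It rests on three assumptions: `score` is 0 or 1, rows without a score are skipped, and a row counts toward every evidence source it lists (which would explain why the per-source sample counts sum to more than the total). The function names and the toy rows are hypothetical.

```python
from collections import defaultdict


def page_bucket(evidence_pages):
    """Map a list of evidence page numbers to the bucket names used in the summaries."""
    n = len(evidence_pages)
    if n == 0:
        return "no_pages"
    return "single_page" if n == 1 else "multiple_pages"


def mean_pct(scores):
    """Mean of 0/1 scores, expressed as a percentage."""
    return 100.0 * sum(scores) / len(scores)


def subset_accuracy(rows):
    """Return overall accuracy plus (sample count, accuracy) per subset."""
    scored = [r for r in rows if r.get("score") is not None]  # assumption: skip unscored rows
    by_source = defaultdict(list)
    by_pages = defaultdict(list)
    for r in scored:
        for src in r["evidence_sources"]:  # assumption: one row may count for several sources
            by_source[src].append(r["score"])
        by_pages[page_bucket(r["evidence_pages"])].append(r["score"])
    return {
        "overall": mean_pct([r["score"] for r in scored]) if scored else None,
        "by_source": {k: (len(v), mean_pct(v)) for k, v in by_source.items()},
        "by_pages": {k: (len(v), mean_pct(v)) for k, v in by_pages.items()},
    }


# Toy rows for illustration only; these are not taken from the dataset.
rows = [
    {"score": 1, "evidence_sources": ["Chart"], "evidence_pages": [3]},
    {"score": 0, "evidence_sources": ["Chart", "Table"], "evidence_pages": [2, 5]},
    {"score": 1, "evidence_sources": [], "evidence_pages": []},
    {"score": None, "evidence_sources": ["Figure"], "evidence_pages": [1]},
]
result = subset_accuracy(rows)
print(result)
```

Grouping by `len(evidence_pages)` mirrors the `no_pages` / `single_page` / `multiple_pages` breakdown in the summaries. Whether the original evaluation buckets rows in exactly this way is not stated in the card.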
Provided by:
happy8825