
mlvbench-review/MLV-Bench

Source: Hugging Face. Updated 2026-04-27. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/mlvbench-review/MLV-Bench
Resource description:
---
license: other
pretty_name: MLV-Bench
language:
- en
tags:
- medical-video-understanding
- long-context-video
- multimodal-large-language-models
- benchmark
- visual-question-answering
- croissant
size_categories:
- 100<n<1K
---

# MLV-Bench

MLV-Bench is a benchmark for long-context medical video understanding in the wild. It contains 340 public full-procedure medical videos from 8 public sources, totaling 759 decoded hours, and 1,253 verified multiple-choice questions.

## Files

- `mlvbench.jsonl`: official benchmark metadata and QA records. Each line is one video record with nested QA items.
- `sample/`: reviewer-facing representative sample with four videos and a same-schema JSONL file.
- `mlvbench_croissant.json`: Croissant metadata with Responsible AI fields for NeurIPS 2026 review.

## Schema

Each JSONL line contains `key`, `dataset`, `organ`, `scene_type`, `duration_tier`, `video_path`, `num_frames`, `fps`, `duration_seconds`, and `qa`.

Each QA item contains `uid`, `question`, `options`, `answer`, `task_id`, `task_name`, `task_class`, `category`, `question_type`, and optional hop metadata.

## Intended use

This dataset is intended for research evaluation of multimodal models on long-context medical video understanding, sparse evidence retrieval, and multi-hop reasoning. It is not intended for clinical diagnosis, patient management, or deployment.

## Representative sample

Because the complete dataset is larger than 4 GB, the `sample/` folder provides a reviewer-accessible subset. The sample is stratified by clinical scene type and includes surgery, gastrointestinal endoscopy, colonoscopy, and ultrasound examples. It is for data-quality inspection only and is not the official evaluation split.

## Licensing and source terms

MLV-Bench is derived from multiple public medical video datasets. Source-specific licenses and usage terms apply to the corresponding source data. Users must comply with all original dataset licenses and privacy terms.

## Responsible AI notes

The benchmark uses public medical procedure videos and does not intentionally include direct patient identifiers in the benchmark JSONL. However, clinical videos are human-subject medical data and may contain residual source metadata or overlays. Users must not attempt re-identification and must use the dataset only for approved research evaluation.
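
The record layout listed under Schema (one video record per JSONL line, with nested QA items) can be read with a few lines of Python. The following is a minimal sketch, not an official loader: it assumes that `qa` is a list of objects, the helper name `iter_qa` is hypothetical, and every value in the example record is made up purely for illustration, since the README names the fields but does not document their types or values.

```python
import json
from typing import Iterable, Iterator

# Field names taken from the Schema section; optional hop metadata is not checked.
VIDEO_FIELDS = ("key", "dataset", "organ", "scene_type", "duration_tier",
                "video_path", "num_frames", "fps", "duration_seconds", "qa")
QA_FIELDS = ("uid", "question", "options", "answer", "task_id", "task_name",
             "task_class", "category", "question_type")


def iter_qa(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one flat dict per QA item, tagged with its parent video's key and scene type.

    Assumption: `qa` is a list of dicts (the README only says "nested QA items").
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [name for name in VIDEO_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing video fields {missing}")
        for item in record["qa"]:
            missing = [name for name in QA_FIELDS if name not in item]
            if missing:
                raise ValueError(f"line {line_no}: missing QA fields {missing}")
            yield {"video_key": record["key"],
                   "scene_type": record["scene_type"],
                   **item}


# Hypothetical record: all values below are placeholders, not real benchmark data.
example_line = json.dumps({
    "key": "example-video-0001",
    "dataset": "example-source",
    "organ": "example-organ",
    "scene_type": "surgery",
    "duration_tier": "example-tier",
    "video_path": "videos/example-video-0001.mp4",
    "num_frames": 90000,
    "fps": 25,
    "duration_seconds": 3600,
    "qa": [{
        "uid": "example-qa-0001",
        "question": "Placeholder question?",
        "options": ["A. first", "B. second", "C. third", "D. fourth"],
        "answer": "A",
        "task_id": "T0",
        "task_name": "example-task",
        "task_class": "example-class",
        "category": "example-category",
        "question_type": "multiple-choice",
    }],
})

rows = list(iter_qa([example_line]))
print(len(rows), rows[0]["video_key"], rows[0]["uid"])
```

With the real file, the same generator can be fed an open file handle, for example `with open("mlvbench.jsonl", encoding="utf-8") as f: rows = list(iter_qa(f))`. Flattening to one row per question keeps the parent video key next to each item, which is convenient when grouping results by `scene_type` or `task_class`.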
Provider:
mlvbench-review