amathislab/LEMONADE

Name: amathislab/LEMONADE
Creator: amathislab
Published: 2025-10-20 12:55:43
License: not specified on this page (the dataset card metadata lists cc-by-4.0)

Source: Hugging Face. Updated 2025-10-20. Indexed 2026-01-03.

Download link:

https://hf-mirror.com/datasets/amathislab/LEMONADE


Resource description:

---
language:
- en
license: cc-by-4.0
size_categories:
- 10K<n<100K
task_categories:
- question-answering
- video-text-to-text
tags:
- behavior
- motion
- human
- egocentric
- language
- llm
- vlm
- esk
pretty_name: Lemonade
---

# 🍋 EPFL-Smart-Kitchen: Lemonade benchmark

[Paper](https://huggingface.co/papers/2506.01608) | [GitHub](https://github.com/amathislab/EPFL-Smart-Kitchen)

![title](media/title.svg)

## 📚 Introduction

We introduce Lemonade: **L**anguage models **E**valuation of **MO**tion a**N**d **A**ction-**D**riven **E**nquiries. Lemonade consists of 36,521 closed-ended QA pairs linked to egocentric video clips, categorized into three groups and six subcategories. 18,857 QAs focus on behavior understanding, leveraging the rich ground-truth behavior annotations of the EPFL-Smart-Kitchen to question models about perceived actions (Perception) and to test reasoning over unseen behaviors (Reasoning). 8,210 QAs involve longer video clips, challenging models in summarization (Summarization) and session-level inference (Session properties). The remaining 9,463 QAs leverage the 3D pose estimation data to infer hand shapes and joint angles (Physical attributes) or trajectory velocities (Kinematics) from visual information.

## 💾 Content

The current repository contains all egocentric videos recorded in the EPFL-Smart-Kitchen-30 dataset and the question-answer pairs of the Lemonade benchmark. Please refer to the [main GitHub repository](https://github.com/amathislab/EPFL-Smart-Kitchen) to find the other benchmarks and links to download other modalities of the EPFL-Smart-Kitchen-30 dataset.

### 🗃️ Repository structure

```
Lemonade
├── MCQs
|   └── lemonade_benchmark.csv
├── videos
|   ├── YH2002_2023_12_04_10_15_23_hololens.mp4
|   └── ..
└── README.md
```

`lemonade_benchmark.csv` : Table with the following fields:

- **Question** : Question to be answered.
- **QID** : Question identifier, an integer from 0 to 30.
- **Answers** : A list of possible answers to the question. This can be a multiple-choice set or open-ended responses.
- **Correct Answer** : The answer that is deemed correct from the list of provided answers.
- **Clip** : A reference to the video clip related to the question.
- **Start** : The timestamp (in frames) in the clip where the question context begins.
- **End** : The timestamp (in frames) in the clip where the question context ends.
- **Category** : The broad topic under which the question falls (Behavior understanding, Long-term understanding or Motion and Biomechanics).
- **Subcategory** : A more refined classification within the category (Perception, Reasoning, Summarization, Session properties, Physical attributes, Kinematics).
- **Difficulty** : The complexity level of the question (e.g., Easy, Medium, Hard).

`videos` : Folder with all egocentric videos from the EPFL-Smart-Kitchen-30 benchmark. Video names are structured as `[Participant_ID]_[Session_name]_hololens.mp4`.

> We refer the reader to the associated publication for details about data processing and task descriptions.

## 📈 Evaluation results

![evaluation_results](media/evaluation_results.svg)

## 🌈 Usage

The benchmark can be evaluated with the following GitHub repository: [https://github.com/amathislab/lmms-eval-lemonade](https://github.com/amathislab/lmms-eval-lemonade)

## 🌟 Citations

Please cite our work!

```
@misc{bonnetto2025epflsmartkitchen,
      title={EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models},
      author={Andy Bonnetto and Haozhe Qi and Franklin Leong and Matea Tashkovska and Mahdi Rad and Solaiman Shokur and Friedhelm Hummel and Silvestro Micera and Marc Pollefeys and Alexander Mathis},
      year={2025},
      eprint={2506.01608},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.01608},
}
```

## ❤️ Acknowledgments

Our work was funded by EPFL, Swiss SNF grant (320030-227871), Microsoft Swiss Joint Research Center and a Boehringer Ingelheim Fonds PhD stipend (H.Q.). We are grateful to the Brain Mind Institute for providing funds for hardware and to the Neuro-X Institute for providing funds for services.
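
## 🧪 Example: reading the benchmark table

The fields of `lemonade_benchmark.csv` and the video naming scheme described under "Repository structure" can be explored with a few lines of standard-library Python. This is a minimal sketch rather than the official loader (evaluation itself goes through the lmms-eval-lemonade repository). It rests on several assumptions that the dataset card does not confirm: that the **Answers** cell is stored as a Python-style list literal, that **Start** and **End** are numeric, and that participant IDs contain no underscore. The helper names and the two sample rows are hypothetical and exist only to make the example self-contained.

```python
import ast
import csv
import io
import re
from collections import Counter

# `[Participant_ID]_[Session_name]_hololens.mp4`; assumes the participant ID
# has no underscore, as in `YH2002_2023_12_04_10_15_23_hololens.mp4`.
VIDEO_NAME = re.compile(r"^(?P<participant>[^_]+)_(?P<session>.+)_hololens\.mp4$")


def parse_video_name(filename):
    """Split a video file name into (participant_id, session_name)."""
    match = VIDEO_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected video name: {filename!r}")
    return match["participant"], match["session"]


def parse_answers(raw):
    """Parse an Answers cell, ASSUMED to hold a Python-style list literal."""
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return [raw]  # not a literal: keep the raw text as a single option
    return list(value) if isinstance(value, (list, tuple)) else [str(value)]


def load_questions(handle):
    """Read benchmark rows from an open CSV file into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        row["Answers"] = parse_answers(row["Answers"])
        # Start and End are frame indices; ASSUMED to be numeric strings.
        row["Start"] = int(float(row["Start"]))
        row["End"] = int(float(row["End"]))
        rows.append(row)
    return rows


# Hypothetical sample rows, invented for illustration only.
SAMPLE_CSV = (
    "Question,QID,Answers,Correct Answer,Clip,Start,End,"
    "Category,Subcategory,Difficulty\n"
    "What is the person doing?,0,\"['Cutting', 'Stirring']\",Cutting,"
    "YH2002_2023_12_04_10_15_23,120,300,"
    "Behavior understanding,Perception,Easy\n"
    "Which hand moves faster?,1,\"['Left', 'Right']\",Right,"
    "YH2002_2023_12_04_10_15_23,400,520,"
    "Motion and Biomechanics,Kinematics,Hard\n"
)

if __name__ == "__main__":
    questions = load_questions(io.StringIO(SAMPLE_CSV))
    # Count questions per top-level category.
    print(Counter(q["Category"] for q in questions))
    print(parse_video_name("YH2002_2023_12_04_10_15_23_hololens.mp4"))
```

To read the real table instead of the sample, open the file from the repository layout shown above, for example `open("MCQs/lemonade_benchmark.csv", newline="", encoding="utf-8")`, and pass the handle to `load_questions`. If the **Answers** column turns out to use a different encoding, only `parse_answers` needs to change.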

Provider:

amathislab
