
ali-vosoughi/oscar-dataset

Source: Hugging Face. Updated: 2026-04-06. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/ali-vosoughi/oscar-dataset
Resource description:
---
pretty_name: OSCaR
language:
- en
license: other
task_categories:
- image-to-text
task_ids:
- image-captioning
size_categories:
- 10K<n<100K
---

# OSCaR

OSCaR is the public dataset release for the NAACL 2024 paper _Object State Captioning and State Change Representation_. This release packages the preserved OSCaR image assets, fine-tuning manifests, benchmark split metadata, and state-caption sidecars used around the LLaVA-based training and evaluation workflow published in the [OSCaR GitHub repository](https://github.com/nguyennm1024/OSCaR).

## Release Summary

- Paper-reported scale: **14,084** annotated segments across EPIC-KITCHENS and Ego4D.
- Public raw asset tree in this release: **7,742** clip directories under `data/object-state-data`.
- Full preserved image-caption mapping: **30,308** rows across **7,577** clips.
- LLaVA fine-tuning manifest: **28,308** image-level conversations across **7,077** clips.
- Human-verified EPIC benchmark split: **2,000** rows / **500** clips / 4 caption slots.
- Sidecar annotations included: **7,586** state-change JSON files, **2,244** QA JSON files, **3,142** conversation JSON files.
- Open-world evaluation metadata included: **356** Ego4D records and **344** EPIC-KITCHENS records.

## What Is Included

- `data/object-state-data/`: preserved OSCaR frame directories and `state_change.jpg` composites.
- `manifests/llava_data.json`: OSCaR fine-tuning manifest used for adapter training.
- `splits/data_mapping_final_EK_test.csv`: held-out human-verified EPIC benchmark split.
- `metadata/data_mapping_final.csv`: full preserved image-to-caption mapping.
- `metadata/video-object.csv`: narration-to-object/action table.
- `metadata/ego4d_data.csv`: preserved Ego4D action/object metadata.
- `annotations/state-change-json/`: state caption JSON sidecars.
- `annotations/question-answers-clean/`: optional QA sidecars.
- `annotations/conversation-clean/`: optional conversation sidecars.
- `eval/openworld.json` and `eval/openworld-epic.json`: open-world evaluation prompts/metadata.

## Directory Layout

```text
oscar-dataset/
  data/object-state-data/
  manifests/llava_data.json
  splits/data_mapping_final_EK_test.csv
  metadata/data_mapping_final.csv
  metadata/segment_index.csv
  metadata/release_summary.json
  annotations/state-change-json/
  annotations/question-answers-clean/
  annotations/conversation-clean/
  eval/openworld.json
  eval/openworld-epic.json
```

## Important Notes

- The paper reports 14,084 annotated segments, but the preserved public asset tree in this release contains 7,742 clip directories. The released metadata keeps both the paper-scale claim and the preserved local archive counts explicit.
- `metadata/segment_index.csv` is the normalized release table generated from the preserved asset tree, the full mapping CSV, the fine-tuning manifest, and the benchmark split.
- Some open-world evaluation JSON records still reference original local EPIC or Ego4D frame roots. Those records are included for provenance and regeneration, not as a promise that every referenced raw frame path is redistributed here.

## Usage With OSCaR Code

The public code release expects a workspace like:

```text
workspace/
  OSCaR/
  oscar-dataset/
```

Then run, for example:

```bash
DATASET_ROOT=../oscar-dataset \
bash scripts/train/finetune_v1_5_13b_oscar_lora.sh
```

## Provenance

- Source corpora: EPIC-KITCHENS and Ego4D, as described in the paper.
- Public code: `nguyennm1024/OSCaR`
- Public model namespace: `ali-vosoughi`
- Dataset repo: `ali-vosoughi/oscar-dataset`

## Citation

```bibtex
@inproceedings{nguyen2024oscar,
  title={OSCaR: Object State Captioning and State Change Representation},
  author={Nguyen, Nguyen and Bi, Jing and Vosoughi, Ali and Tian, Yapeng and Fazli, Pooyan and Xu, Chenliang},
  booktitle={North American Chapter of the Association for Computational Linguistics (NAACL)},
  year={2024}
}
```
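
Because the Release Summary quotes several counts that differ from the paper-reported scale, a quick local check of a downloaded copy can be useful. The sketch below is not part of the OSCaR code release; it is a minimal illustration that counts entries at the paths named in this card and compares them with the quoted figures. The helper names `count_entries` and `check_release` are hypothetical, and three things are assumptions rather than documented facts: that each CSV has a single header row, that `manifests/llava_data.json` is a top-level JSON list (as is common for LLaVA-style manifests), and that the quoted figures correspond one-to-one to rows, list items, and `*.json` files.

```python
from __future__ import annotations

import csv
import json
from pathlib import Path

# Counts quoted from the Release Summary; paths come from "What Is Included".
EXPECTED = {
    "metadata/data_mapping_final.csv": 30308,
    "splits/data_mapping_final_EK_test.csv": 2000,
    "manifests/llava_data.json": 28308,
    "annotations/state-change-json": 7586,
    "annotations/question-answers-clean": 2244,
    "annotations/conversation-clean": 3142,
}


def count_entries(path: Path) -> int:
    """Count *.json files (directory), data rows (CSV), or top-level items (JSON)."""
    if path.is_dir():
        # Assumption: sidecars may sit in nested folders, so search recursively.
        return sum(1 for _ in path.rglob("*.json"))
    if path.suffix == ".csv":
        # Assumption: the first non-empty row is a header and is not counted.
        with path.open(newline="", encoding="utf-8") as handle:
            rows = sum(1 for row in csv.reader(handle) if row)
        return max(rows - 1, 0)
    if path.suffix == ".json":
        # Assumption: the file holds a top-level JSON list.
        with path.open(encoding="utf-8") as handle:
            return len(json.load(handle))
    raise ValueError(f"unsupported path type: {path}")


def check_release(
    root: Path, expected: dict[str, int] = EXPECTED
) -> dict[str, tuple[int | None, int]]:
    """Return {relative_path: (found, expected)}; found is None if the path is missing."""
    report = {}
    for rel, want in expected.items():
        target = root / rel
        report[rel] = (count_entries(target) if target.exists() else None, want)
    return report


if __name__ == "__main__":
    # "../oscar-dataset" mirrors the DATASET_ROOT value used in the card.
    for rel, (found, want) in check_release(Path("../oscar-dataset")).items():
        status = "ok" if found == want else "MISMATCH"
        print(f"{status:8} {rel}: found={found} expected={want}")
```

A mismatch does not necessarily mean the download is broken: it may only mean one of the assumptions above does not hold for that file, so treat the output as a prompt to inspect the file rather than as a verdict.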
Provided by:
ali-vosoughi