Rethinking Multi-Modal Alignment in Multi-Choice VideoQA from Feature and Sample Perspectives
Source: DataCite Commons. Updated: 2024-12-16. Indexed: 2025-04-16.
Download link:
https://service.tib.eu/ldmservice/dataset/fef1046e-f787-483e-a6aa-5d874c129fa2
Description:
Reasoning about causal and temporal event relations in videos is a new goal of Video Question Answering (VideoQA). The major stumbling block to achieving this goal is the semantic gap between language and video, since the two are at different levels of abstraction.
Provider:
TIB
Created:
2024-12-16



