alessiasaporita/MissRAG

Source: Hugging Face · Updated 2026-03-30 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/alessiasaporita/MissRAG
Resource description:

---
license: apache-2.0
tags:
- multimodal
- embeddings
- retrieval
- rag
- audio
- video
- text
- multimodal-llm
---

# MissRAG Embeddings & Modality Tokens

This repository provides **precomputed multimodal embeddings and modality tokens** used in the MissRAG framework:

> **MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models**

---

## 📌 Overview

We release:

- 🔹 **ImageBind-based embeddings** for multimodal retrieval
- 🔹 **Precomputed modality tokens** (audio/video) for efficient inference

These representations are designed to:

- enable **retrieval across modalities**
- support **missing modality scenarios**
- accelerate inference in multimodal LLMs

---

## 📁 Repository Structure

The repository is organized as follows:

```bash
MissRAG
├── IB_embeddings/        # ImageBind embeddings
├── modality_tokens/
│   ├── chatbridge/       # ChatBridge modality tokens
│   └── onellm/           # OneLLM modality tokens
```

### 1. Multimodal Embeddings

We provide precomputed multimodal embeddings obtained using [ImageBind](https://github.com/facebookresearch/ImageBind) which:

- align **audio**, **video**, and **text** modalities into a shared representation space
- enable **cross-modal similarity computation**
- support efficient **retrieval via inner product similarity**

This unified space allows querying with any available modality to retrieve semantically related samples from missing modalities.

### 2. Modality Tokens

We release precomputed modality-specific tokens for audio and video modalities.
These tokens are directly compatible with:

- [ChatBridge](https://github.com/CASIA-IVA-Lab/ChatBridge)
- [OneLLM](https://github.com/csuhan/onellm)

Precomputing modality tokens provides the following advantages:

- **Computational efficiency**: eliminates redundant forward passes over training data
- **Faster inference**: enables real-time retrieval-augmented generation
- **Scalability**: supports large-scale retrieval without recomputing representations

## ⚙️ Usage

### 🔹 Retrieval

We perform retrieval using **FAISS** for efficient nearest neighbor search in the embedding space.

Given a query embedding, we retrieve the top-k most similar prototypes using inner product similarity:

```python
D, I = index.search(query, k)
```
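
To make the retrieval step concrete, the following is a minimal sketch of exact top-k inner-product search, written with NumPy in place of FAISS so that it runs without extra dependencies. It is not the MissRAG code: the helper names `l2_normalize` and `search_inner_product`, the random toy embeddings, and the embedding dimension of 16 are all assumptions made for illustration. The return values mirror the `D, I` convention used above (scores and indices of the nearest entries).

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length, so inner product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def search_inner_product(database: np.ndarray, query: np.ndarray, k: int):
    """Exact top-k inner-product search (hypothetical helper, not a FAISS API).

    Returns (D, I): similarity scores and database indices, best match first.
    """
    scores = query @ database.T                 # shape: (n_queries, n_database)
    I = np.argsort(-scores, axis=1)[:, :k]      # indices of the k largest scores
    D = np.take_along_axis(scores, I, axis=1)   # the corresponding scores
    return D, I


# Toy stand-ins for precomputed embeddings (assumption: random data, dim=16).
rng = np.random.default_rng(0)
video_embeddings = l2_normalize(rng.standard_normal((100, 16)).astype("float32"))

# Imitate a query from another modality that lies close to entry 42
# in the shared space: a slightly perturbed copy of that entry.
noise = 0.01 * rng.standard_normal((1, 16)).astype("float32")
audio_query = l2_normalize(video_embeddings[42:43] + noise)

D, I = search_inner_product(video_embeddings, audio_query, k=5)
print("top-5 indices:", I[0])
print("top-5 scores: ", D[0])
```

With FAISS, an exact inner-product index such as `faiss.IndexFlatIP` should produce the same ranking for the same inputs. Normalizing the embeddings first is a design choice of this sketch, not something the README states.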
Provider:
alessiasaporita