alessiasaporita/MissRAG

Name: alessiasaporita/MissRAG
Creator: alessiasaporita
Published: 2026-03-30 10:23:59
License: No description available

Hugging Face · Updated 2026-03-30 · Indexed 2026-04-12

Download link:

https://hf-mirror.com/datasets/alessiasaporita/MissRAG


Resource description:

---
license: apache-2.0
tags:
- multimodal
- embeddings
- retrieval
- rag
- audio
- video
- text
- multimodal-llm
---

# MissRAG Embeddings & Modality Tokens

This repository provides **precomputed multimodal embeddings and modality tokens** used in the MissRAG framework:

> **MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models**

---

## 📌 Overview

We release:

- 🔹 **ImageBind-based embeddings** for multimodal retrieval
- 🔹 **Precomputed modality tokens** (audio/video) for efficient inference

These representations are designed to:

- enable **retrieval across modalities**
- support **missing modality scenarios**
- accelerate inference in multimodal LLMs

---

## 📁 Repository Structure

The repository is organized as follows:

```bash
MissRAG
├── IB_embeddings/          # ImageBind embeddings
├── modality_tokens/
│   ├── chatbridge/         # ChatBridge modality tokens
│   └── onellm/             # OneLLM modality tokens
```

### 1. Multimodal Embeddings

We provide precomputed multimodal embeddings obtained using [ImageBind](https://github.com/facebookresearch/ImageBind) which:

- align **audio**, **video**, and **text** modalities into a shared representation space
- enable **cross-modal similarity computation**
- support efficient **retrieval via inner product similarity**

This unified space allows querying with any available modality to retrieve semantically related samples from missing modalities.

### 2. Modality Tokens

We release precomputed modality-specific tokens for audio and video modalities. These tokens are directly compatible with:

- [ChatBridge](https://github.com/CASIA-IVA-Lab/ChatBridge)
- [OneLLM](https://github.com/csuhan/onellm)

Precomputing modality tokens provides the following advantages:

- **Computational efficiency**: eliminates redundant forward passes over training data
- **Faster inference**: enables real-time retrieval-augmented generation
- **Scalability**: supports large-scale retrieval without recomputing representations

## ⚙️ Usage

### 🔹 Retrieval

We perform retrieval using **FAISS** for efficient nearest neighbor search in the embedding space.

Given a query embedding, we retrieve the top-k most similar prototypes using inner product similarity:

```python
D, I = index.search(query, k)
```
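To make the retrieval step concrete, here is a minimal NumPy-only sketch of what an exact inner-product top-k search computes. It mirrors the `(D, I)` return convention of `index.search`, where `D` holds the similarity scores and `I` the indices of the retrieved prototypes. Everything in it is an assumption for illustration: the helper names `l2_normalize` and `search_inner_product` are hypothetical, the random vectors stand in for the released embeddings, the 16-dimensional size is arbitrary, and the L2 normalization is added here only so that inner product coincides with cosine similarity on the toy data. It is not the MissRAG implementation.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so inner product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def search_inner_product(prototypes: np.ndarray, queries: np.ndarray, k: int):
    """Return (D, I): top-k inner-product scores and prototype indices per query."""
    scores = queries @ prototypes.T                        # (n_queries, n_prototypes)
    I = np.argsort(-scores, axis=1, kind="stable")[:, :k]  # highest score first
    D = np.take_along_axis(scores, I, axis=1)              # scores of the selected rows
    return D, I


# Toy data standing in for the released embeddings (hypothetical values and size).
rng = np.random.default_rng(0)
prototypes = l2_normalize(rng.standard_normal((100, 16)).astype("float32"))

# Reuse two prototypes as queries, so each query's best match is itself.
queries = prototypes[[3, 42]]

D, I = search_inner_product(prototypes, queries, k=5)
```

With FAISS, the analogous exact index is `faiss.IndexFlatIP`, which is populated with `index.add(...)` before calling `index.search(query, k)`. The full sort above is fine for a toy example but scales poorly, which is one reason to use a dedicated library for large collections.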
