OpenOneRec/openonerec_multimodal_embedding
Source: Hugging Face. Updated 2026-03-17; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/OpenOneRec/openonerec_multimodal_embedding
Description:
# Original Data
## Basic Information
Table schema:
- `pid BIGINT`
- `vision_emb ARRAY<ARRAY<DOUBLE>>`
- `text_emb ARRAY<DOUBLE>`
Notes:
- `vision_emb` is the image embedding
- `text_emb` is the text embedding
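A minimal Python sketch of how one row of this schema might be handled. The schema itself comes from the card; everything else is an assumption for illustration: the helper name `coverage` is hypothetical, rows are assumed to arrive as plain Python lists, and a missing embedding is assumed to be either `None` or an empty list (the card does not say how missing values are encoded). The nested type of `vision_emb` suggests it may hold several vectors per pid, but the card does not confirm this.

```python
from typing import List, Optional


def coverage(
    vision_emb: Optional[List[List[float]]],
    text_emb: Optional[List[float]],
) -> str:
    """Classify a row by which embeddings are present.

    Assumption: a missing embedding is None or an empty list.
    """
    has_vision = bool(vision_emb)  # None and [] both count as missing
    has_text = bool(text_emb)
    if has_vision and has_text:
        return "both"
    if has_vision:
        return "image_only"
    if has_text:
        return "text_only"
    return "none"


# Hypothetical row: pid is a BIGINT, vision_emb is a list of vectors,
# text_emb is a single vector. The values are made up.
row = {"pid": 1, "vision_emb": [[0.1, 0.2], [0.3, 0.4]], "text_emb": None}
label = coverage(row["vision_emb"], row["text_emb"])
```

With the row above, `label` is `"image_only"` because `text_emb` is missing.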
## Coverage Summary
The table currently contains `17,433,569` pids in total.
Breakdown:
- Pids with both image and text embeddings: `15,647,227`
- Pids with image embedding only: `1,411,004`
- Pids with text embedding only: `375,338`
In short:
- About 89.8% of pids have both image and text embeddings
- About 8.1% have an image embedding only
- About 2.2% have a text embedding only
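The breakdown can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative.

```python
# Counts as stated in the coverage summary.
counts = {
    "both": 15_647_227,
    "image_only": 1_411_004,
    "text_only": 375_338,
}

# The three groups are disjoint, so they should add up to the stated total.
total = sum(counts.values())

# Share of each group, as a percentage rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Pids that have each modality at all, regardless of the other one.
with_image = counts["both"] + counts["image_only"]
with_text = counts["both"] + counts["text_only"]

print(total, shares, with_image, with_text)
```

The three groups sum to the stated total of 17,433,569, so the card does not appear to count any pid that lacks both embeddings.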
Provider:
OpenOneRec



