albertge/datacomp-small-subset-500000
Source: Hugging Face. Updated 2025-02-07; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/albertge/datacomp-small-subset-500000
Description:
The dataset contains user IDs, URLs, text content, image dimensions, similarity scores, face bounding boxes, hash values, embedding vectors, and cluster labels. It is suited to image and text analysis tasks such as similarity detection, face recognition, and cluster analysis. The training set contains 500,000 examples, and the full dataset is over 3 GB in size.
Provider:
albertge
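
Because the full dataset is over 3 GB, it can be useful to inspect a few rows before downloading everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the training set is exposed under the split name `train` (the listing does not state the split name or the column names, so both are treated as unknowns here). The helper name `peek` is hypothetical.

```python
from itertools import islice

# Repository id taken from the listing above.
DATASET_ID = "albertge/datacomp-small-subset-500000"


def peek(n=3, split="train"):
    """Return the first n rows without downloading the whole dataset.

    The split name "train" is an assumption; the listing only mentions
    a training set.
    """
    # Imported lazily so this module loads even if `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True yields rows on demand instead of fetching all files first.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(islice(stream, n))


if __name__ == "__main__":
    for row in peek():
        # Column names are not given in the listing, so print whatever keys exist.
        print(sorted(row.keys()))
```

Running the script requires network access to Hugging Face (or a configured mirror). Printing the keys first is a cheap way to confirm which of the described fields are actually present.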



