Social Vision and Language Dataset (SVLD)
Source: arXiv. Updated: 2020-06-05. Indexed: 2024-06-21.
Download link:
https://cannylab.github.io/svld
Overview:
The Social Vision and Language Dataset (SVLD) is a public dataset created at the University of California, Berkeley. It is designed to advance multimodal learning by providing visual and language data that share the same context. The dataset contains 677,181 posts collected from social media, comprising 2.9 million post images, 488,000 post videos, 1.4 million comment images, 4.6 million comment videos, and 96.9 million comments. Because it combines images, videos, text, and social data, SVLD can support tasks such as image captioning, image classification, and sentiment analysis. The data was collected using the Imgur API and open-source crawler tools. SVLD aims to help address long-standing problems in multimodal learning, such as improving model performance on synchronized audio, video, and language information.
Provider:
University of California, Berkeley
Created:
2020-06-05



