vSTS
Source: arXiv | Updated: 2018-09-11 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/1809.03695v1
Description:
The vSTS dataset was created by the IXA NLP Group at the University of the Basque Country to evaluate sentence similarity using both image and text information. It contains 819 instances, each consisting of a pair of images, their descriptions, and a similarity score from 0 to 5. The instances are drawn from subsets of the PASCAL VOC-2008 and 8k-Flickr datasets and filtered so that the two images in each pair are different. Human annotators assigned the scores based on the text alone, but the dataset gives systems access to both the images and the text, which makes it possible to test how much visual information contributes to a text understanding task. vSTS is mainly used to study how combining images with text affects sentence similarity evaluation, and more broadly how multimodal representations can be applied in natural language processing.
Provided by:
University of the Basque Country
Created:
2018-09-11



