HARRISON Social Media Image Dataset
Payititi (帕依提提), indexed 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26412.html
Resource overview:
HARRISON is an image dataset for hashtag recommendation on social media photos. It comprises 57,383 images with an average of 4.5 hashtags per image, and every hashtag is drawn from a vocabulary of the 1,000 most frequently used hashtags.

A hashtag is any word prefixed with the "#" character, as used in online social network services (SNS) such as Facebook, Twitter, and Instagram. As these networks have grown, hashtags have become a common way to summarize the content of a post and to attract followers' attention, which makes recommending appropriate hashtags both an interesting and a practically useful task.

We present a novel benchmark for image hashtag recommendation called HARRISON, short for HAshtag Recommendation for Real-world Images in Social Networks. The dataset is a realistic corpus of 57,383 photos from Instagram, each with an average of 4.5 associated hashtags (minimum 1, maximum 10). The ground-truth hashtags for each image are taken from the 1,000 most frequently used hashtags and are encoded as numbers according to their frequency rank.
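The frequency-rank encoding described above can be illustrated with a minimal sketch. It is not the dataset's own tooling: the function names `build_vocab` and `encode`, the toy `posts` data, the zero-based IDs, and the alphabetical tie-breaking are all assumptions made for illustration, since the listing does not describe the file format or the exact numbering scheme.

```python
from collections import Counter


def build_vocab(tag_lists, size=1000):
    """Map the `size` most frequent hashtags to their frequency rank (0 = most frequent)."""
    counts = Counter(tag for tags in tag_lists for tag in tags)
    # Sort by descending count, then alphabetically so ties are deterministic (assumption).
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return {tag: rank for rank, tag in enumerate(ranked[:size])}


def encode(tags, vocab):
    """Replace hashtags with their rank IDs, dropping out-of-vocabulary hashtags."""
    return [vocab[t] for t in tags if t in vocab]


# Toy posts standing in for per-image hashtag lists (hypothetical data).
posts = [
    ["love", "sunset", "beach"],
    ["love", "food"],
    ["love", "sunset"],
]

vocab = build_vocab(posts, size=3)
encoded = [encode(p, vocab) for p in posts]
print(vocab)
print(encoded)
```

With a vocabulary of size 3, the hashtag "food" falls outside the vocabulary and is dropped from the second post, which mirrors how restricting ground truth to the 1,000 most frequent hashtags discards rarer ones.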
Provider:
Payititi (帕依提提)
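Collected summary
Dataset introduction

Background and challenges
Background overview
The HARRISON social media image dataset contains 57,383 images collected from Instagram, each annotated with an average of 4.5 high-frequency hashtags, and is suited to research on image hashtag recommendation.
The above content was collected and summarized by 遇见数据集.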



