ARTigo: Social Image Tagging (Aggregated Data)
Source: Mendeley Data. Updated: 2024-06-29. Indexed: 2024-06-28.
Download link:
https://zenodo.org/record/7882689
Description:
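ARTigo (https://www.artigo.org/) is a Citizen Science project that has been jointly developed at the Institute for Art History and the Institute for Informatics at Ludwig Maximilian University of Munich since 2010. It enables participants to tag artworks, fostering knowledge accumulation and democratizing access to a traditionally elitist field. ARTigo is built as an interactive web application that offers Games With a Purpose: players are presented with an image and then challenged to communicate with one another using visual or textual annotations within a given time. Through this playful approach, the project aims to inspire greater appreciation for art and to draw new audiences to museums and archives. It improves the discoverability of art-historical images while promoting inclusivity, effective communication, and collaborative research practices. The project's data are freely available to the wider research community for novel scientific investigations.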
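### File structure
The dataset is provided as a .jsonl file in which each line represents a single image and its associated metadata. The images themselves are provided separately in a .zip file.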
#### data.jsonl
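Each line in the .jsonl file represents a single image and its associated metadata, and has the following key-value pairs:
- `id`: a unique identifier for the image;
- `hash_id`: a unique identifier for the image based on its content (e.g., an image hash);
- `titles`: a list of titles associated with the image, with each title having the following key-value pairs:
  - `id`: a unique identifier for the title;
  - `name`: the name of the title;
- `creators`: a list of creators associated with the image, with each creator having the following key-value pairs:
  - `id`: a unique identifier for the creator;
  - `name`: the name of the creator;
- `location`: the location associated with the image;
- `institution`: the institution that holds the image;
- `source`: information about the source of the image, with the following key-value pairs:
  - `id`: a unique identifier for the source;
  - `name`: the name of the source;
  - `url`: the URL of the source;
- `tags`: a list of tags associated with the image, with each tag having the following key-value pairs:
  - `id`: a unique identifier for the tag;
  - `name`: the name of the tag;
  - `language`: the language of the tag (if available);
  - `count`: the number of times the tag has been applied to the image;
- `path`: the path to the image file.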
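The record layout above can be read line by line with a standard JSON parser. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `iter_records` and `top_tags` are hypothetical, and every value in the sample record is invented for illustration (including the value types, e.g. integer ids and plain-string `location` and `institution`, which the description does not specify).

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a .jsonl source."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline
            yield json.loads(line)


def top_tags(record: dict, n: int = 3) -> list:
    """Return (name, count) pairs for the n most frequently applied tags."""
    tags = record.get("tags") or []
    ranked = sorted(tags, key=lambda tag: tag.get("count", 0), reverse=True)
    return [(tag["name"], tag.get("count", 0)) for tag in ranked[:n]]


# Hypothetical sample line: keys follow the description, values are made up.
sample = json.dumps({
    "id": 1,
    "hash_id": "f42236be6580338e9b98b8e00c0f4e49",
    "titles": [{"id": 10, "name": "Example title"}],
    "creators": [{"id": 20, "name": "Example creator"}],
    "location": "Example city",
    "institution": "Example museum",
    "source": {"id": 30, "name": "Example source", "url": "https://example.org"},
    "tags": [
        {"id": 100, "name": "woman", "language": "en", "count": 3},
        {"id": 101, "name": "portrait", "language": "en", "count": 5},
        {"id": 102, "name": "hat", "language": None, "count": 1},
    ],
    "path": "f4/22/f42236be6580338e9b98b8e00c0f4e49.jpg",
})

for record in iter_records([sample]):
    print(record["hash_id"], top_tags(record, n=2))
```

With the real file, the same generator can be fed an open file handle, for example `iter_records(open("data.jsonl", encoding="utf-8"))`, so the whole dataset never has to be held in memory at once.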
#### media.zip
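The images themselves are stored in a .zip file. Each image is stored in a folder named after the first two characters of its `hash_id`. Within this folder, there is a sub-folder named after the next two characters of the `hash_id`. The image file itself is stored within that sub-folder and is named with the complete `hash_id` and the .jpg file extension. The folder structure within the .zip file is thus as follows:
root
├── f4
│   └── 22
│       └── f42236be6580338e9b98b8e00c0f4e49.jpg
├── 4c
│   └── d3
│       └── 4cd3f476b14abfcb2a91e6c8f2d356f6.jpg
└── ...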
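The naming rule above is simple enough to express as a small function. This is a sketch under stated assumptions: `media_path` and `read_image_bytes` are hypothetical helper names, and whether member names inside the archive carry an additional top-level folder prefix is not stated in the description, so the optional `prefix` argument is an assumption to adjust against the actual archive listing.

```python
import zipfile
from pathlib import PurePosixPath


def media_path(hash_id: str, prefix: str = "", extension: str = ".jpg") -> str:
    """Build the relative path <first 2 chars>/<next 2 chars>/<hash_id>.jpg."""
    if len(hash_id) < 4:
        raise ValueError("hash_id must have at least four characters")
    # prefix is an assumption: leave it empty unless the archive has a top-level folder
    return str(PurePosixPath(prefix, hash_id[:2], hash_id[2:4], hash_id + extension))


def read_image_bytes(archive, hash_id: str, prefix: str = "") -> bytes:
    """Read one image from the archive (a path or a binary file-like object)."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(media_path(hash_id, prefix))


print(media_path("f42236be6580338e9b98b8e00c0f4e49"))
# f4/22/f42236be6580338e9b98b8e00c0f4e49.jpg
```

If a lookup fails with a `KeyError`, listing a few entries with `zipfile.ZipFile("media.zip").namelist()` shows the actual member names and whether a prefix is needed.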
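### Terms of use
The data are provided "as is," without warranties of any kind. They are licensed under the Creative Commons Attribution-ShareAlike 4.0 International license and are updated monthly, so users can be confident they are accessing the most up-to-date information.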
Created: 2023-06-28



