Digital Humanities Multi-Task Image Classification Dataset (DHMTIC)
Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/8069424
Description:
This dataset was assembled to train a neural network on a multi-task, multi-class classification problem and to support parameterized image searches for research questions in the humanities. Both uses matter: specially trained neural networks can process large quantities of textual and visual data more efficiently, and parameterized image searches permit detailed examination and analysis of visual data, enabling tasks such as identifying image copies or tracking the evolution and reuse of image motifs. The images come from diverse research fields, including art and architecture, design, and life sciences. Each image is labeled for two tasks: media type classification and content type classification. The content type, which refers to the subject depicted in the image, has 14 classes; the media type, which denotes whether the image is a graphic, a photograph, or a drawing, has 3 classes. The class distribution is skewed in both tasks: a single class holds a large number of entries and the remaining classes hold fewer. In the content type, the "object" class contains the bulk of the data, whereas classes such as "musical notation" and "map" form the tail, each with fewer than 100 examples in the training and validation subsets. Similarly, in the media type, the "graphic" class contains the majority of samples, with the "photography" and "drawing" classes containing considerably fewer.
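Because both label sets are skewed, a training setup would likely need to compensate for the rare classes. The sketch below shows one common heuristic, inverse-frequency class weights computed separately for each task. It is an illustration only and not a method described by the dataset authors: the function name `inverse_frequency_weights` is hypothetical, and the label lists are toy values rather than actual DHMTIC counts.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels for illustration only (assumed, not the real class counts).
content_labels = ["object"] * 8 + ["map"] * 1 + ["musical notation"] * 1
media_labels = ["graphic"] * 6 + ["photography"] * 3 + ["drawing"] * 1

# One weight table per task, since each image carries two independent labels.
content_weights = inverse_frequency_weights(content_labels)
media_weights = inverse_frequency_weights(media_labels)

for task, weights in (("content", content_weights), ("media", media_weights)):
    for cls, weight in sorted(weights.items(), key=lambda item: item[1]):
        print(f"{task:8s} {cls:18s} {weight:.3f}")
```

Weights of this kind could then be passed to a per-task weighted loss, so that tail classes such as "map" or "drawing" contribute more per example than the dominant class.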
Created:
2023-06-28



