Digital Humanities Multi-Task Image Classification Dataset (DHMTIC)

Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/8069424
Description:
This dataset was assembled to train a neural network on a multi-task, multi-class classification problem and to support parameterized image searches for research queries in the humanities. Specially trained neural networks can process large quantities of textual and visual data more efficiently, and parameterized image searches allow detailed analysis of visual data, for example identifying copies of an image or tracking the evolution and reuse of image motifs. The images come from diverse research fields, including art and architecture, design, and the life sciences.

Each image is labeled for two tasks: media type and content type. The content type, which refers to the subject depicted in the image, has 14 classes; the media type, which denotes whether the image is a graphic, photograph, or drawing, has 3 classes. The class distribution is skewed in both tasks: a single class holds most of the entries and the remaining classes hold far fewer. In the content type, the "object" class contains the bulk of the data, whereas classes such as "musical notation" and "map" form the tail, each with fewer than 100 examples in the training and validation subsets. In the media type, the "graphic" class contains the majority of samples, and the "photography" and "drawing" classes contain considerably fewer.
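The skew described above is commonly handled by weighting each class inversely to its frequency when training. The following is a minimal sketch of that idea, applied separately to the two tasks. The function name, the label lists, and the counts are invented for illustration and do not reflect the actual DHMTIC counts or the dataset authors' own training setup; the formula `N / (K * count)` is one widely used heuristic, not the only choice.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = N / (K * count).

    N is the number of samples and K the number of distinct classes,
    so rare classes get weights above 1 and dominant classes below 1.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    num_classes = len(counts)
    return {cls: total / (num_classes * n) for cls, n in counts.items()}


# Toy labels, invented for illustration only (NOT real DHMTIC counts).
content_labels = ["object"] * 8 + ["map"] * 1 + ["musical notation"] * 1
media_labels = ["graphic"] * 7 + ["photography"] * 2 + ["drawing"] * 1

# One weight table per task, since each image carries two labels.
content_weights = inverse_frequency_weights(content_labels)
media_weights = inverse_frequency_weights(media_labels)

for task, weights in (("content", content_weights), ("media", media_weights)):
    for cls, weight in sorted(weights.items()):
        print(f"{task:8s} {cls:18s} {weight:.3f}")
```

Weights of this kind would typically be passed to a per-task weighted loss, so that errors on tail classes such as "map" count for more than errors on the dominant class.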
Created:
2023-06-28