
PlutonicRocks-13: A Class-imbalanced Image Dataset of Plutonic Rocks

DataCite Commons, updated 2026-03-26, indexed 2026-05-05
Download link:
https://www.scidb.cn/detail?dataSetId=090f1c9e55494dd084c45ed74a7d978f
Description:
Lithology recognition is one of the fundamental skills for geologists. With the rise of artificial intelligence (AI), a fundamental challenge and opportunity in geosciences lies in translating expert geological knowledge into AI models capable of delivering intelligent lithological recognition services, enabling geoscience enthusiasts and non-geologists to accurately identify rock types. In natural environments, the spatial distribution of surface rocks is highly heterogeneous, resulting in rock image datasets that typically follow a long-tailed distribution. Taking plutonic rocks as an example, this study adopts the classification and nomenclature scheme from the textbook Petrology (edited by Yu Bingsong et al.) to present PlutonicRocks-13, an imbalanced dataset for rock image recognition. The dataset includes 13 common types of plutonic rocks, with a total of 4,785 images and an original data size of 2.49 GB. The rock types included in this dataset are: olivine, pyroxenite, hornblendite, gabbro, diorite, monzonite, syenite, nepheline syenite, granodiorite, monzogranite, syenogranite, plagiogranite and graphic granite. Rock images were primarily collected from field outcrops and hand specimens from museums, supplemented by online sources. After careful screening, processing, and annotation, these images were curated into PlutonicRocks-13, a dataset tailored for rock image classification. To ensure annotation quality, quality control and evaluation procedures were implemented, including thin-section petrographic verification and bias detection based on explainable deep learning techniques. Furthermore, by converting annotated labels into question-answer pairs, this dataset can be used for instruction tuning of multimodal models, enabling them to perform rock image classification through natural language instructions.
This image dataset provides reliable data support for research on automated rock image recognition and holds significant reference value for geological surveys, surficial substrate investigations, and public geoscience education.
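As a rough illustration of the label-to-question-answer conversion mentioned in the description, the following is a minimal sketch only. The record fields (image, question, answer), the question wording, and the example file path are assumptions made for illustration; the dataset's actual file layout and the authors' own conversion procedure are not specified here.

```python
from typing import Dict, List, Tuple

# The 13 class names exactly as listed in the dataset description.
CLASSES: List[str] = [
    "olivine", "pyroxenite", "hornblendite", "gabbro", "diorite",
    "monzonite", "syenite", "nepheline syenite", "granodiorite",
    "monzogranite", "syenogranite", "plagiogranite", "graphic granite",
]


def label_to_qa(image_path: str, label: str) -> Dict[str, str]:
    """Turn one (image, label) annotation into a question-answer record.

    The record schema and prompt wording are hypothetical, not the
    authors' format.
    """
    if label not in CLASSES:
        raise ValueError(f"unknown class label: {label!r}")
    question = (
        "Which type of plutonic rock is shown in this image? "
        "Choose one of: " + ", ".join(CLASSES) + "."
    )
    return {"image": image_path, "question": question, "answer": label}


def build_qa_pairs(annotations: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Convert a list of (image_path, label) pairs into QA records."""
    return [label_to_qa(path, label) for path, label in annotations]


if __name__ == "__main__":
    # Hypothetical path, used only to show the shape of one record.
    for record in build_qa_pairs([("images/gabbro_0001.jpg", "gabbro")]):
        print(record)
```

Listing all class names in the question keeps the model's answer within the closed label set; an open-ended question without options would be an equally plausible design.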
Provider:
Science Data Bank
Created:
2025-12-22