Data from: Automated taxonomic identification of insects with expert-level accuracy using effective feature transfer from convolutional networks
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-04-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.20ch6p5
Description:
Rapid and reliable identification of insects is important in many
contexts, from the detection of disease vectors and invasive species to
the sorting of material from biodiversity inventories. Because of the
shortage of adequate expertise, there has long been an interest in
developing automated systems for this task. Previous attempts have been
based on laborious and complex handcrafted extraction of image features,
but in recent years it has been shown that sophisticated convolutional
neural networks (CNNs) can learn to extract relevant features
automatically, without human intervention. Unfortunately, reaching
expert-level accuracy in CNN identifications requires substantial
computational power and huge training datasets, which are often not
available for taxonomic tasks. This can be addressed using feature
transfer: a CNN that has been pretrained on a generic image classification
task is exposed to the taxonomic images of interest, and information about
its perception of those images is used in training a simpler, dedicated
identification system. Here, we develop an effective method of CNN feature
transfer, which achieves expert-level accuracy in taxonomic identification
of insects with training sets of 100 images or fewer per category.
Specifically, we extract rich representations of intermediate to
high-level image features from the CNN architecture VGG16 pretrained on
the ImageNet dataset. This information is fed into a linear support vector
machine classifier, which is trained on the target problem. We tested the
performance of our approach on two types of challenging taxonomic tasks:
(1) identifying insects to higher groups when they are likely to belong to
subgroups that have not been seen previously; and (2) identifying visually
similar species that are difficult to separate even for experts. For the
first task, our approach reaches > 92 % accuracy on one dataset
(884 face images of 11 families of Diptera, all specimens representing
unique species), and > 96 % accuracy on another (2936 dorsal
habitus images of 14 families of Coleoptera, over 90 % of specimens
belonging to unique species). For the second task, our approach
outperforms a leading taxonomic expert on one dataset (339 images of three
species of the Coleoptera genus Oxythyrea; 97 % accuracy), and both humans
and traditional automated identification systems on another dataset (3845
images of nine species of Plecoptera larvae; 98.6 % accuracy). Reanalyzing
several biological image identification tasks studied in the recent
literature, we show that our approach is broadly applicable and provides
significant improvements over previous methods, whether based on dedicated
CNNs, CNN feature transfer, or more traditional techniques. Thus, our
method, which is easy to apply, can be highly successful in developing
automated taxonomic identification systems even when training datasets are
small and computational budgets limited.
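
The two-stage idea described above (a frozen, pretrained feature extractor followed by a linear support vector machine trained on the target task) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration and is not the authors' code: `make_frozen_extractor` is a hypothetical stand-in for a pretrained CNN such as VGG16 (it uses fixed random 1-D filters, ReLU, and global average pooling), the "images" are synthetic 1-D signals with two classes, and `train_linear_svm` is a simple subgradient-descent solver for the hinge loss rather than a library SVM. In practice one would presumably swap in a real pretrained network and an established linear SVM implementation.

```python
import random


def make_frozen_extractor(patch_size=3, n_channels=8, seed=0):
    """Build a fixed feature extractor (hypothetical stand-in for a pretrained CNN)."""
    rng = random.Random(seed)
    # Filters are drawn once and never updated, mimicking frozen pretrained weights.
    filters = [[rng.gauss(0.0, 1.0) for _ in range(patch_size)]
               for _ in range(n_channels)]

    def extract(image):
        n_pos = len(image) - patch_size + 1
        features = []
        for f in filters:
            # Convolve, apply ReLU, then global-average-pool the channel.
            acts = [max(sum(w * image[p + k] for k, w in enumerate(f)), 0.0)
                    for p in range(n_pos)]
            features.append(sum(acts) / n_pos)
        return features

    return extract


def train_linear_svm(X, y, lr=0.01, lam=0.001, epochs=200, seed=0):
    """Fit w, b by stochastic subgradient descent on the L2-regularized hinge loss.

    Labels must be -1 or +1.
    """
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The regularizer shrinks w on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move the boundary toward classifying sample i.
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return the predicted label (+1 or -1) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def make_toy_images(n, rng, length=16):
    """Synthetic 1-D 'images': class +1 is bright, class -1 is inverted, plus noise."""
    images, labels = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        images.append([label + rng.gauss(0.0, 0.1) for _ in range(length)])
        labels.append(label)
    return images, labels


rng = random.Random(42)
extract = make_frozen_extractor()
train_images, train_labels = make_toy_images(40, rng)
test_images, test_labels = make_toy_images(20, rng)

# Step 1: run every image through the frozen extractor (no extractor training).
train_features = [extract(img) for img in train_images]
test_features = [extract(img) for img in test_images]

# Step 2: train only the small linear classifier on the extracted features.
w, b = train_linear_svm(train_features, train_labels)

correct = sum(predict(w, b, x) == t for x, t in zip(test_features, test_labels))
accuracy = correct / len(test_labels)
print(f"held-out accuracy: {accuracy:.2f}")
```

The design point the sketch tries to convey is that only the linear classifier is fitted to the target data, which is why small training sets and modest compute can suffice. The toy task is binary, whereas the identification tasks in the description are multi-class; a multi-class version would need, for example, one such classifier per category.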
Provider:
Dryad
Created:
2019-03-04



