
SbNet Signboard Detection and Classification Dataset

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://doi.org/10.7910/DVN/UOT0RE
Description:
Phase 1: Signboard Detection Dataset

This phase focuses on detecting signboards in street images.

- **Total Images:** 8,366
- **Image Format:** JPG (8,366 images)
- **Resolution:**
  - Minimum: (720, 443)
  - Maximum: (9,280, 8,285)
  - Mean: (4,202, 3,138)
  - Median: (4,032, 3,024)
- **Aspect Ratio:**
  - Minimum: 0.5625
  - Maximum: 5.7043
  - Mean: 1.3691
  - Most Frequent: 1.3333
  - Standard Deviation: 0.2329
- **File Size (KB):**
  - Minimum: 88.19 KB
  - Maximum: 41,266.50 KB
  - Mean: 5,796.19 KB
  - Total Dataset Size: 48,490,924.91 KB
- **Color Statistics:**
  - Color Mode: RGB (8,366 images)
  - Mean Color (RGB): (110.32, 112.77, 118.16)
  - Standard Deviation (RGB): (65.71, 65.36, 65.82)
- **Brightness:**
  - Average: 114.10

---

Phase 2: Region of Text Interest (RTI) Detection Dataset

This phase focuses on detecting specific text regions (names and addresses) within signboards.

- **Total Images:** 8,036
- **Image Format:** JPG (8,036 images)
- **Resolution:**
  - Minimum: (552, 156)
  - Maximum: (9,228, 4,682)
  - Mean: (2,753, 808)
  - Median: (2,741, 781)
- **Aspect Ratio:**
  - Minimum: 0.9615
  - Maximum: 11.3835
  - Mean: 3.6058
  - Most Frequent: 4.0
  - Standard Deviation: 1.2475
- **File Size (KB):**
  - Minimum: 40.54 KB
  - Maximum: 7,968.94 KB
  - Mean: 653.67 KB
  - Total Dataset Size: 5,252,868.26 KB
- **Color Statistics:**
  - Color Mode: RGB (8,036 images)
  - Mean Color (RGB): (137.58, 136.29, 144.00)
  - Standard Deviation (RGB): (47.26, 49.73, 50.89)
- **Brightness:**
  - Average: 138.74

---

Named Entity Recognition (NER) Dataset

This dataset is used for categorizing extracted text from signboards.

- **Total Entries:** 42,547
- **Unique Categories:** 10
- **Category Distribution:**
  - Religious Sites: 10,641
  - Retail Outlets: 8,275
  - Educational Institutions: 6,826
  - Healthcare Institutions: 4,708
  - Restaurants: 3,868
  - Pharmacies: 3,637
  - Parks: 1,547
  - Banks: 1,121
  - Stations: 1,094
  - Hotels: 830

#### **Word Count Statistics:**

- **Overall Word Count:**
  - Maximum: 18
  - Minimum: 1
  - Mean: 3.82
- **Category-Wise Word Count:**
  - **Banks:** Mean: 4.65, Max: 11, Min: 1
  - **Educational Institutions:** Mean: 4.60, Max: 18, Min: 1
  - **Healthcare Institutions:** Mean: 4.02, Max: 16, Min: 1
  - **Religious Sites:** Mean: 4.36, Max: 17, Min: 1
  - **Retail Outlets:** Mean: 3.08, Max: 15, Min: 1
  - **Restaurants:** Mean: 3.36, Max: 13, Min: 1
  - **Pharmacies:** Mean: 2.91, Max: 13, Min: 1
  - **Parks:** Mean: 3.10, Max: 11, Min: 1
  - **Stations:** Mean: 3.72, Max: 17, Min: 1
  - **Hotels:** Mean: 3.12, Max: 12, Min: 1

This dataset is structured for a two-phase object detection pipeline with an additional text classification task to categorize extracted text from detected regions.
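The NER figures above can be cross-checked against each other with a short script. The following is a minimal sketch, not part of the dataset's own tooling: the dictionaries simply restate the published counts and per-category means, and the helper names (`category_shares`, `weighted_mean_words`, `imbalance_ratio`) are hypothetical names chosen for illustration.

```python
# Figures copied from the NER dataset description; nothing here reads the dataset itself.
CATEGORY_COUNTS = {
    "Religious Sites": 10_641,
    "Retail Outlets": 8_275,
    "Educational Institutions": 6_826,
    "Healthcare Institutions": 4_708,
    "Restaurants": 3_868,
    "Pharmacies": 3_637,
    "Parks": 1_547,
    "Banks": 1_121,
    "Stations": 1_094,
    "Hotels": 830,
}

MEAN_WORDS = {
    "Banks": 4.65,
    "Educational Institutions": 4.60,
    "Healthcare Institutions": 4.02,
    "Religious Sites": 4.36,
    "Retail Outlets": 3.08,
    "Restaurants": 3.36,
    "Pharmacies": 2.91,
    "Parks": 3.10,
    "Stations": 3.72,
    "Hotels": 3.12,
}

STATED_TOTAL = 42_547      # "Total Entries" in the description
STATED_MEAN_WORDS = 3.82   # overall mean word count in the description


def category_shares(counts):
    """Return each category's fraction of all entries."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def weighted_mean_words(counts, means):
    """Combine per-category mean word counts, weighted by category size."""
    total = sum(counts.values())
    return sum(counts[name] * means[name] for name in counts) / total


def imbalance_ratio(counts):
    """Ratio of the largest category to the smallest one."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    print("sum of categories:", sum(CATEGORY_COUNTS.values()))
    print("weighted mean words:", round(weighted_mean_words(CATEGORY_COUNTS, MEAN_WORDS), 2))
    print("largest/smallest category:", round(imbalance_ratio(CATEGORY_COUNTS), 2))
    for name, share in category_shares(CATEGORY_COUNTS).items():
        print(f"{name}: {share:.1%}")
```

Under these inputs the category counts sum to the stated 42,547 entries and the size-weighted mean reproduces the stated overall mean of 3.82 words. The largest category (Religious Sites) is roughly 12.8 times the size of the smallest (Hotels), which may be worth accounting for when training a classifier on this data.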
Created:
2025-04-01