
NetherlandsForensicInstitute/vuurwerkverkenner-training-data

Hugging Face | Updated 2024-05-21 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/NetherlandsForensicInstitute/vuurwerkverkenner-training-data
Resource description:
---
license: eupl-1.1
language:
- nl
---

# NFI Fireworks training dataset for the "Vuurwerkverkenner" application

The Netherlands Forensic Institute (NFI) Fireworks training dataset consists of scans of fireworks wrappers from fireworks that were investigated in casework in the Netherlands between 2010 and early 2024. For a subset of these wrappers, photographs of fireworks snippets (pieces of the wrapper after the detonation) were added.

## Data overview

The data consists of 331 wrappers in 184 categories. There are 37 categories with more than one wrapper, and snippets are available for 38 wrapper categories.

### Data collection

For the fireworks wrappers, all wrappers physically available at the NFI Explosives team from casework between 2010 and 2024 were scanned.

For the collection of the fireworks snippets, fireworks were exploded and the resulting snippets were gathered by hand. Snippets were cleaned, dried and sorted by wrapper, then photographed on a white background. One snippet photo contains multiple snippets.

Sampling was based on the availability of fireworks in the data collection period. To diversify and enlarge the dataset, additional wrappers were selected, printed and glued to Cobra 6 fireworks. This selection was made in collaboration with domain experts.

## Data structure

Each folder (e.g. fireworks_0) is a firework category. Firework types have been grouped if their wrappers were the same or very similar. For instance, variants of the same fireworks from different years, such as a Cobra 6 from 2018 and 2019, were considered very similar. Wrappers were categorised in collaboration with domain experts.

Each folder contains at least one wrapper. Additionally, a folder can contain multiple snippet photos (snippets_0.jpg, snippets_1.jpg etc.) if they were available for that wrapper category.

The data structure visualised:

```
└─── fireworks_0
     └─── wrapper_0.jpg
     └─── wrapper_1.jpg
     └─── wrapper_2.jpg
     └─── snippets_0.jpg
     └─── snippets_1.jpg
└─── fireworks_1
     └─── wrapper_0.jpg
└─── fireworks_2
     └─── wrapper_0.jpg
     └─── wrapper_1.jpg
└─── fireworks_3
     └─── wrapper_0.jpg
     └─── snippets_0.jpg
```

## Firework wrappers

Wrappers have been scanned and the images have been reduced in size to a maximum of 2000 pixels in either height or width.

### Categorisation

Categories with more than one wrapper:

| Category      | Number of wrappers |
| ------------- | ------------------ |
| fireworks_2   | 2                  |
| fireworks_6   | 2                  |
| fireworks_12  | 3                  |
| fireworks_13  | 2                  |
| fireworks_17  | 2                  |
| fireworks_24  | 2                  |
| fireworks_25  | 2                  |
| fireworks_30  | 3                  |
| fireworks_32  | 2                  |
| fireworks_33  | 2                  |
| fireworks_37  | 3                  |
| fireworks_46  | 3                  |
| fireworks_51  | 5                  |
| fireworks_53  | 2                  |
| fireworks_56  | 3                  |
| fireworks_59  | 3                  |
| fireworks_61  | 58                 |
| fireworks_62  | 9                  |
| fireworks_63  | 2                  |
| fireworks_64  | 3                  |
| fireworks_74  | 2                  |
| fireworks_78  | 3                  |
| fireworks_82  | 4                  |
| fireworks_92  | 2                  |
| fireworks_97  | 2                  |
| fireworks_111 | 2                  |
| fireworks_117 | 2                  |
| fireworks_135 | 2                  |
| fireworks_139 | 4                  |
| fireworks_144 | 8                  |
| fireworks_145 | 11                 |
| fireworks_151 | 12                 |
| fireworks_153 | 4                  |
| fireworks_154 | 4                  |
| fireworks_158 | 3                  |
| fireworks_166 | 2                  |
| fireworks_179 | 4                  |

Categories with snippet collections:

| Category      | Number of snippet collections |
| ------------- | ----------------------------- |
| fireworks_8   | 3                             |
| fireworks_12  | 4                             |
| fireworks_17  | 6                             |
| fireworks_30  | 1                             |
| fireworks_37  | 1                             |
| fireworks_38  | 3                             |
| fireworks_47  | 1                             |
| fireworks_54  | 6                             |
| fireworks_58  | 5                             |
| fireworks_59  | 6                             |
| fireworks_61  | 14                            |
| fireworks_62  | 1                             |
| fireworks_71  | 3                             |
| fireworks_75  | 2                             |
| fireworks_82  | 2                             |
| fireworks_86  | 6                             |
| fireworks_87  | 2                             |
| fireworks_95  | 1                             |
| fireworks_97  | 2                             |
| fireworks_105 | 1                             |
| fireworks_106 | 1                             |
| fireworks_116 | 4                             |
| fireworks_119 | 1                             |
| fireworks_129 | 1                             |
| fireworks_137 | 1                             |
| fireworks_139 | 5                             |
| fireworks_144 | 5                             |
| fireworks_145 | 16                            |
| fireworks_148 | 2                             |
| fireworks_153 | 11                            |
| fireworks_154 | 2                             |
| fireworks_158 | 3                             |
| fireworks_160 | 1                             |
| fireworks_169 | 3                             |
| fireworks_174 | 1                             |
| fireworks_176 | 2                             |
| fireworks_179 | 1                             |
| fireworks_182 | 3                             |

## Training

The steps followed at the NFI to train the model with these data are described in the model card of the Vuurwerkverkenner at https://huggingface.co/NetherlandsForensicInstitute/vuurwerkverkenner (also linked to this data set).

### Possible bias

The reference set of firework wrappers depends on which fireworks were investigated by the NFI. It may not be representative of fireworks in other countries or in the future.

Data collection of snippets was influenced by the availability of fireworks. Due to low availability, some snippets are the result of wrappers glued to Cobra 6 fireworks instead of the original fireworks. We noticed that these wrappers tended to detach more easily than wrappers on original fireworks. This may have influenced the snippet creation.

Very small snippets were not included in our dataset.
Provided by:
NetherlandsForensicInstitute
Summary of the original information

NFI Fireworks Training Dataset Summary

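Dataset overview

Data content

  • Overview: Contains 331 fireworks wrappers in 184 categories.
  • Wrappers: All wrappers physically available from casework between 2010 and early 2024 were scanned.
  • Snippets: Photographs of snippets (pieces of the wrapper after detonation) are provided for 38 wrapper categories.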

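Data collection

  • Wrappers: All physically available wrappers were scanned.
  • Snippets: Fireworks were exploded, and the resulting snippets were gathered by hand, cleaned, dried and sorted by wrapper, then photographed on a white background.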

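Data structure

  • Folder structure: Each folder represents one firework category and contains at least one wrapper. If snippets are available for that category, the folder also contains snippet photos.
  • Categorisation: Wrappers were grouped by similarity, in collaboration with domain experts.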
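
As an illustration of the folder layout described above, the following minimal Python sketch indexes a local copy of the dataset by category. The function names `index_dataset` and `make_demo_tree` are hypothetical helpers introduced here, not part of any NFI tooling, and the demo tree consists of empty placeholder files rather than real scans. The sketch assumes the `fireworks_N` / `wrapper_N.jpg` / `snippets_N.jpg` naming from the dataset card.

```python
import re
import tempfile
from pathlib import Path

# File-name patterns taken from the dataset card (wrapper_0.jpg, snippets_0.jpg, ...).
WRAPPER_RE = re.compile(r"^wrapper_(\d+)\.jpg$")
SNIPPETS_RE = re.compile(r"^snippets_(\d+)\.jpg$")


def index_dataset(root):
    """Map each category folder to its wrapper and snippet file names.

    File names are sorted lexicographically, so wrapper_10.jpg sorts
    before wrapper_2.jpg.
    """
    index = {}
    for folder in sorted(Path(root).iterdir()):
        if not (folder.is_dir() and folder.name.startswith("fireworks_")):
            continue
        names = sorted(p.name for p in folder.iterdir() if p.is_file())
        index[folder.name] = {
            "wrappers": [n for n in names if WRAPPER_RE.match(n)],
            "snippets": [n for n in names if SNIPPETS_RE.match(n)],
        }
    return index


def make_demo_tree(root):
    """Create empty placeholder files that mimic the layout (not real data)."""
    layout = {
        "fireworks_0": ["wrapper_0.jpg", "wrapper_1.jpg", "wrapper_2.jpg",
                        "snippets_0.jpg", "snippets_1.jpg"],
        "fireworks_1": ["wrapper_0.jpg"],
    }
    for folder, files in layout.items():
        directory = Path(root) / folder
        directory.mkdir(parents=True)
        for name in files:
            (directory / name).touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_demo_tree(tmp)
        for category, files in index_dataset(tmp).items():
            print(category, len(files["wrappers"]), len(files["snippets"]))
```

Pointing `index_dataset` at the root of a downloaded copy should yield one entry per category, each with at least one wrapper if the copy follows the documented layout.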

Wrapper details

  • Multi-wrapper categories: 37 categories contain more than one wrapper.
  • Categories with snippets: 38 categories include snippet photos.
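
The headline figures can be cross-checked against the per-category tables in the dataset card. The sketch below copies the counts from those tables and relies on one inference from the card: any category not listed as having more than one wrapper holds exactly one. The function name `total_wrappers` is hypothetical.

```python
# Wrapper counts for categories with more than one wrapper,
# keyed by category number (fireworks_<n>), copied from the dataset card.
MULTI_WRAPPER = {
    2: 2, 6: 2, 12: 3, 13: 2, 17: 2, 24: 2, 25: 2, 30: 3, 32: 2, 33: 2,
    37: 3, 46: 3, 51: 5, 53: 2, 56: 3, 59: 3, 61: 58, 62: 9, 63: 2, 64: 3,
    74: 2, 78: 3, 82: 4, 92: 2, 97: 2, 111: 2, 117: 2, 135: 2, 139: 4,
    144: 8, 145: 11, 151: 12, 153: 4, 154: 4, 158: 3, 166: 2, 179: 4,
}

# Snippet-collection counts per category, copied from the dataset card.
SNIPPET_COLLECTIONS = {
    8: 3, 12: 4, 17: 6, 30: 1, 37: 1, 38: 3, 47: 1, 54: 6, 58: 5, 59: 6,
    61: 14, 62: 1, 71: 3, 75: 2, 82: 2, 86: 6, 87: 2, 95: 1, 97: 2, 105: 1,
    106: 1, 116: 4, 119: 1, 129: 1, 137: 1, 139: 5, 144: 5, 145: 16,
    148: 2, 153: 11, 154: 2, 158: 3, 160: 1, 169: 3, 174: 1, 176: 2,
    179: 1, 182: 3,
}

TOTAL_CATEGORIES = 184  # stated in the dataset card


def total_wrappers(multi, n_categories):
    """Total wrappers, assuming every unlisted category has exactly one."""
    single_wrapper_categories = n_categories - len(multi)
    return single_wrapper_categories + sum(multi.values())


if __name__ == "__main__":
    print("multi-wrapper categories:", len(MULTI_WRAPPER))
    print("categories with snippets:", len(SNIPPET_COLLECTIONS))
    print("total wrappers:", total_wrappers(MULTI_WRAPPER, TOTAL_CATEGORIES))
```

Under that assumption the tables reproduce the stated totals: 37 multi-wrapper categories, 38 categories with snippets, and 331 wrappers overall.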

Data processing

  • Image processing: Wrapper images were reduced in size to a maximum of 2000 pixels in height or width.
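
The dataset card does not say how the size reduction was implemented. One plausible reading, sketched below, scales an image so that its longer side is at most 2000 pixels while preserving the aspect ratio and never enlarging smaller images. The function name `fit_within` and the rounding behaviour are assumptions made for illustration.

```python
MAX_SIDE = 2000  # maximum height or width stated in the dataset card


def fit_within(width, height, max_side=MAX_SIDE):
    """Return (width, height) scaled so the longer side is at most max_side.

    Assumptions (not confirmed by the dataset card): the aspect ratio is
    preserved, images already within the limit are left unchanged, and
    dimensions are rounded to the nearest pixel.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


if __name__ == "__main__":
    print(fit_within(4000, 3000))  # landscape scan larger than the limit
    print(fit_within(1000, 500))   # already within the limit
```

With Pillow, `Image.thumbnail((2000, 2000))` performs a comparable in-place downscale that keeps the aspect ratio and does not enlarge images.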

Dataset use

  • Model training: Used to train the model behind the "Vuurwerkverkenner" application; the training steps are described in the model card.

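Possible bias

  • Representativeness: The dataset may not be fully representative of fireworks in other countries or in the future.
  • Snippet collection: Snippet collection was influenced by the availability of fireworks; some snippets come from wrappers glued to Cobra 6 fireworks rather than the original fireworks.
  • Snippet size: Very small snippets were not included.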