
dacl1k | Bridge Damage Detection Dataset | Image Classification Dataset

Source: arXiv. Updated 2023-09-07. Indexed 2024-06-21.
Bridge damage detection
Image classification
Download link:
https://github.com/phiyodr/building-inspection-toolkit
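Summary:
dacl1k is a multi-label classification dataset for real-world bridge damage detection, created by the University of the Bundeswehr Munich. It contains 1,474 images taken from actual building inspections and covers several damage types, including cracks, efflorescence, spalling, exposed rebar, and rust. The images were annotated by professional engineers, which supports the quality and accuracy of the labels. The dataset is mainly used to evaluate and improve bridge damage recognition models, particularly their performance in real-world settings.
Provider:
University of the Bundeswehr Munich
Created:
2023-09-07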
AI-Collected Summary
Dataset Introduction
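Construction
dacl1k is built from real bridge inspections and contains 1,474 images. To keep the dataset diverse and practical, the researchers collected the images from different bridges, photographed from varying camera angles, under varying lighting conditions, and with different hardware. Civil engineers annotated the defects according to German inspection standards. The annotations cover five common bridge defect types (crack, efflorescence, spalling, exposed rebar, and rust) plus a no-damage label. To further test how well models transfer to practice, the researchers also built several meta datasets by combining four open-source reinforced concrete defect datasets.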
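Because an image can carry more than one of these labels, each annotation is naturally represented as a multi-hot vector. The sketch below illustrates that encoding. The label names and their order in `CLASSES` are assumptions made for illustration; the identifiers used in the dataset's own files may differ.

```python
# Hypothetical label names and order (an assumption for illustration only).
CLASSES = ["Crack", "Efflorescence", "Spalling", "ExposedRebar", "Rust", "NoDamage"]


def encode_labels(present):
    """Return a multi-hot list with 1 at the index of every label in `present`."""
    unknown = set(present) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in CLASSES]


def decode_labels(vector):
    """Return the label names whose position in `vector` is set to 1."""
    if len(vector) != len(CLASSES):
        raise ValueError("vector length must match the number of classes")
    return [name for name, flag in zip(CLASSES, vector) if flag == 1]


if __name__ == "__main__":
    # An image showing both a crack and rust.
    vec = encode_labels(["Crack", "Rust"])
    print(vec)
    print(decode_labels(vec))
```

A fixed class order matters here: the same index must refer to the same defect type in both the ground truth and the model output.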
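Usage
Working with dacl1k involves downloading the data, training a model, and evaluating it. First, download dacl1k and the meta datasets from GitHub. Then use either a pretrained model or a model trained from scratch to recognize reinforced concrete defects. During training, different transfer learning methods and data augmentation techniques can be applied to improve performance. Finally, evaluate the model on the test split using Exact Match Ratio and classwise recall. The researchers also provide an intrinsic evaluation based on bottleneck feature analysis to investigate the causes of the observed model performance.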
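The two evaluation metrics named above can be sketched in plain Python. This is a minimal illustration using the standard definitions of Exact Match Ratio (share of samples whose whole label vector is predicted correctly) and per-class recall (TP / (TP + FN)); it is not the authors' evaluation code, and the function names and toy data are hypothetical.

```python
def exact_match_ratio(y_true, y_pred):
    """Fraction of samples whose full predicted label vector equals the ground truth."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def classwise_recall(y_true, y_pred):
    """Recall per class, TP / (TP + FN); None for a class with no positive samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    recalls = []
    for c in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        recalls.append(tp / (tp + fn) if (tp + fn) else None)
    return recalls


if __name__ == "__main__":
    # Toy data with three classes and four images (made up for illustration).
    y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
    y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]
    print(exact_match_ratio(y_true, y_pred))  # 0.5
    print(classwise_recall(y_true, y_pred))   # [1.0, 0.5, 1.0]
```

Exact Match Ratio is strict: a single wrong label on an image counts the whole image as a miss, which is why it is usually reported alongside per-class recall.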
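Background and Challenges
Background
In bridge safety assessment, recognizing reinforced concrete defects (RCDs) is essential for determining structural integrity, traffic safety, and durability. Most existing RCD datasets come from a small number of bridges, with images captured at specific camera poses, under specific lighting conditions, and with fixed hardware, which limits their usefulness in real-world scenarios. To address this, the researchers created dacl1k, a multi-label RCD classification dataset of 1,474 images drawn from building inspections. The dataset was created in 2023; the main researchers are from the Institute for Structural Engineering and the Institute for Distributed Intelligent Systems at the University of the Bundeswehr Munich, Germany. Its goal is to evaluate how useful models trained on open-source data are in real-world scenarios and to explore how best to exploit existing open-source knowledge. dacl1k is presented as the first multi-label RCD dataset sourced from real-world inspections. It supports progress in RCD recognition and gives researchers and practitioners a way to test models on real-world data.
Current Challenges
The main challenges around dacl1k are: 1) real-world RCD recognition, which requires models to cope with varied bridge types, environmental conditions, and image quality; 2) dataset construction, including data collection, annotation, and cleaning; 3) training and evaluating models on real-world data, including generalization, domain shift, and class imbalance. Addressing them calls for better dataset construction methods, more advanced training strategies, and deeper performance analysis.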
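Common Scenarios
Typical Use
dacl1k is mainly used for recognizing reinforced concrete defects (RCDs), a key factor in determining a bridge's structural integrity, traffic safety, and durability. It contains 1,474 images from actual building inspections, labeled with classes such as crack, efflorescence, spalling, exposed rebar, and rust. Its diversity makes it a highly realistic multi-label classification dataset for training and testing models that identify and analyze bridge defects.
Academic Problems Addressed
dacl1k addresses the limitations of existing RCD datasets, which usually come from specific bridges and are captured at specific camera poses, under specific lighting conditions, and with fixed hardware, restricting how well models trained on them work in real-world scenarios. dacl1k provides highly diverse images from real-world bridge inspections, allowing a better assessment of model generalization. Through intrinsic and extrinsic evaluation, the accompanying study also analyzes the features the models learn and how well they separate the classes, providing data support for academic research on RCDs.
Practical Applications
Practical applications of dacl1k include bridge structural integrity assessment, traffic safety management, and durability analysis. Models trained on dacl1k can be used to inspect bridges and to identify and analyze defects, providing a reference for maintenance and repair. dacl1k can also support the development of bridge safety monitoring systems that track bridge condition in real time.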
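Recent Research
Latest Research Directions
Recent research on dacl1k targets the recognition of reinforced concrete defects (RCDs) in real bridge inspections. The dataset comes from real bridge inspections and contains 1,474 images, offering rich and varied data for multi-label classification. The research focuses on how models trained on open-source data perform in real scenarios and on how to optimize training strategies to improve accuracy and generalization. The results indicate that models trained on open-source data fall short in real scenarios, but that improved training strategies and combining open-source data with real data improve performance significantly. The study also finds that the models mainly learned to distinguish between datasets rather than between defect types, which points to a new direction for future work. The dacl1k dataset and the trained models will be released publicly, giving researchers and practitioners a platform for testing models in practice.
Related Papers
  • dacl1k: Real-World Bridge Damage Dataset Putting Open-Source Data to the Test. University of the Bundeswehr Munich, 2023.
The content above was collected and summarized by AI.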