Data and Code for: From Unstructured Injury Narratives to Structured Risk Pathways

DataCite Commons | Updated 2026-05-01 | Indexed 2026-05-04
Download link:
https://data.mendeley.com/datasets/t2sks7fw5j
Description:
This dataset provides supporting data and code for the manuscript “From Unstructured Injury Narratives to Structured Risk Pathways: Revealing Recurrent Work–Hazard–Accident Mechanisms.” The study uses publicly available OSHA Severe Injury Reports from the U.S. Department of Labor as the original data source. The original OSHA data are not redistributed in full in this repository. Instead, this dataset provides materials that support the understanding and partial reproduction of the data-processing, annotation, structuring, and evaluation workflow reported in the manuscript.

The repository includes Jupyter notebooks for converting Doccano annotations into NER training data, training the NER model, profiling the graph-based accident representation, comparing text-based and graph-based scenario similarity, and generating representative subgraph visualizations. It also includes the label schema, example normalization rules, graph schema description, sample processed data, and summary evaluation results.

The full derived dataset generated in this study, including complete NER outputs, normalized graph-ready tables, and scenario-similarity results, is not publicly deposited because it contains author-generated annotations, intermediate modeling outputs, and research-specific normalization rules that are part of an ongoing research program. The full processed dataset may be made available from the corresponding author upon reasonable request, subject to institutional approval and research-use conditions.

Original data source: OSHA Severe Injury Reports, U.S. Department of Labor.
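
For readers unfamiliar with the first notebook step named in the description (converting Doccano annotations into NER training data), the following is a minimal sketch of what such a conversion can look like. It is not taken from the repository. It assumes a Doccano-style JSONL export in which each record carries a `text` field and a `label` (or `labels`) list of `[start, end, label]` character offsets, and it assumes simple whitespace tokenization. The function names and the label names `WORK`, `HAZARD`, and `ACCIDENT` are hypothetical placeholders; the actual label schema is the one shipped with the dataset.

```python
import json
import re


def spans_to_bio(text, spans):
    """Convert character-offset spans into token-level BIO tags.

    spans: list of [start, end, label], end exclusive (assumed export layout).
    Tokens are whitespace-delimited, which is an assumption of this sketch and
    not necessarily the tokenizer used in the authors' notebooks.
    """
    tokens, tags = [], []
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.span()
        tag = "O"
        for start, end, label in spans:
            # A token belongs to a span if the two character ranges overlap.
            if tok_start < end and tok_end > start:
                # The token covering the span start gets B-, later tokens get I-.
                prefix = "B-" if tok_start <= start else "I-"
                tag = prefix + label
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags


def doccano_line_to_bio(line):
    """Parse one JSONL record and return (tokens, tags)."""
    record = json.loads(line)
    # The key name varies between export versions; both are assumptions here.
    spans = record.get("label") or record.get("labels") or []
    return spans_to_bio(record["text"], spans)


if __name__ == "__main__":
    # Invented example record, not drawn from the OSHA data or the dataset.
    example = json.dumps({
        "text": "Employee fell from ladder while painting.",
        "label": [[9, 13, "ACCIDENT"], [19, 25, "HAZARD"], [32, 40, "WORK"]],
    })
    for token, tag in zip(*doccano_line_to_bio(example)):
        print(f"{token}\t{tag}")
```

Overlap-based matching is used so that trailing punctuation attached to a token (such as the final period above) does not cause an annotated span to be dropped; a subword tokenizer would need a different alignment step.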
Provider:
Mendeley Data
Created:
2026-05-01