
claritystorm/osha-workplace-injuries

Hugging Face · Updated 2026-03-31 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/claritystorm/osha-workplace-injuries
Resource description:
---
license: other
license_name: public-domain
task_categories:
- tabular-classification
- tabular-regression
tags:
- workplace-safety
- osha
- insurance
- esg
- compliance
- united-states
pretty_name: OSHA Workplace Injuries & Illness 2016-2023
size_categories:
- 1M<n<10M
---

# OSHA Workplace Injuries & Illness 2016–2023

Every establishment-level workplace injury and illness record submitted to the OSHA Injury Tracking Application (ITA) since 2016. **2.38 million records** across 8 survey years covering 600K+ unique establishments, with NAICS codes, DART/TCIR rates, and detailed injury/illness breakdowns. The only public dataset linking individual establishment safety performance to industry benchmarks.

| 📊 Records | 📅 Coverage | 🏷️ License | 🔄 Updated |
|-----------|-------------|-----------|-----------|
| 2.38M+ records | 2016–2023 (8 years) | Public Domain | Annual |

**This repo contains a free 1,000-row sample.** Full dataset (CSV + Parquet) → **[claritystorm.com/datasets/osha-injuries](https://claritystorm.com/datasets/osha-injuries)**

---

## Quick Start

```python
from datasets import load_dataset
import pandas as pd

# Load the 1,000-row sample
ds = load_dataset("claritystorm/osha-workplace-injuries")
df = ds["train"].to_pandas()

# Total workplace deaths by year
print(df.groupby("survey_year")["total_deaths"].sum())

# Industries with highest average DART rate
print(df.groupby("industry_description")["dart_rate"].mean()
        .sort_values(ascending=False).head(10))

# Establishments with zero injuries (benchmark group)
safe_pct = df["no_injuries_illnesses"].mean() * 100
print(f"Establishments with zero injuries: {safe_pct:.1f}%")

# Size class distribution
print(df["size_class"].value_counts())
```

## Use Cases

- **Workplace safety risk scoring**: predict DART/TCIR rates from NAICS code, size class, and historical performance
- **ESG & responsible investing**: screen company supply chains and subsidiaries for OSHA safety performance
- **Insurance underwriting**: establishment-level injury rates for workers' compensation risk modeling
- **OSHA compliance benchmarking**: compare an establishment's safety record to industry averages
- **Industry safety trend analysis**: 8-year panel data tracks safety improvements and deterioration by sector
- **Human capital ML**: injury/illness rates as a feature for company quality and labor conditions scoring

## Schema (selected fields)

| Field | Type | Description |
|-------|------|-------------|
| survey_year | int | OSHA reporting year (2016–2023) |
| estab_name | string | Establishment name |
| company_name | string | Parent company name |
| state | string | US state (2-letter code) |
| naics_code | string | 6-digit NAICS code |
| industry_description | string | Industry description |
| size_class | string | Establishment size class |
| annual_average_employees | int | Annual average employee count |
| total_hours_worked | int | Total hours worked |
| total_deaths | int | Total workplace deaths |
| total_dafw_cases | int | Days Away From Work cases |
| total_injuries | int | Total injuries |
| total_resp_conditions | int | Respiratory condition cases |
| dart_rate | float | DART rate per 100 FTE (derived) |
| tcir_rate | float | TCIR rate per 100 FTE (derived) |

## ⬇️ Get the Full Dataset

| Tier | Price | Includes |
|------|-------|----------|
| Sample | Free | 1,000 rows, Public Domain (this repo) |
| Complete | $79 | Full 2.38M+ rows, CSV + Parquet, commercial license |
| Annual | $149/yr | Complete + annual updates |

👉 **[Purchase at claritystorm.com/datasets/osha-injuries](https://claritystorm.com/datasets/osha-injuries)**

## Source

US Occupational Safety and Health Administration (OSHA), Injury Tracking Application (ITA). OSHA ITA data is a US federal government work in the **public domain** (17 U.S.C. 105). Processed and structured by [ClarityStorm Data](https://claritystorm.com).
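
## Note on the derived rates

The schema describes `dart_rate` and `tcir_rate` as derived rates per 100 FTE, but the card does not say how they are computed. The sketch below assumes the standard OSHA incidence-rate convention, in which 200,000 hours stands for 100 full-time workers (40 hours a week, 50 weeks a year). The function name `incidence_rate` and the constant `HOURS_PER_100_FTE` are hypothetical and are not part of the dataset or of any library.

```python
import math

# Assumption: standard OSHA base of 100 workers x 40 hours/week x 50 weeks/year
HOURS_PER_100_FTE = 200_000


def incidence_rate(cases: int, total_hours_worked: float) -> float:
    """Return cases per 100 full-time-equivalent workers.

    Returns NaN when hours are missing or non-positive, because the
    rate is undefined without a valid exposure figure.
    """
    if total_hours_worked is None or total_hours_worked <= 0:
        return math.nan
    return cases * HOURS_PER_100_FTE / total_hours_worked


# Hypothetical establishment: 5 qualifying cases over 500,000 hours worked
print(incidence_rate(5, 500_000))  # 2.0
```

Under this convention, a DART-style rate would take the count of cases involving days away, restriction, or transfer as `cases`, and a TCIR-style rate would take all recordable cases. Comparing the result against the published `dart_rate` or `tcir_rate` column is a quick way to check whether the assumption holds for this dataset.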
Provider:
claritystorm