Layered-Labs/neiss-injury-data
Source: Hugging Face. Updated 2026-03-27; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Layered-Labs/neiss-injury-data
Resource description:
---
language:
- en
license: other
pretty_name: "NEISS Injury Data"
tags:
- neiss
- injury
- emergency-medicine
- epidemiology
- public-health
- cpsc
- tabular
task_categories:
- other
size_categories:
- 1M<n<10M
---
<h1 align="center">NEISS Injury Data</h1>
<h3 align="center">7.3 million emergency department injury records from 2005 to 2024, consolidated into a single query-ready Parquet file.</h3>
<p align="center">
<a href="https://huggingface.co/datasets/Layered-Labs/neiss-injury-data">
<img src="https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-md.svg" alt="Dataset on HuggingFace"/>
</a>
<a href="https://github.com/Layered-Labs/neiss-injury-data">
<img src="https://img.shields.io/badge/GitHub-black?logo=github" alt="GitHub"/>
</a>
<img src="https://img.shields.io/badge/license-OTHER-blue.svg" alt="License"/>
</p>
<p align="center">
<img src="https://raw.githubusercontent.com/Layered-Labs/assets/main/neiss.svg" alt="NEISS Injury Data" width="100%"/>
</p>
---
## Overview
NEISS Injury Data consolidates publicly available National Electronic Injury Surveillance
System (NEISS) records, covering 7,326,429 emergency department (ED) visits associated
with consumer products, sports, and activities from 2005 through 2024. Each record is one ED
visit drawn from the CPSC-operated probability sample of approximately 100 US hospital emergency
departments. A year column added during consolidation enables cross-year queries on the full
20-year dataset without joining separate annual files. All records are de-identified and publicly
available from the US Consumer Product Safety Commission.
---
## Statement of Need
Injury surveillance research typically requires downloading and joining 20 separate NEISS annual
files, each with slightly different column encodings and file formats. This dataset consolidates
all available NEISS records from 2005 through 2024 into a single Parquet file, enabling
cross-year queries on the full 20-year record without custom ETL. Researchers studying injury
trends, product safety, sports medicine, or emergency department utilization can load the full
dataset in a single call and apply NEISS sampling weights to produce national estimates.
---
## Intended Use
This dataset is intended for researchers, epidemiologists, and public health analysts studying
consumer product injury trends, sports and recreational injury patterns, emergency department
utilization, and population-level injury burden. It is suitable for academic research, policy
analysis, and data journalism. It is not intended for individual-level clinical decision support
or any application that treats these de-identified surveillance records as individual patient data.
---
## Limitations
NEISS is a probability sample of approximately 100 US hospital emergency departments and is
designed to produce national estimates via the Weight column, not to enumerate individual cases.
Raw row counts do not represent actual injury counts. Narrative fields are free text and may
contain inconsistent formatting. Diagnosis and product codes require the official NEISS coding
manual for interpretation. The dataset covers ED visits only and does not include injuries
treated in other settings or fatalities without an ED visit. Race and ethnicity coding has
changed across years and should be interpreted with caution in longitudinal analyses.
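
To make the difference between raw row counts and weighted estimates concrete, the sketch below tallies both per year. It assumes records arrive as plain dicts with `year` and `Weight` keys (for example, rows yielded by iterating `ds["train"]` from the `datasets` library). The helper name `estimates_by_year` and the sample values are hypothetical and are not real NEISS data.

```python
from collections import defaultdict

def estimates_by_year(rows):
    """Return {year: (raw_row_count, weighted_estimate)} for an iterable of record dicts."""
    raw = defaultdict(int)
    weighted = defaultdict(float)
    for row in rows:
        raw[row["year"]] += 1                  # sampled ED visits in the file
        weighted[row["year"]] += row["Weight"]  # contribution to the national estimate
    return {year: (raw[year], weighted[year]) for year in sorted(raw)}

# Hypothetical records for illustration only; not real NEISS data.
sample_rows = [
    {"year": 2023, "Weight": 15.5},
    {"year": 2023, "Weight": 80.0},
    {"year": 2024, "Weight": 4.5},
]

for year, (n_rows, estimate) in estimates_by_year(sample_rows).items():
    print(f"{year}: {n_rows} sampled rows, estimated {estimate:,.1f} national ED visits")
```

On the full dataset, a vectorized pandas `groupby("year")["Weight"].sum()` is likely the more practical route; the explicit loop is only meant to show the arithmetic.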
---
## Dataset Structure
### Splits
| Split | Rows |
|-------|------|
| train | 7,326,429 |
### Features
| Column | Type | Description |
|--------|------|-------------|
| `CPSC_Case_Number` | `int64` | Unique case identifier assigned by CPSC. |
| `Treatment_Date` | `timestamp` | Date of the emergency department visit. |
| `Age` | `int16` | Patient age in years. |
| `Sex` | `int8` | 1=Male, 2=Female. |
| `Body_Part` | `int16` | Primary body part injured (NEISS code). |
| `Diagnosis` | `int16` | Primary diagnosis code (e.g., 57=Fracture). |
| `Disposition` | `int8` | Case outcome code (1=Treated/Released, 4=Admitted). |
| `Product_1` | `int32` | Primary product or activity associated with the injury (NEISS code). |
| `Narrative_1` | `string` | Free-text narrative describing the injury circumstances. |
| `Weight` | `float32` | Sample weight for computing national injury estimates. |
| `year` | `int16` | Calendar year of the ED visit (added during consolidation). |
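
Since several columns store numeric codes, a small lookup step can make results easier to read. The sketch below decodes only the values listed in the feature table (`Sex` and `Disposition`) and defers every other code to the official NEISS coding manual. The `decode` helper and the sample record are hypothetical.

```python
# Labels copied from the feature table; all other codes are left undecoded.
SEX_LABELS = {1: "Male", 2: "Female"}
DISPOSITION_LABELS = {1: "Treated/Released", 4: "Admitted"}

def decode(code, labels):
    """Return the label for a code, or a pointer to the coding manual if it is unlisted."""
    return labels.get(code, f"code {code} (see NEISS coding manual)")

# Hypothetical record for illustration only; not real NEISS data.
row = {"Sex": 2, "Disposition": 4}
print(decode(row["Sex"], SEX_LABELS), "/", decode(row["Disposition"], DISPOSITION_LABELS))
```

Falling back to a placeholder instead of raising keeps unlisted codes visible in the output rather than silently dropping those rows.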
---
## Usage
```python
from datasets import load_dataset
ds = load_dataset("Layered-Labs/neiss-injury-data")
print(ds)
```
### Example
```python
import pandas as pd
df = pd.read_parquet("hf://datasets/Layered-Labs/neiss-injury-data/neiss_all.parquet")
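# Sampled records with Product_1 == 3235 (the code this card uses for pickleball), 2015-2024.
# Confirm product codes against the official NEISS coding manual before relying on them.
picks = df[(df["Product_1"] == 3235) & (df["year"] >= 2015)]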
# National estimate: sum sample weights
national_estimate = picks["Weight"].sum()
print(f"Estimated national pickleball injuries 2015-2024: {national_estimate:,.0f}")
```
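
Because `Narrative_1` is free text with inconsistent formatting (see Limitations), a keyword check usually needs to tolerate differences in case, spacing, and hyphenation. The following is a minimal sketch of one such check using only the standard library. The pattern, the helper name `mentions_pickleball`, and the sample narratives are illustrative assumptions and are not drawn from the dataset.

```python
import re

# Matches "pickleball", "pickle ball", and "pickle-ball" in any letter case.
PICKLEBALL_PATTERN = re.compile(r"pickle\s*-?\s*ball", re.IGNORECASE)

def mentions_pickleball(narrative):
    """Return True if the narrative text mentions pickleball; handles missing values."""
    return bool(narrative) and PICKLEBALL_PATTERN.search(narrative) is not None

# Hypothetical narratives for illustration only; not real NEISS data.
narratives = [
    "62YOM FELL PLAYING PICKLE  BALL, WRIST PAIN",
    "pt tripped while playing Pickle-ball",
    "FELL OFF LADDER AT HOME",
    None,
]
print([mentions_pickleball(text) for text in narratives])
```

With pandas, the same pattern could be applied to the column via `df["Narrative_1"].str.contains(...)` with `na=False`, which avoids a Python-level loop.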
---
## Citation
If you use this dataset in your research, please cite:
```bibtex
@dataset{layeredlabs2025neiss,
title = {NEISS Injury Data 2005-2024},
author = {{Layered Labs}},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/Layered-Labs/neiss-injury-data},
note = {Consolidated from US Consumer Product Safety Commission NEISS annual files}
}
```
---
## License
Released under the [OTHER License](LICENSE).
Maintained by [Layered Labs](https://layeredlabs.ai).
Provider:
Layered-Labs



