ClarusC64/clinical-quad-oxygen-demand-buffer-lag-coupling-respiratory-collapse-v1.2
Source: Hugging Face. Updated 2026-03-23; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/ClarusC64/clinical-quad-oxygen-demand-buffer-lag-coupling-respiratory-collapse-v1.2
Resource description:
---
language: en
license: mit
task_categories:
- text-classification
tags:
- clinical-trials
- quad-coupling
- temporal-policy-stability
- clarus-v1.2
- respiratory-collapse
size_categories:
- 1K<n<10K
pretty_name: Clinical Quad Oxygen Demand Buffer Lag Coupling Respiratory Collapse v1.2
---
# Clinical Quad Oxygen Demand Buffer Lag Coupling Respiratory Collapse v1.2
## What this repo does
This repository contains a Clarus v1.2 benchmark dataset.
The v1.2 layer introduces **Temporal Policy Stability Geometry**.
Earlier versions evaluate:
- system state
- trajectory
- intervention selection
- closed-loop control
- counterfactual policy quality
v1.2 extends the framework to evaluate **policy handling across time**.
The benchmark asks:
- does the chosen policy remain correct as conditions evolve?
- when does a previously correct policy become unsafe?
- does the controller know when to maintain the current policy?
- does the controller know when to switch strategy?
This is not only a durability benchmark.
It is a benchmark for **temporally correct policy handling**.
## Core quad
The system is defined by four interacting variables.
- oxygen_demand
- buffer_capacity
- lag_burden
- coupling_stress
These variables define the state geometry of the system.
All higher-level signals describe how this state changes under pressure, intervention, drift, and feedback.
## Clinical variable mapping
| Quad Variable | Clinical Measurements | Typical Indicators |
|---|---|---|
| oxygen_demand | oxygen requirement, FiO2 burden, work of breathing pressure | escalating oxygen need, desaturation pressure, rising ventilatory demand |
| buffer_capacity | physiological reserve, lung reserve, hemodynamic tolerance | preserved reserve, compensatory space, tolerance to escalation |
| lag_burden | delayed correction load, unresolved instability debt | delayed intervention, prolonged hypoxemia, accumulated respiratory debt |
| coupling_stress | cross-system destabilization linking lungs, perfusion, and downstream organs | V/Q mismatch spillover, hemodynamic stress, organ strain |
## Prediction target
`label_respiratory_collapse_temporal_policy`
Binary classification.
- `1` = temporally correct policy handling
- `0` = temporally incorrect policy handling
A positive label includes two valid cases:
- the original policy remained correct over time and was correctly maintained
- the original policy became invalid and the controller correctly identified that policy reselection was required
## Label logic
A positive label means the controller handled temporal policy evolution correctly.
That includes:
### Stable-policy case
The original policy remains valid over time.
Typical properties:
- stabilization is real
- trajectory improves materially
- control remains aligned
- policy drift sensitivity stays low
- policy half-life stays high
- no reselection trigger is required
### Correct-switch case
The original policy degrades over time.
Typical properties:
- drift sensitivity rises
- policy half-life falls
- fragility increases
- a reselection trigger should fire
- the controller correctly identifies that the policy must change
This means v1.2 rewards both:
- correct policy retention
- correct switch detection
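
The two positive cases can be written as a small truth table. The sketch below is illustrative only: the function name and its two boolean inputs are hypothetical and are not columns in the dataset, and treating an unnecessary switch as incorrect handling is an assumption drawn from the stable-policy description ("no reselection trigger is required").

```python
def temporal_policy_label(policy_remains_valid: bool, controller_switches: bool) -> int:
    """Return 1 for temporally correct handling, 0 otherwise (illustrative only)."""
    # Stable-policy case: the policy is still valid and the controller keeps it.
    retained_correctly = policy_remains_valid and not controller_switches
    # Correct-switch case: the policy has become invalid and the controller switches.
    switched_correctly = (not policy_remains_valid) and controller_switches
    return int(retained_correctly or switched_correctly)


# Enumerate all four combinations of validity and controller behaviour.
for valid in (True, False):
    for switches in (True, False):
        print(valid, switches, temporal_policy_label(valid, switches))
```

Under these assumptions, the label is `1` exactly when the controller's decision matches the policy's validity, and `0` when it holds an invalid policy or abandons a valid one.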
## What v1.2 adds
Earlier versions ask:
- where the system is
- where it is moving
- which intervention is best
- whether that intervention remains robust under alternatives
v1.2 adds a temporal question:
- does the controller know when to stay with the current policy?
- and does it know when to switch?
This introduces:
- policy durability
- policy decay
- reselection timing
- temporal control correctness
The shift is precise.
From:
- choosing the correct policy
To:
- handling policy evolution correctly over time
## New v1.2 temporal policy signals
v1.2 introduces signals describing policy validity under drift.
| Signal | Meaning |
|---|---|
| `policy_drift_sensitivity` | how rapidly the chosen policy becomes invalid as conditions change |
| `adaptive_policy_margin` | tolerance before the policy must be changed |
| `stabilization_half_life` | duration for which the chosen policy remains effective |
| `policy_fragility_score` | sensitivity of the policy to small temporal or state perturbations |
| `policy_reselection_trigger` | binary indicator that the controller should switch policy |
| `policy_decay_rate` | rate at which policy effectiveness declines over time |
| `temporal_control_consistency` | whether control remains coherent across successive temporal states |
| `reselection_delay_cost` | cost incurred by failing to switch when reselection is needed |
These signals define **Temporal Policy Stability Geometry**.
## Example scenario
A representative v1.2 switch case:
- the initial intervention is correct at first
- the system begins to drift
- policy effectiveness decays
- a reselection trigger should fire
- the controller must switch rather than remain on the original path
In v1.2, this can still be a positive case.
The controller is rewarded not only for holding a stable policy,
but also for correctly detecting when the policy must change.
### Example numeric row pattern
| Signal | Value |
|---|---|
| `control_sequence_alignment_score` | 0.66 |
| `policy_drift_sensitivity` | 0.63 |
| `stabilization_half_life` | 0.38 |
| `policy_fragility_score` | 0.58 |
| `policy_reselection_trigger` | 1 |
| `temporal_control_consistency` | 0.43 |
| `reselection_delay_cost` | 0.27 |
| `label_respiratory_collapse_temporal_policy` | 1 |
Interpretation:
- the original policy does not remain valid
- the system correctly signals that policy reselection is needed
- the controller is rewarded for correct switch detection
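
The same row can be checked against the correct-switch properties listed under Label logic. This is a rough sketch, not the benchmark's labeling rule: the `0.5` cutoff and the helper name are hypothetical, and it assumes the signals are scaled to the range 0 to 1, which the example values suggest but the text does not state.

```python
# Values copied from the example row above.
example_row = {
    "control_sequence_alignment_score": 0.66,
    "policy_drift_sensitivity": 0.63,
    "stabilization_half_life": 0.38,
    "policy_fragility_score": 0.58,
    "policy_reselection_trigger": 1,
    "temporal_control_consistency": 0.43,
    "reselection_delay_cost": 0.27,
    "label_respiratory_collapse_temporal_policy": 1,
}

# Hypothetical cutoff, chosen only for illustration; not part of the benchmark.
CUTOFF = 0.5


def looks_like_switch_case(row: dict, cutoff: float = CUTOFF) -> bool:
    """Heuristic check for the degraded-policy pattern: high drift, short half-life,
    high fragility, and a fired reselection trigger."""
    return (
        row["policy_drift_sensitivity"] > cutoff
        and row["stabilization_half_life"] < cutoff
        and row["policy_fragility_score"] > cutoff
        and row["policy_reselection_trigger"] == 1
    )


print(looks_like_switch_case(example_row))
```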
## Row structure
Each row includes:
- core system variables
- trajectory signals
- boundary geometry
- regime transition signals
- intervention competition signals
- control sequence signals
- temporal policy signals
- perturbation and recovery signals
- quad delta signals
- final outcome fields
This allows a model to evaluate not only whether a policy works,
but whether it continues to work as the system changes.
## Files
- `data/train.csv`
Full training dataset with labels and all temporal policy signals.
- `data/tester.csv`
Evaluation-style dataset with the `stabilization_success` and `label_respiratory_collapse_temporal_policy` columns removed.
- `scorer.py`
Reference scorer computing binary metrics, temporal policy diagnostics, temporal miss diagnostics, and control diagnostics.
- `benchmark_spec.json`
Machine-readable benchmark specification.
- `dataset_schema.json`
Machine-readable schema with types, ranges, and row order.
- `README.md`
This document.
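
A quick header check can confirm that the two CSV files relate as described. The sketch below uses only the standard library. The helper names are hypothetical, and the check that `data/tester.csv` otherwise shares the training columns is an assumption, since the text only states which two columns are removed.

```python
import csv
import io

# Columns the file list says are removed from data/tester.csv.
HIDDEN_COLUMNS = ["stabilization_success", "label_respiratory_collapse_temporal_policy"]


def read_header(handle) -> list:
    """Return the column names from the first row of an open CSV file."""
    return next(csv.reader(handle))


def check_split_headers(train_header, tester_header, hidden=HIDDEN_COLUMNS):
    """Return a list of problems; an empty list means the headers are consistent."""
    problems = []
    for col in hidden:
        if col not in train_header:
            problems.append(f"train is missing {col}")
        if col in tester_header:
            problems.append(f"tester still contains {col}")
    # Assumption: apart from the hidden columns, both files share the same columns.
    expected = [c for c in train_header if c not in hidden]
    if sorted(expected) != sorted(tester_header):
        problems.append("tester feature columns differ from train")
    return problems


# Tiny in-memory stand-ins for the real files.
train = io.StringIO(
    "oxygen_demand,stabilization_success,label_respiratory_collapse_temporal_policy\n0.7,1,1\n"
)
tester = io.StringIO("oxygen_demand\n0.7\n")
print(check_split_headers(read_header(train), read_header(tester)))
```

With the real files, the same functions could be called on `open("data/train.csv", newline="")` and `open("data/tester.csv", newline="")`.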
## Evaluation
Primary metric:
- `recall_temporally_correct_policy_handling`
This is recall on the positive class: the fraction of rows with temporally correct policy handling (label `1`) that the model also predicts as `1`.
The positive class includes both:
- keeping a policy that remains valid
- switching when the policy becomes invalid
Secondary metric:
- `false_temporal_handling_rate`
This measures how often the model predicts temporally correct handling when the policy handling was actually wrong.
Binary metrics:
- accuracy
- precision
- recall
- f1
- confusion matrix
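
The binary metrics above can be reproduced from a confusion matrix. The following is a minimal sketch, not the logic of `scorer.py`: it assumes the primary metric is positive-class recall and that `false_temporal_handling_rate` is the false positive rate, FP / (FP + TN). The reference scorer may define either differently.

```python
def binary_metrics(y_true, y_pred):
    """Compute basic binary metrics for 0/1 labels (illustrative definitions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Avoid division by zero when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,  # assumed to match recall_temporally_correct_policy_handling
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_temporal_handling_rate": ratio(fp, fp + tn),  # assumed definition
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: actual 0, actual 1
    }


# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(name, value)
```
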
Temporal policy diagnostics:
- `temporal_policy_path_accuracy`
- `policy_drift_sensitivity_error`
- `adaptive_policy_margin_error`
- `stabilization_half_life_error`
- `policy_fragility_score_error`
- `policy_reselection_trigger_accuracy`
- `policy_decay_rate_error`
- `temporal_control_consistency_error`
- `reselection_delay_cost_error`
Temporal miss diagnostics:
- `high_uncertainty_temporal_miss_rate`
- `narrow_window_temporal_miss_rate`
Control diagnostics:
- `control_sequence_alignment_accuracy`
- `control_horizon_error`
- `feedback_response_accuracy`
- `intervention_timing_accuracy`
- `adaptation_latency_error`
- `control_stability_error`
- `recovery_consistency_error`
- `recalibration_overuse_rate`
- `controller_oscillation_misread_rate`
- `terminal_pathway_state_accuracy`
## Dataset construction
The dataset is generated using structured temporal scenarios.
Typical generation steps:
1. construct an initial system state from the quad variables
2. simulate one or more intervention policies
3. evolve the system through temporal drift
4. measure whether the original policy remains valid
5. identify whether and when reselection becomes necessary
6. compute temporal policy signals
7. assign the benchmark label based on correctness of temporal policy handling
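
Steps 3 to 6 can be pictured with a toy drift model. Everything in the sketch below is an assumption made for illustration: the exponential decay form, the `min_effectiveness` threshold, and the function name are hypothetical, and the dataset's actual generator is not described at this level of detail.

```python
import math


def simulate_policy_decay(initial_effectiveness, decay_rate, min_effectiveness, horizon):
    """Toy model: policy effectiveness decays exponentially over discrete steps."""
    trajectory = [
        initial_effectiveness * math.exp(-decay_rate * t) for t in range(horizon + 1)
    ]
    # First step at which effectiveness drops below the threshold, if any.
    switch_step = next(
        (t for t, e in enumerate(trajectory) if e < min_effectiveness), None
    )
    return {
        "trajectory": trajectory,
        "reselection_needed": int(switch_step is not None),
        "switch_step": switch_step,
        # Half-life of an exponential decay, in steps.
        "half_life_steps": math.log(2) / decay_rate if decay_rate > 0 else math.inf,
    }


# Retention-style scenario: slow decay, the policy stays above the threshold.
retention = simulate_policy_decay(0.9, decay_rate=0.01, min_effectiveness=0.5, horizon=10)
# Switch-style scenario: fast decay, the policy falls below the threshold.
switch = simulate_policy_decay(0.9, decay_rate=0.2, min_effectiveness=0.5, horizon=10)
print(retention["reselection_needed"], retention["switch_step"])
print(switch["reselection_needed"], switch["switch_step"])
```

In this toy setting, a higher decay rate shortens the half-life and brings the reselection point earlier, which mirrors the relationship between `policy_decay_rate`, `stabilization_half_life`, and `policy_reselection_trigger` described in the signal table.
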
Temporal scenarios are designed to capture two key classes:
### Policy retention cases
The original policy remains valid across the relevant horizon.
### Policy switch cases
The original policy becomes invalid and reselection is required.
This means v1.2 is designed to test the controller’s capacity for temporal judgment,
not just policy endurance.
## Running the scorer
```text
python scorer.py data/train.csv predictions.csv
python scorer.py data/train.csv predictions.csv --verbose
```
## Dataset limitations
This dataset evaluates structural temporal policy handling, not exact real-world treatment precision.
Important limits:
- temporal drift is represented structurally rather than as a full mechanistic simulation
- policy reselection logic is generated at dataset construction time
- temporal signals represent control quality and decay, not literal bedside protocols
- domain abstractions may omit real-world noise, staffing effects, or intervention delays outside the modeled geometry
These datasets should be treated as benchmarks for temporal control reasoning.
## Intended use
This dataset is intended for:
- control policy evaluation
- temporal decision stability testing
- reinforcement learning benchmarking
- adaptive controller benchmarking
- model robustness research under drift
This dataset is not intended for:
- direct clinical decision making
- diagnosis
- treatment recommendation
- deployment without external validation
- use as a sole decision system
## Structural note
v1.2 marks the move from:
- choosing the correct policy
to:
- handling policy evolution correctly across time
That means v1.2 evaluates both:
- stable-policy retention
- correct switch detection
This makes v1.2 the first Clarus layer that formally tests whether a controller knows:
- when to stay
- when to switch
## Position in the Clarus ladder
- v0.1: detection
- v0.2: trajectory
- v0.3: cascade forecasting
- v0.4: boundary discovery
- v0.5: recovery geometry
- v0.6: intervention reasoning
- v0.7: uncertainty geometry
- v0.8: regime transition
- v0.9: intervention competition
- v1.0: closed-loop control
- v1.1: counterfactual and adversarial policy testing
- v1.2: temporal policy stability
## Production deployment
This dataset format is suitable for controlled benchmarking in domains where policy validity changes over time.
Examples include:
- ICU stabilization under evolving state
- respiratory escalation under delayed response
- sepsis management under shifting instability
- adaptive control in distributed systems
- sequential intervention planning in high-risk environments
## Enterprise and research collaboration
Clarus evaluates stability and control in complex systems.
The goal is not simply to predict what happens next.
The goal is to determine:
- whether the chosen action remains valid
- whether the controller can track changing conditions
- whether the system knows when the correct move is to switch
v1.2 extends Clarus from control correctness into temporal control correctness.
## License
MIT
Provider:
ClarusC64



