ClarusC64/clinical-quad-oxygen-demand-buffer-lag-respiratory-lockin-v1.3
Hugging Face. Updated 2026-03-23; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/ClarusC64/clinical-quad-oxygen-demand-buffer-lag-respiratory-lockin-v1.3
---
language: en
license: mit
task_categories:
- text-classification
tags:
- clinical-trials
- quad-coupling
- failure-reconstruction
- clarus-v1.3
- respiratory-lockin
size_categories:
- 1K<n<10K
pretty_name: Clinical Quad Oxygen Demand Buffer Lag Respiratory Lockin v1.3
---
# Clinical Quad Oxygen Demand Buffer Lag Respiratory Lockin v1.3
What this repo does
This repository contains a Clarus v1.3 benchmark dataset.
The v1.3 layer introduces Failure Reconstruction Geometry.
Earlier Clarus layers evaluate:
• system state
• trajectory
• instability boundaries
• intervention selection
• control sequence correctness
• temporal policy handling
v1.3 adds a new capability.
The benchmark asks whether a model can reconstruct the causal path that produced a failure state.
The goal is not simply predicting collapse.
The goal is understanding how the collapse happened.
This means v1.3 evaluates whether a model can determine:
• which policy error initiated the cascade
• how the failure propagated through the system
• which step in the chain should have been corrected
• which counterfactual intervention would have prevented the lock-in
In practical terms this means the model must recover:
• the failure decision sequence
• the root policy error
• the counterfactual recovery step
Correct reconstruction requires all three.
Core quad
The system is defined by four interacting variables.
• oxygen_demand
• buffer_capacity
• lag_burden
• coupling_stress
These variables define the structural state of the system.
All higher-level signals describe how instability propagates across this quad.
Clinical variable mapping
| Quad Variable | Clinical Measurements | Typical Indicators |
|---|---|---|
| oxygen_demand | respiratory workload, metabolic demand, oxygen extraction pressure | tachypnea, high work of breathing, rising lactate |
| buffer_capacity | respiratory reserve, perfusion reserve, ventilatory margin | oxygen reserve, ventilatory tolerance |
| lag_burden | delayed correction load | delayed ventilation, late oxygen support |
| coupling_stress | cross-system destabilization | cardiac strain, inflammatory amplification |
These variables define the respiratory collapse geometry.
Prediction target
label_failure_reconstruction
Binary classification.
• 1 = correct reconstruction of the failure chain
• 0 = incorrect reconstruction
A positive prediction requires that the model correctly identifies:
• the failure decision sequence
• the root policy error
• the counterfactual recovery step
All three must match the gold scenario.
Partial reconstruction is not considered correct.
Label logic
A positive label requires four conditions.
• the case is reconstructable
• the predicted chain matches the gold causal chain
• the root policy error is correctly identified
• the correct recovery intervention is identified
Reconstruction quality is evaluated using ordered chain overlap.
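The four conditions combine as a logical AND. The sketch below is a minimal illustration, not the logic of scorer.py: the `Reconstruction` class and its field names are hypothetical, and "matches" is assumed to mean exact, order-sensitive equality of the chain.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Reconstruction:
    """Hypothetical container: ordered chain, root error, recovery step."""
    chain: Sequence[str]
    root_policy_error: str
    recovery_step: str


def label_failure_reconstruction(gold: Reconstruction,
                                 pred: Reconstruction,
                                 reconstructable: bool = True) -> int:
    """Return 1 only if all four conditions hold, else 0.

    Assumption: a chain "matches" when it is equal step by step, in order.
    """
    chain_ok = list(pred.chain) == list(gold.chain)
    root_ok = pred.root_policy_error == gold.root_policy_error
    step_ok = pred.recovery_step == gold.recovery_step
    return int(reconstructable and chain_ok and root_ok and step_ok)


gold = Reconstruction(
    chain=["policy_delay", "oxygen_supply_deficit", "ventilatory_failure"],
    root_policy_error="policy_delay",
    recovery_step="early_oxygen_support",
)
# Same root error and recovery step, but one chain step is missing.
partial = Reconstruction(
    chain=["policy_delay", "ventilatory_failure"],
    root_policy_error="policy_delay",
    recovery_step="early_oxygen_support",
)

print(label_failure_reconstruction(gold, gold))     # 1
print(label_failure_reconstruction(gold, partial))  # 0
```

The second call returns 0 even though two of the three components are right, which is the "partial reconstruction is not considered correct" rule.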
Failure reconstruction geometry
Earlier Clarus layers ask questions such as:
• where the system is
• where it is moving
• which intervention stabilizes it
• whether that intervention remains correct over time
v1.3 asks a different question.
Can the model explain why the failure occurred?
This requires reconstructing the causal chain.
Example failure chain:
policy_delay
→ oxygen_supply_deficit
→ respiratory_load_spike
→ ventilatory_compensation_failure
→ respiratory_lockin
The model must recover this ordered sequence.
Sequence ordering matters.
A reversed chain is not considered correct.
Ordered chain evaluation
Failure reconstruction uses Longest Common Subsequence (LCS).
This preserves causal order.
Example:
True chain
policy_delay > oxygen_supply_deficit > ventilatory_failure
Predicted chain
policy_delay > ventilatory_failure
Overlap score = 2 / 3 (the LCS has 2 steps; the true chain has 3)
Set-based overlap is not used because it loses ordering information.
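A minimal sketch of an LCS-based overlap score follows. The function names are hypothetical, and dividing by the true chain length is an assumption chosen because it reproduces the 2 / 3 example; the normalization in scorer.py may differ.

```python
from typing import List, Sequence


def parse_chain(text: str) -> List[str]:
    """Split 'a > b > c' into ['a', 'b', 'c']."""
    return [step.strip() for step in text.split(">") if step.strip()]


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def ordered_overlap(true_chain: Sequence[str], pred_chain: Sequence[str]) -> float:
    """LCS length divided by the true chain length (assumed normalization)."""
    if not true_chain:
        return 0.0
    return lcs_length(true_chain, pred_chain) / len(true_chain)


true_chain = parse_chain("policy_delay > oxygen_supply_deficit > ventilatory_failure")
pred_chain = parse_chain("policy_delay > ventilatory_failure")
reversed_chain = list(reversed(true_chain))

print(ordered_overlap(true_chain, pred_chain))      # 2 of 3 steps recovered in order
print(ordered_overlap(true_chain, reversed_chain))  # only 1 of 3 steps survives reversal
# A set-based overlap would score the reversed chain as a perfect match.
print(len(set(true_chain) & set(reversed_chain)) / len(set(true_chain)))  # 1.0
```

The reversed chain contains every correct step, so a set comparison cannot tell it apart from the true chain, while the LCS score drops to 1 / 3.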
What v1.3 adds
Earlier layers evaluate control correctness.
v1.3 evaluates causal understanding of failure.
The benchmark measures whether a system can:
• reconstruct cascade origins
• identify policy mistakes
• recover the counterfactual intervention
• distinguish early vs late intervention errors
This makes v1.3 the first Clarus layer that evaluates post-failure causal reasoning.
v1.3 reconstruction signals
| Signal | Meaning |
|---|---|
| failure_decision_sequence | ordered causal chain leading to collapse |
| failure_path_length | number of causal steps |
| error_propagation_factor | strength of policy error amplification |
| cascade_amplification_factor | degree of systemic amplification |
| recovery_window_width | remaining time window for intervention |
| label_root_policy_error | policy step responsible for cascade initiation |
| label_counterfactual_recovery_step | intervention that would have prevented collapse |
These signals define Failure Reconstruction Geometry.
Example scenario
Example respiratory lock-in cascade:
| Signal | Value |
|---|---|
| oxygen_demand | 0.82 |
| buffer_capacity | 0.28 |
| lag_burden | 0.64 |
| coupling_stress | 0.71 |
| failure_decision_sequence | policy_delay > oxygen_supply_deficit > ventilatory_failure |
| failure_path_length | 3 |
| cascade_amplification_factor | 0.76 |
| recovery_window_width | 0.18 |
| label_root_policy_error | policy_delay |
| label_counterfactual_recovery_step | early_oxygen_support |
| label_failure_reconstruction | 1 |
Interpretation:
• delayed intervention triggered the cascade
• oxygen deficit amplified instability
• ventilatory failure locked the system
Correct reconstruction requires identifying this chain.
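The example scenario can be written as a single record and checked for internal consistency. This is a sketch under stated assumptions: `check_row` is a hypothetical helper, the rule that the root policy error is the first chain step is inferred from this one example only, and the [0, 1] range for the quad variables is assumed rather than taken from dataset_schema.json.

```python
from typing import Dict, List, Union

Value = Union[float, int, str]

# Values copied from the example scenario.
row: Dict[str, Value] = {
    "oxygen_demand": 0.82,
    "buffer_capacity": 0.28,
    "lag_burden": 0.64,
    "coupling_stress": 0.71,
    "failure_decision_sequence":
        "policy_delay > oxygen_supply_deficit > ventilatory_failure",
    "failure_path_length": 3,
    "cascade_amplification_factor": 0.76,
    "recovery_window_width": 0.18,
    "label_root_policy_error": "policy_delay",
    "label_counterfactual_recovery_step": "early_oxygen_support",
    "label_failure_reconstruction": 1,
}

QUAD = ("oxygen_demand", "buffer_capacity", "lag_burden", "coupling_stress")


def parse_chain(text: str) -> List[str]:
    """Split 'a > b > c' into ['a', 'b', 'c']."""
    return [step.strip() for step in text.split(">") if step.strip()]


def check_row(record: Dict[str, Value]) -> List[str]:
    """Return a list of consistency problems; an empty list means none found."""
    problems: List[str] = []
    chain = parse_chain(str(record["failure_decision_sequence"]))
    if len(chain) != record["failure_path_length"]:
        problems.append("failure_path_length does not match the chain")
    # Assumption: the root policy error is the first step of the chain.
    if not chain or chain[0] != record["label_root_policy_error"]:
        problems.append("root policy error is not the first chain step")
    # Assumption: quad variables are normalized to [0, 1].
    for name in QUAD:
        if not 0.0 <= float(record[name]) <= 1.0:
            problems.append(f"{name} outside [0, 1]")
    return problems


print(check_row(row))  # []
```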
Row structure
Each row contains:
• quad state variables
• trajectory signals
• boundary geometry
• regime transition signals
• intervention competition signals
• control signals
• temporal policy signals
• failure reconstruction signals
• final outcome labels
This structure allows models to reason about both control decisions and failure causality.
Files
data/train.csv
Training dataset with full signals and labels.
data/tester.csv
Evaluation dataset with outcome labels removed.
scorer.py
Reference scorer implementing ordered failure reconstruction evaluation.
benchmark_spec.json
Machine-readable benchmark specification.
dataset_schema.json
Full schema with column types and signal ranges.
README.md
This document.
Evaluation
Primary metric
failure_reconstruction_accuracy
Measures how often reconstructable failures are correctly reconstructed.
Secondary metric
false_failure_reconstruction_rate
Measures how often a model predicts a correct reconstruction when the chain is wrong.
Binary metrics
• accuracy
• precision
• recall
• f1
• confusion matrix
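The binary metrics follow from the four confusion-matrix counts. The sketch below is illustrative and not taken from scorer.py; the function name is hypothetical. Reading false_failure_reconstruction_rate as FP / (FP + TN) is one plausible interpretation of its description, reported here under the neutral name `false_positive_rate`.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Compute confusion-matrix counts and the usual binary metrics."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero on empty denominators.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": ratio(tp + tn, len(gold)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        # Assumed reading: share of gold-negative rows predicted as 1.
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Made-up labels for illustration only.
gold = [1, 1, 0, 0, 1]
pred = [1, 0, 1, 0, 1]
metrics = binary_metrics(gold, pred)
print(metrics["accuracy"])             # 0.6
print(metrics["false_positive_rate"])  # 0.5
```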
Failure diagnostics
• high_amplification_chain_miss_rate
• narrow_recovery_window_miss_rate
• early_root_error_miss_rate
These diagnostics reveal structural blind spots in cascade reasoning.
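One way to read these diagnostics is as miss rates restricted to a slice of the data. The sketch below assumes that a "miss" is a gold-positive row predicted as 0; the `prediction` key and the thresholds 0.7 and 0.2 are hypothetical, and the definitions in scorer.py may differ.

```python
from typing import Callable, Dict, Iterable

Row = Dict[str, float]


def subset_miss_rate(rows: Iterable[Row],
                     in_subset: Callable[[Row], bool]) -> float:
    """Among gold-positive rows in the subset, the share predicted as 0."""
    selected = [
        r for r in rows
        if in_subset(r) and r["label_failure_reconstruction"] == 1
    ]
    if not selected:
        return 0.0
    misses = sum(1 for r in selected if r["prediction"] == 0)
    return misses / len(selected)


# Made-up rows for illustration only.
rows = [
    {"cascade_amplification_factor": 0.76, "recovery_window_width": 0.18,
     "label_failure_reconstruction": 1, "prediction": 1},
    {"cascade_amplification_factor": 0.81, "recovery_window_width": 0.35,
     "label_failure_reconstruction": 1, "prediction": 0},
    {"cascade_amplification_factor": 0.40, "recovery_window_width": 0.12,
     "label_failure_reconstruction": 1, "prediction": 0},
    {"cascade_amplification_factor": 0.90, "recovery_window_width": 0.50,
     "label_failure_reconstruction": 0, "prediction": 0},
]

# Hypothetical thresholds for "high amplification" and "narrow window".
high_amp = subset_miss_rate(
    rows, lambda r: r["cascade_amplification_factor"] >= 0.7)
narrow = subset_miss_rate(
    rows, lambda r: r["recovery_window_width"] <= 0.2)

print(high_amp)  # 0.5
print(narrow)    # 0.5
```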
Dataset construction
Dataset scenarios are generated using structured cascade simulations.
Typical generation process:
1. construct a quad state
2. simulate policy decisions
3. propagate failure through system coupling
4. compute cascade amplification
5. determine recovery windows
6. generate counterfactual interventions
7. assign reconstruction labels
Scenarios are designed to include:
• early intervention failures
• delayed response cascades
• amplification-driven collapse
• narrow recovery windows
This mix is intended to produce diverse failure paths.
Running the scorer
python scorer.py data/train.csv predictions.csv
python scorer.py data/train.csv predictions.csv --verbose
Dataset limitations
This dataset evaluates structural failure reasoning, not precise clinical physiology.
Important limits:
• cascade signals are structural abstractions
• respiratory dynamics are simplified
• intervention timing is represented geometrically rather than physiologically
• external clinical noise is not modeled
This dataset should be treated as a control reasoning benchmark, not a clinical decision system.
Intended use
This dataset is intended for:
• failure analysis benchmarking
• causal reasoning evaluation
• control system diagnostics
• reinforcement learning failure analysis
• robustness research in safety-critical systems
This dataset is not intended for:
• direct clinical treatment decisions
• medical diagnosis
• deployment without domain validation
• use as a sole decision system
Position in the Clarus ladder
v0.1 — detection
v0.2 — trajectory
v0.3 — cascade forecasting
v0.4 — boundary discovery
v0.5 — recovery geometry
v0.6 — intervention reasoning
v0.7 — uncertainty geometry
v0.8 — regime transition
v0.9 — intervention competition
v1.0 — closed-loop control
v1.1 — counterfactual policy testing
v1.2 — temporal policy stability
v1.3 — failure reconstruction geometry
Structural note
v1.3 marks a structural shift.
Earlier layers evaluate whether the controller made the correct decision.
v1.3 evaluates whether the system can explain how failure occurred.
This introduces a new dimension of evaluation:
causal reconstruction of collapse.
Production deployment
Failure reconstruction geometry is useful in domains where post-incident analysis matters.
Examples include:
• ICU collapse analysis
• safety incident reconstruction
• industrial failure investigation
• aviation accident modeling
• AI system failure diagnostics
Enterprise and research collaboration
Clarus evaluates stability and control in complex systems.
The goal is not simply predicting outcomes.
The goal is determining:
• why failures occur
• whether failures can be reconstructed
• whether the system understands its own collapse dynamics
v1.3 extends Clarus from control correctness to failure causality.
License
MIT
Provider:
ClarusC64



