ClarusC64/clinical-quad-infection-buffer-lag-coupling-sepsis-transition-v1.2

Name: ClarusC64/clinical-quad-infection-buffer-lag-coupling-sepsis-transition-v1.2
Creator: ClarusC64
Published: 2026-03-23 12:57:01
License: No description provided

Hugging Face | Updated 2026-03-23 | Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/ClarusC64/clinical-quad-infection-buffer-lag-coupling-sepsis-transition-v1.2


Resource description:

---
language: en
license: mit
task_categories:
- text-classification
tags:
- clinical-trials
- quad-coupling
- temporal-policy-stability
- clarus-v1.2
- sepsis-transition
size_categories:
- 1K<n<10K
pretty_name: Clinical Quad Infection Buffer Lag Coupling Sepsis Transition v1.2
---

# Clinical Quad Infection Buffer Lag Coupling Sepsis Transition v1.2

## What this repo does

This repository contains a Clarus v1.2 benchmark dataset.

The v1.2 layer introduces **Temporal Policy Stability Geometry**.

Earlier versions evaluate:

- system state
- trajectory
- intervention selection
- closed-loop control
- counterfactual policy quality

v1.2 extends the framework to evaluate **policy handling across time**.

The benchmark asks:

- does the chosen policy remain correct as conditions evolve
- when does a previously correct policy become unsafe
- does the controller know when to maintain the current policy
- does the controller know when to switch strategy

This is not only a durability benchmark. It is a benchmark for **temporally correct policy handling**.

## Core quad

The system is defined by four interacting variables.

- infection_load
- buffer_capacity
- lag_burden
- coupling_stress

These variables define the state geometry of the system. All higher-level signals describe how this state changes under pressure, intervention, drift, and feedback.
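As a minimal sketch, the four quad variables can be carried as one typed record. The class name `QuadState` and the `from_row` helper are illustrative and not part of the dataset. The sketch assumes each row exposes the four variables as numeric columns under the names listed above, and the values in the example are placeholders.

```python
from typing import Mapping, NamedTuple


class QuadState(NamedTuple):
    """Hypothetical container for the four core quad variables."""

    infection_load: float
    buffer_capacity: float
    lag_burden: float
    coupling_stress: float

    @classmethod
    def from_row(cls, row: Mapping[str, object]) -> "QuadState":
        # Assumption: the row uses the quad variable names as column names.
        return cls(*(float(row[name]) for name in cls._fields))


# Placeholder values for illustration only; not taken from the dataset.
state = QuadState.from_row({
    "infection_load": "0.7",
    "buffer_capacity": "0.3",
    "lag_burden": "0.5",
    "coupling_stress": "0.6",
    "other_signal": "0.1",  # columns outside the quad are ignored
})
```

Keeping the quad in one immutable record makes it easy to pass the state geometry around separately from the higher-level signals derived from it.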
## Clinical variable mapping

| Quad Variable | Clinical Measurements | Typical Indicators |
|---|---|---|
| infection_load | infectious burden, source severity, microbial pressure | source persistence, bacteremia load, uncontrolled infection |
| buffer_capacity | physiological reserve, perfusion reserve, immune reserve | remaining compensation, reserve to tolerate septic stress |
| lag_burden | delayed correction load, unresolved instability debt | late antibiotics, delayed source control, untreated progression |
| coupling_stress | cross-system destabilization linking infection, perfusion, and organs | vasoplegia, inflammatory spillover, organ strain |

## Prediction target

`label_sepsis_transition_temporal_policy`

Binary classification.

- `1` = temporally correct policy handling
- `0` = temporally incorrect policy handling

A positive label includes two valid cases:

- the original policy remained correct over time and was correctly maintained
- the original policy became invalid and the controller correctly identified that policy reselection was required

## Label logic

A positive label means the controller handled temporal policy evolution correctly.

That includes:

### Stable-policy case

The original policy remains valid over time.

Typical properties:

- stabilization is real
- trajectory improves materially
- control remains aligned
- policy drift sensitivity stays low
- policy half-life stays high
- no reselection trigger is required

### Correct-switch case

The original policy degrades over time.
Typical properties:

- drift sensitivity rises
- policy half-life falls
- fragility increases
- a reselection trigger should fire
- the controller correctly identifies that the policy must change

This means v1.2 rewards both:

- correct policy retention
- correct switch detection

## What v1.2 adds

Earlier versions ask:

- where the system is
- where it is moving
- which intervention is best
- whether that intervention remains robust under alternatives

v1.2 adds a temporal question:

- does the controller know when to stay with the current policy
- and does it know when to switch

This introduces:

- policy durability
- policy decay
- reselection timing
- temporal control correctness

The shift is precise.

From:

- choosing the correct policy

To:

- handling policy evolution correctly over time

## New v1.2 temporal policy signals

v1.2 introduces signals describing policy validity under drift.

| Signal | Meaning |
|---|---|
| `policy_drift_sensitivity` | how rapidly the chosen policy becomes invalid as conditions change |
| `adaptive_policy_margin` | tolerance before the policy must be changed |
| `stabilization_half_life` | duration for which the chosen policy remains effective |
| `policy_fragility_score` | sensitivity of the policy to small temporal or state perturbations |
| `policy_reselection_trigger` | binary indicator that the controller should switch policy |
| `policy_decay_rate` | rate at which policy effectiveness declines over time |
| `temporal_control_consistency` | whether control remains coherent across successive temporal states |
| `reselection_delay_cost` | cost incurred by failing to switch when reselection is needed |

These signals define **Temporal Policy Stability Geometry**.
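The two positive cases (correct retention and correct switch detection) can be illustrated with a minimal sketch. It assumes the controller's decision is available as a boolean; `controller_switched` and the function name `temporal_policy_label` are hypothetical and not columns or functions from the dataset. The real labels are assigned at dataset construction time and may depend on more signals than the trigger alone.

```python
def temporal_policy_label(policy_reselection_trigger: int, controller_switched: bool) -> int:
    """Return 1 when the stay/switch decision matches the reselection trigger.

    `controller_switched` is a hypothetical input for illustration; the
    dataset itself does not document such a column.
    """
    reselection_needed = policy_reselection_trigger == 1
    return int(controller_switched == reselection_needed)


# Stable-policy case: no trigger, controller keeps the policy.
retained = temporal_policy_label(0, controller_switched=False)
# Correct-switch case: trigger fires, controller switches.
switched = temporal_policy_label(1, controller_switched=True)
# Temporally incorrect handling: trigger fires, controller stays.
missed = temporal_policy_label(1, controller_switched=False)
```

Both `retained` and `switched` map to the positive class, which is the point of the v1.2 label: staying and switching are each correct only in the right circumstances.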
## Example scenario

A representative v1.2 switch case:

- the initial intervention is correct at first
- the septic state begins to drift
- policy effectiveness decays
- a reselection trigger should fire
- the controller must switch rather than remain on the original path

In v1.2, this can still be a positive case.

The controller is rewarded not only for holding a stable policy, but also for correctly detecting when the policy must change.

### Example numeric row pattern

| Signal | Value |
|---|---|
| `control_sequence_alignment_score` | 0.64 |
| `policy_drift_sensitivity` | 0.65 |
| `stabilization_half_life` | 0.36 |
| `policy_fragility_score` | 0.60 |
| `policy_reselection_trigger` | 1 |
| `temporal_control_consistency` | 0.41 |
| `reselection_delay_cost` | 0.28 |
| `label_sepsis_transition_temporal_policy` | 1 |

Interpretation:

- the original policy does not remain valid
- the system correctly signals that policy reselection is needed
- the controller is rewarded for correct switch detection

## Row structure

Each row includes:

- core system variables
- trajectory signals
- boundary geometry
- regime transition signals
- intervention competition signals
- control sequence signals
- temporal policy signals
- perturbation and recovery signals
- quad delta signals
- final outcome fields

This allows a model to evaluate not only whether a policy works, but whether it continues to work as the system changes.

## Files

- `data/train.csv`
  Full training dataset with labels and all temporal policy signals.

- `data/tester.csv`
  Evaluation-style dataset with:
  - `stabilization_success`
  - `label_sepsis_transition_temporal_policy`

  removed.

- `scorer.py`
  Reference scorer computing binary metrics, temporal policy diagnostics, temporal miss diagnostics, and control diagnostics.

- `benchmark_spec.json`
  Machine-readable benchmark specification.

- `dataset_schema.json`
  Machine-readable schema with types, ranges, and row order.

- `README.md`
  This document.
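A minimal sketch of separating features from the label in a train-style CSV, mirroring the two columns that are removed from `data/tester.csv`. The helper name `split_rows` is hypothetical, and the in-memory sample is a stand-in for the real file: it carries only two of the signals from the example row, and its `stabilization_success` value is a placeholder rather than a value from the dataset.

```python
import csv
import io

LABEL = "label_sepsis_transition_temporal_policy"
# Columns described above as removed from data/tester.csv.
HIDDEN_IN_TESTER = {"stabilization_success", LABEL}


def split_rows(handle):
    """Yield (features, label) pairs from a train-style CSV file handle."""
    for row in csv.DictReader(handle):
        label = int(row[LABEL])
        # Keep only columns that would also be visible in the tester file.
        features = {k: v for k, v in row.items() if k not in HIDDEN_IN_TESTER}
        yield features, label


# In-memory stand-in for data/train.csv (assumed layout, placeholder outcome value).
sample = io.StringIO(
    "policy_drift_sensitivity,policy_reselection_trigger,"
    "stabilization_success,label_sepsis_transition_temporal_policy\n"
    "0.65,1,0,1\n"
)
pairs = list(split_rows(sample))
```

With the real file, the same helper could be called on `open("data/train.csv", newline="")`. Dropping both hidden columns during training keeps the feature set aligned with what is available at evaluation time.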
## Evaluation

Primary metric:

- `recall_temporally_correct_policy_handling`

This measures how often the model handles temporal policy evolution correctly.

That includes both:

- keeping a policy that remains valid
- switching when the policy becomes invalid

Secondary metric:

- `false_temporal_handling_rate`

This measures how often the model predicts temporally correct handling when the policy handling was actually wrong.

Binary metrics:

- accuracy
- precision
- recall
- f1
- confusion matrix

Temporal policy diagnostics:

- `temporal_policy_path_accuracy`
- `policy_drift_sensitivity_error`
- `adaptive_policy_margin_error`
- `stabilization_half_life_error`
- `policy_fragility_score_error`
- `policy_reselection_trigger_accuracy`
- `policy_decay_rate_error`
- `temporal_control_consistency_error`
- `reselection_delay_cost_error`

Temporal miss diagnostics:

- `high_uncertainty_temporal_miss_rate`
- `narrow_window_temporal_miss_rate`

Control diagnostics:

- `control_sequence_alignment_accuracy`
- `control_horizon_error`
- `feedback_response_accuracy`
- `intervention_timing_accuracy`
- `adaptation_latency_error`
- `control_stability_error`
- `recovery_consistency_error`
- `recalibration_overuse_rate`
- `controller_oscillation_misread_rate`
- `terminal_pathway_state_accuracy`

## Dataset construction

The dataset is generated using structured temporal scenarios.

Typical generation steps:

1. construct an initial system state from the quad variables
2. simulate one or more intervention policies
3. evolve the system through temporal drift
4. measure whether the original policy remains valid
5. identify whether and when reselection becomes necessary
6. compute temporal policy signals
7. assign the benchmark label based on correctness of temporal policy handling

Temporal scenarios are designed to capture two key classes:

### Policy retention cases

The original policy remains valid across the relevant horizon.
### Policy switch cases

The original policy becomes invalid and reselection is required.

This means v1.2 is designed to test the controller’s capacity for temporal judgment, not just policy endurance.

## Running the scorer

```text
python scorer.py data/train.csv predictions.csv
python scorer.py data/train.csv predictions.csv --verbose
```

## Dataset limitations

This dataset evaluates structural temporal policy handling, not exact real-world treatment precision.

Important limits:

- temporal drift is represented structurally rather than as a full mechanistic simulation
- policy reselection logic is generated at dataset construction time
- temporal signals represent control quality and decay, not literal bedside protocols
- domain abstractions may omit real-world noise, staffing effects, or intervention delays outside the modeled geometry

These datasets should be treated as benchmarks for temporal control reasoning.

## Intended use

This dataset is intended for:

- control policy evaluation
- temporal decision stability testing
- reinforcement learning benchmarking
- adaptive controller benchmarking
- model robustness research under drift

This dataset is not intended for:

- direct clinical decision making
- diagnosis
- treatment recommendation
- deployment without external validation
- use as a sole decision system

## Structural note

v1.2 marks the move from:

- choosing the correct policy

to:

- handling policy evolution correctly across time

That means v1.2 evaluates both:

- stable-policy retention
- correct switch detection

This makes v1.2 the first Clarus layer that formally tests whether a controller knows:

- when to stay
- when to switch

## Position in the Clarus ladder

- v0.1: detection
- v0.2: trajectory
- v0.3: cascade forecasting
- v0.4: boundary discovery
- v0.5: recovery geometry
- v0.6: intervention reasoning
- v0.7: uncertainty geometry
- v0.8: regime transition
- v0.9: intervention competition
- v1.0: closed-loop control
- v1.1: counterfactual and adversarial policy testing
- v1.2: temporal policy stability

## Production deployment

This dataset format is suitable for controlled benchmarking in domains where policy validity changes over time.

Examples include:

- ICU stabilization under evolving state
- sepsis management under delayed source control
- infectious escalation under shifting instability
- adaptive control in distributed systems
- sequential intervention planning in high-risk environments

## Enterprise and research collaboration

Clarus evaluates stability and control in complex systems.

The goal is not simply to predict what happens next.

The goal is to determine:

- whether the chosen action remains valid
- whether the controller can track changing conditions
- whether the system knows when the correct move is to switch

v1.2 extends Clarus from control correctness into temporal control correctness.

## License

MIT

Provider:

ClarusC64
