algoplexity/computational-phase-transitions-data

Source: Hugging Face | Updated: 2025-11-20 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/algoplexity/computational-phase-transitions-data
Resource description:
---
license: mit
task_categories:
- time-series-forecasting
- tabular-classification
- other
pretty_name: Financial Structural Breaks & Regime Detection Benchmark
tags:
- finance
- econophysics
- algorithmic-information-theory
- structural-breaks
- time-series
- anomaly-detection
size_categories:
- 1M<n<10M
---

# Financial Structural Breaks & Regime Detection Benchmark

**Maintainer:** [Algoplexity](https://github.com/algoplexity)

**Primary Repositories:**

1. **The Coherence Meter:** [GitHub Repo](https://github.com/algoplexity/Coherence-Meter) (Horizon 0)
2. **The Computational Phase Transition:** [GitHub Repo](https://github.com/algoplexity/computational-phase-transitions) (Horizon 1)

## 1. Overview

This repository serves as the **immutable data artifact** for the Algoplexity research program into **Algorithmic Information Dynamics (AID)** in financial markets. It contains a large-scale collection of non-stationary, continuous financial time series, specifically curated to benchmark methods for **Structural Break Detection** and **Market Regime Diagnosis**.

This data underpins the validation of two distinct methodologies:

* **The Coherence Meter:** A statistical, falsification-driven framework comparing "Stethoscope" (univariate) vs. "Microscope" (multivariate) approaches.
* **The AIT Physicist:** A transformer-based diagnostic tool that maps market dynamics to **Wolfram Complexity Classes** (e.g., Rule 54 vs. Rule 60) to detect "Computational Phase Transitions."

## 2. Dataset Utility

This dataset allows researchers to reproduce key findings from the associated papers, including:

* The **"Cost of Complexity"** curve (MDL analysis).
* The **-27.07% Early Warning** signal in algorithmic entropy.
* The distinct topological signatures of **Systemic** vs. **Exogenous** crashes.

## 3. Dataset Structure

The data is stored in highly compressed **Parquet** format, optimized for scientific computing and cloud-based ingestion.

### Files

* **`X_train.parquet`**: The primary feature set containing thousands of continuous financial time series.
* **`y_train.parquet`**: The ground-truth labels indicating the precise timestamp of structural breaks.
* **`X_test.parquet` / `y_test.parquet`**: Out-of-sample series (derived from the Falcon forecasting challenge) used for generalization testing.

### Schema

**Features (`X_train.parquet`)**:

* `id` (string): Unique identifier for the time series.
* `period` (int): Sequential time step.
* `value` (float): The continuous signal (price/return).

**Labels (`y_train.parquet`)**:

* `id` (string): Unique identifier.
* `structural_breakpoint` (int): The time step where the regime shift formally occurs.
* `label` (int): Class identifier (0 = No Break, 1 = Break).

## 4. Provenance

* **Source:** Derived from the **CrunchDAO** research competitions (Structural Break & Falcon).
* **Preprocessing:** Data has been anonymized, standardized, and formatted for both statistical analysis (rolling variance) and algorithmic encoding (quantile binning).

## 5. Universal Loading (Python)

This dataset is designed to be ingested directly from the cloud, removing dependencies on local storage or Google Drive.

```python
from huggingface_hub import hf_hub_download
import pandas as pd


def load_benchmark_data(filename):
    """
    Fetches data from the Algoplexity Benchmark Repository.
    Uses local caching for offline capability.
    """
    repo_id = "algoplexity/computational-phase-transitions-data"
    print(f"--- Fetching {filename} from Scientific Repository ---")
    local_path = hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        repo_type="dataset"
    )
    return pd.read_parquet(local_path)


# Usage
df_features = load_benchmark_data("X_train.parquet")
df_labels = load_benchmark_data("y_train.parquet")
```

## 6. Citation

If you use this data in your research, please cite the associated Algoplexity repositories:

```bibtex
@misc{ait_physicist_2025,
  author = {Mak, Yeu Wen},
  title = {The Computational Phase Transition: Quantifying the Algorithmic Information Dynamics of Financial Crises},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/algoplexity/computational-phase-transitions}}
}

@misc{coherence_meter_2025,
  author = {Mak, Yeu Wen},
  title = {The Coherence Meter: A Hybrid AIT-MDL Framework for Early-Warning Structural Break Detection},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/algoplexity/Coherence-Meter}}
}
```
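
As a rough illustration of how the schema in Section 3 and the two preprocessing views named in Section 4 (rolling variance and quantile binning) might fit together, the sketch below works on a small synthetic frame rather than the real files. It assumes the columns are exactly `id`, `period`, `value` and `id`, `structural_breakpoint`, `label` as listed in the card. The helper names (`split_at_break`, `rolling_variance`, `quantile_bins`), the window size, the bin count, and the choice to assign the breakpoint step to the "after" segment are all hypothetical and are not taken from the Algoplexity repositories.

```python
import numpy as np
import pandas as pd


def split_at_break(features: pd.DataFrame, labels: pd.DataFrame, series_id: str):
    """Split one series into (before, after) at its labeled breakpoint.

    Assumption: the breakpoint step itself belongs to the 'after' segment.
    """
    row = labels.loc[labels["id"] == series_id].iloc[0]
    series = (
        features.loc[features["id"] == series_id]
        .sort_values("period")
        .set_index("period")["value"]
    )
    break_step = int(row["structural_breakpoint"])
    before = series[series.index < break_step]
    after = series[series.index >= break_step]
    return before, after


def rolling_variance(values: pd.Series, window: int = 20) -> pd.Series:
    """Statistical view: variance over a sliding window (window size is illustrative)."""
    return values.rolling(window).var()


def quantile_bins(values: pd.Series, n_bins: int = 4) -> pd.Series:
    """Symbolic view: map each value to the index of its quantile bin."""
    return pd.qcut(values, q=n_bins, labels=False, duplicates="drop")


# Synthetic stand-in for one series: the volatility triples at step 100.
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(0, 1, 100), rng.normal(0, 3, 100)])
toy_features = pd.DataFrame(
    {"id": "toy_0", "period": np.arange(200), "value": values}
)
toy_labels = pd.DataFrame(
    {"id": ["toy_0"], "structural_breakpoint": [100], "label": [1]}
)

before, after = split_at_break(toy_features, toy_labels, "toy_0")
print(f"variance before: {before.var():.2f}, after: {after.var():.2f}")
print(quantile_bins(before).value_counts().sort_index())
```

With the real data, `toy_features` and `toy_labels` would be replaced by the frames returned from `load_benchmark_data` in Section 5. How `structural_breakpoint` is populated for series with `label == 0` is not stated in the card, so it is worth checking before splitting those series.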
Provider:
algoplexity