emergentphysicslab/waveguard-benchmarks

Source: Hugging Face. Updated: 2026-04-06. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/emergentphysicslab/waveguard-benchmarks
Resource description:
---
license: mit
task_categories:
- tabular-classification
tags:
- anomaly-detection
- time-series
- time-series-classification
- server-monitoring
- cybersecurity
- benchmark
- physics
- waveguard
- zero-training
- iot
- financial-data
pretty_name: WaveGuard Anomaly Detection Benchmarks
size_categories:
- 1K<n<10K
---

# WaveGuard Anomaly Detection Benchmarks

Curated benchmark datasets for evaluating time-series and tabular anomaly detection models. Each dataset includes labeled training (normal) and test (mixed normal + anomalous) splits.

## Datasets

### 1. Server Metrics (`server_metrics/`)

Simulated server health metrics with injected failure events.

- **Features**: cpu, memory, disk_io, network, errors (5 numeric)
- **Training**: 500 normal samples
- **Test**: 100 samples (15 anomalous)
- **Anomaly types**: CPU spike, memory leak, disk saturation, network flood

### 2. Crypto Price Anomalies (`crypto_prices/`)

Real cryptocurrency OHLCV data (BTC, ETH, SOL) from 2021-2026 with labeled flash crashes and pump events.

- **Features**: open, high, low, close, volume (5 numeric per coin)
- **Training**: 1200 normal daily candles per coin
- **Test**: 600 candles per coin (labeled anomalies at known events)
- **Source**: Yahoo Finance via yfinance

### 3. Synthetic Time Series (`synthetic_timeseries/`)

Controlled synthetic signals with known anomaly injection points.

- **Patterns**: sinusoidal, trend, seasonal, random walk
- **Anomaly types**: point (spike), contextual (subtle shift), collective (regime change)
- **Training**: 200 clean windows per pattern
- **Test**: 50 windows per pattern (10 anomalous each)

## Format

Each dataset is provided as Parquet files:

```
dataset_name/
  train.parquet    # Normal samples only
  test.parquet     # Mixed normal + anomalous
  metadata.json    # Feature descriptions, anomaly counts, creation params
```

## Usage

```python
from datasets import load_dataset

ds = load_dataset("gpartin/waveguard-benchmarks", "server_metrics")
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()
```

## Evaluation Protocol

1. Train/fit your detector on `train.parquet` only
2. Score each row in `test.parquet`
3. Report: Precision, Recall, F1, AUC-ROC, Average Latency
4. Compare against WaveGuard baseline in the model card

## Citation

```bibtex
@dataset{waveguard_benchmarks2025,
  title={WaveGuard Anomaly Detection Benchmarks},
  author={Partin, Greg},
  year={2025},
  url={https://huggingface.co/datasets/gpartin/waveguard-benchmarks}
}
```
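The four steps of the evaluation protocol above could be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark's official harness. `ZScoreDetector` is a hypothetical placeholder detector (it is not WaveGuard). The `threshold` value is arbitrary. "Average Latency" is interpreted here as mean scoring time per test row. The card does not name the label column, so labels are passed in as a separate list. The toy rows stand in for the real Parquet splits, which keeps the sketch runnable without downloading anything.

```python
import time
from statistics import mean, pstdev


class ZScoreDetector:
    """Hypothetical placeholder: anomaly score = max absolute z-score over features."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.mu = [mean(c) for c in cols]
        # Fall back to 1.0 for constant columns so scoring never divides by zero.
        self.sigma = [pstdev(c) or 1.0 for c in cols]
        return self

    def score(self, row):
        return max(abs(x - m) / s for x, m, s in zip(row, self.mu, self.sigma))


def auc_roc(labels, scores):
    """Rank-based AUC (Mann-Whitney U); tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined with a single class
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def evaluate(detector, train_rows, test_rows, test_labels, threshold):
    detector.fit(train_rows)  # step 1: fit on normal training rows only
    start = time.perf_counter()
    scores = [detector.score(r) for r in test_rows]  # step 2: score each test row
    latency_ms = (time.perf_counter() - start) * 1000 / len(test_rows)

    # Step 3: threshold the scores and compute the reported metrics.
    preds = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, test_labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, test_labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, test_labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc_roc": auc_roc(test_labels, scores),
        "avg_latency_ms": latency_ms,
    }


# Toy stand-ins for train.parquet / test.parquet (two features, four test rows).
train_rows = [[50 + (i % 5), 40 + (i % 3)] for i in range(30)]
test_rows = [[52, 41], [51, 40], [95, 41], [50, 90]]
test_labels = [0, 0, 1, 1]

result = evaluate(ZScoreDetector(), train_rows, test_rows, test_labels, threshold=3.0)
print(result)
```

With the real data, the row lists would come from the `train` and `test` DataFrames loaded in the Usage section, after separating the feature columns from whichever column holds the anomaly label. Step 4, comparing against the WaveGuard baseline, then amounts to placing these numbers next to those reported in the model card.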
Provided by:
emergentphysicslab