
QSBench/QSBench-Core-v1.0.0-demo

Hugging Face | Updated 2026-04-06 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/QSBench/QSBench-Core-v1.0.0-demo
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- tabular-regression
- feature-extraction
language:
- en
tags:
- qiskit
- quantum-circuits
- synthetic-dataset
- benchmark
- expectation-values
- quantum-computing
- qml-benchmark
- quantum dataset
- qml dataset
- quantum benchmark
- quantum circuits dataset
- expectation value prediction
- variational quantum circuits
- hybrid quantum classical
pretty_name: QSBench Core Demo v1.0.0 – Quantum Machine Learning Dataset (Quantum Circuits, Expectation Values, n=6)
size_categories:
- n<1K
---

![QSBench Logo](https://i.imgur.com/VyLgYtf.png)

🌐 [Website](https://qsbench.github.io) | 🤗 [Dataset](https://huggingface.co/datasets/QSBench/QSBench-Core-v1.0.0-demo) | 🛠️ [GitHub](https://github.com/QSBench/QSBench-Core-v1.0.0-demo) | 🚀 [Interactive Demo](https://huggingface.co/QSBench/spaces)

# QSBench Core Demo v1.0.0

**Quantum Machine Learning dataset for regression on expectation values.** Includes quantum circuits, QASM, and structured features for training ML models.

Keywords: quantum dataset, QML benchmark, quantum circuits dataset, expectation value prediction.

**2000 high-quality synthetic quantum circuits**: a clean-simulation demo of the QSBench family. Designed for researchers and engineers working on Quantum Machine Learning, variational algorithms, and hybrid quantum-classical models.

### Why QSBench?

Most public quantum datasets are too small, poorly documented, or lack paired ideal/noisy data. QSBench addresses this by providing **reproducible, richly annotated, and ready-to-use** datasets.

### Use Cases

- Training Quantum Machine Learning models
- Benchmarking noise robustness
- Predicting expectation values from circuit structure
- Hybrid quantum-classical ML pipelines
- Feature engineering from quantum circuits

### Dataset Overview

- **Samples**: 2000
- **Qubits**: 6
- **Depth**: 4
- **Circuit Families**: Mixed (HEA, RealAmplitudes, QFT, Efficient SU(2), Random)
- **Entanglement**: Full
- **Noise**: None (clean simulation)
- **Observables**: Z, X, Y in mixed mode (global + per-qubit)
- **Shots**: 512
- **Splits**: Train (157) / Validation (26) / Test (17), deterministic and hash-based

### What's Inside Each Sample

Each sample in the Parquet files contains:

- Raw and transpiled QASM representations
- Circuit adjacency matrix
- Detailed gate statistics (single-qubit, two-qubit, CX, H, RX, RY, RZ)
- Structural metrics: gate entropy and Meyer-Wallach entanglement
- Ideal expectation values for Z, X, Y (global and per-qubit)
- Circuit family label and full generation metadata
- Deterministic split label (train/val/test)

### QSBench-Core: Quantum Circuit Complexity

**You don't need a PhD in Quantum Physics to use this dataset.**

If you are a Data Scientist, ML Engineer, or AI Researcher, think of a quantum circuit as a **Computational Graph (DAG)** or a piece of **Code**. This dataset provides the raw structural blueprints of thousands of quantum algorithms.

### The ML Mission: Unsupervised Learning & Clustering

Because this dataset contains clean, ideal circuits (no noise), it is well suited to **Unsupervised Learning**. Can you cluster these circuits into distinct "complexity classes" using K-Means or HDBSCAN? Can you build a Graph Neural Network (GNN) that learns the topology of these circuits?

### Dataset Anatomy (Features)

Think of these columns as your `X` features.

| Group | Column Name | What is it for ML? |
| :--- | :--- | :--- |
| **Meta** | `circuit_hash`, `split` | Unique IDs and train/val/test split labels. |
| **Topology** | `adjacency` | The graph structure! A matrix showing how nodes (qubits) are connected. Perfect for GNNs. |
| **Code** | `qasm_raw` | The raw text of the algorithm. Great for NLP/LLM tasks. |
| **Complexity** | `depth`, `gate_entropy` | Tabular features indicating how "deep" and "random" the graph is. |
| **Weights** | `total_gates`, `cx_count` | Node/edge counts. `cx_count` is the number of CX (CNOT) two-qubit gates. |

### Quick Start Idea

Try running **PCA** on the numeric features (`depth`, `gate_entropy`, `cx_count`, `adj_density`) to visualize the "DNA" of quantum algorithms in 2D space.

### Load the Dataset

The dataset is stored in Parquet format inside the `data/shards/` folder. You can load it directly using the Hugging Face `datasets` library:

```python
from datasets import load_dataset

# Load the demo dataset (free)
dataset = load_dataset("QSBench/QSBench-Core-v1.0.0-demo", split="train")

# Inspect the first sample
print(dataset[0])
```

If you prefer to use `pandas`:

```python
import glob

import pandas as pd

# pd.read_parquet expects a file or directory path, not a glob pattern,
# so list the local shards explicitly and concatenate them
shards = sorted(glob.glob("data/shards/*.parquet"))
df = pd.concat((pd.read_parquet(path) for path in shards), ignore_index=True)
print(df.head())
```

### Example: Train a simple model on expectation values

```python
from sklearn.ensemble import RandomForestRegressor
import numpy as np
from datasets import load_dataset

# Load dataset
ds = load_dataset("QSBench/QSBench-Core-v1.0.0-demo")

# Use gate count as a simple feature
X_train = np.array([s["total_gates"] for s in ds["train"]]).reshape(-1, 1)
y_train = np.array([s["ideal_expval_Z_global"] for s in ds["train"]])

model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)

# Evaluate on test set
X_test = np.array([s["total_gates"] for s in ds["test"]]).reshape(-1, 1)
y_test = np.array([s["ideal_expval_Z_global"] for s in ds["test"]])
score = model.score(X_test, y_test)
print(f"R² score: {score:.4f}")
```

For more advanced usage (e.g., using QASM strings, adjacency matrices), check the provided metadata files in the `meta/` folder.

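The PCA idea from the Quick Start section can be sketched with NumPy alone. This is a minimal illustration, not the authors' pipeline: the helper `pca_2d` is hypothetical, the demo rows are synthetic stand-ins rather than real QSBench values, and `adj_density` is assumed to be a column because the Quick Start text names it even though the feature table does not list it.

```python
import numpy as np

# Column names taken from the dataset card; `adj_density` is an assumption
# (named in the Quick Start text, absent from the feature table).
FEATURES = ["depth", "gate_entropy", "cx_count", "adj_density"]


def pca_2d(features):
    """Project rows onto their first two principal components.

    Returns (projection, explained_variance_ratio).
    """
    x = np.asarray(features, dtype=float)
    # Standardize each column so counts and entropies share a scale.
    std = x.std(axis=0)
    std[std == 0] = 1.0  # constant columns (e.g. a fixed depth) map to zero
    z = (x - x.mean(axis=0)) / std
    # PCA via SVD of the standardized matrix: rows of vt are the components.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    projection = z @ vt[:2].T
    ratio = (s[:2] ** 2) / (s ** 2).sum()
    return projection, ratio


# Synthetic stand-in rows (NOT real QSBench values), only to make this runnable.
rng = np.random.default_rng(0)
demo = np.column_stack([
    np.full(50, 4),                  # depth: the card lists a fixed depth of 4
    rng.uniform(1.0, 2.5, size=50),  # gate_entropy
    rng.integers(5, 40, size=50),    # cx_count
    rng.uniform(0.2, 1.0, size=50),  # adj_density
])
points, ratio = pca_2d(demo)
print(points.shape, ratio)
```

With the real data, and assuming those columns exist, the same function could be fed `dataset.to_pandas()[FEATURES].to_numpy()`. The two columns of `points` can then be passed to a scatter plot, colored by circuit family label if desired.
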
### Repository Structure

The dataset is stored in the `main` branch and contains only the data files to ensure the Dataset Viewer works correctly:

```
QSBench-Core-v1.0.0-demo/
├── README.md          # This file
└── data/              # Parquet shards (main data)
    └── shards/
        ├── *.parquet
        └── *.csv
```

All metadata files (coverage.json, schema.json, meta.json, data_card.md, etc.) are located in a separate branch called **`metadata`** to avoid interfering with the Dataset Viewer.

You can browse them here: 👉 [metadata branch](https://huggingface.co/datasets/QSBench/QSBench-Core-v1.0.0-demo/tree/metadata)

### Related QSBench Datasets

- QSBench Lite (20k samples, n=4)
- QSBench Core (75k samples, n=8)
- Depolarizing Noise Pack (150k samples)
- Amplitude Damping Pack (150k samples)
- Transpilation Hardware Pack (200k samples)

### Part of the QSBench Family

This is a small public **demo version**. Full-scale datasets (20k–150k+ samples), noisy versions (Depolarizing, Amplitude Damping), and custom datasets are available.

[Repository](https://github.com/QSBench/QSBench-Core-v1.0.0-demo)
[Website & Full Catalog](https://qsbench.github.io)

**License**: CC BY-NC 4.0 (Personal & Research Use)

**Questions or custom requests?** Visit our [website](https://qsbench.github.io), open an issue on [GitHub](https://github.com/QSBench/QSBench-Core-v1.0.0-demo), or inspect the generation pipeline in the **QSBench Generator** [repository](https://github.com/QSBench/QSBench-Generator).

### Support QSBench

You can support the project directly on this Giveth page: **[https://giveth.io/project/qsbench](https://giveth.io/project/qsbench)**

Your donations help us generate larger datasets, cover GPU costs, and continue developing new realistic noise models.

---

*Generated with QSBench Generator v5.0.2*
Provided by:
QSBench