QSBench/QSBench-Core-v1.0.0-demo
Source: Hugging Face | Updated: 2026-04-06 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/QSBench/QSBench-Core-v1.0.0-demo
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- tabular-regression
- feature-extraction
language:
- en
tags:
- qiskit
- quantum-circuits
- synthetic-dataset
- benchmark
- expectation-values
- quantum-computing
- qml-benchmark
- quantum dataset
- qml dataset
- quantum benchmark
- quantum circuits dataset
- expectation value prediction
- variational quantum circuits
- hybrid quantum classical
pretty_name: QSBench Core Demo v1.0.0 – Quantum Machine Learning Dataset (Quantum Circuits, Expectation Values, n=6)
size_categories:
- n<1K
---

🌐 [Website](https://qsbench.github.io) | 🤗 [Dataset](https://huggingface.co/datasets/QSBench/QSBench-Core-v1.0.0-demo) | 🛠️ [GitHub](https://github.com/QSBench/QSBench-Core-v1.0.0-demo) | 🚀 [Interactive Demo](https://huggingface.co/QSBench/spaces)
# QSBench Core Demo v1.0.0
**Quantum Machine Learning dataset for regression on expectation values.**
Includes quantum circuits, QASM, and structured features for training ML models.
Keywords: quantum dataset, QML benchmark, quantum circuits dataset, expectation value prediction.
**2000 high-quality synthetic quantum circuits** — clean simulation demo of the QSBench family.
Designed for researchers and engineers working on Quantum Machine Learning, variational algorithms, and hybrid quantum-classical models.
### Why QSBench?
Most public quantum datasets are too small, poorly documented, or lack paired ideal/noisy data. QSBench addresses these gaps with **reproducible, richly annotated, ready-to-use** datasets.
### Use Cases
- Training Quantum Machine Learning models
- Benchmarking noise robustness
- Predicting expectation values from circuit structure
- Hybrid quantum-classical ML pipelines
- Feature engineering from quantum circuits
### Dataset Overview
- **Samples**: 2000
- **Qubits**: 6
- **Depth**: 4
- **Circuit Families**: Mixed (HEA, RealAmplitudes, QFT, Efficient SU(2), Random)
- **Entanglement**: Full
- **Noise**: None (clean simulation)
- **Observables**: Z, X, Y in mixed mode (global + per‑qubit)
- **Shots**: 512
- **Splits**: Train (157) / Validation (26) / Test (17) — deterministic hash‑based
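
The card describes the splits as deterministic and hash-based but does not document the scheme. As an illustration only, one common way to obtain such a split is to hash a stable identifier and map the result to a bucket. The helper name `assign_split`, the choice of SHA-256, and the 80/10/10 thresholds below are assumptions, not the QSBench Generator's actual procedure.

```python
import hashlib

def assign_split(circuit_id: str, train: float = 0.8, val: float = 0.1) -> str:
    """Map a stable identifier to 'train', 'val', or 'test' deterministically.

    Hypothetical helper: the hash function and thresholds are illustrative.
    """
    digest = hashlib.sha256(circuit_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < train:
        return "train"
    if u < train + val:
        return "val"
    return "test"

# The same identifier always lands in the same split, regardless of row order.
ids = [f"circuit-{i}" for i in range(1000)]
splits = [assign_split(i) for i in ids]
print({name: splits.count(name) for name in ("train", "val", "test")})
```

Because the assignment depends only on the identifier, regenerating or reshuffling the data does not move a circuit between splits.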
### What's Inside Each Sample
Each sample in the Parquet files contains:
- Raw and transpiled QASM representations
- Circuit adjacency matrix
- Detailed gate statistics (single‑qubit, two‑qubit, CX, H, RX, RY, RZ)
- Structural metrics: Gate entropy + Meyer‑Wallach entanglement
- Ideal expectation values for Z, X, Y (global and per‑qubit)
- Circuit family label and full generation metadata
- Deterministic split label (train/val/test)
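
The card lists gate entropy among the structural metrics without defining it. A plausible reading is the Shannon entropy of the gate-type distribution, sketched below. This definition, the log base, and the function name `gate_entropy` are assumptions; the dataset's own `gate_entropy` column may be normalized differently.

```python
import math
from collections import Counter

def gate_entropy(gate_counts: dict) -> float:
    """Shannon entropy (in bits) of the gate-type distribution.

    Assumed definition; the dataset's own column may use another base
    or normalization.
    """
    total = sum(gate_counts.values())
    if total == 0:
        return 0.0
    probs = [c / total for c in gate_counts.values() if c > 0]
    return sum(-p * math.log2(p) for p in probs)

# Illustrative counts, not taken from the dataset.
counts = Counter({"h": 6, "rx": 6, "ry": 6, "cx": 6})
print(gate_entropy(counts))      # four equally frequent gate types
print(gate_entropy({"h": 10}))   # a single gate type carries no entropy
```

Under this reading, a circuit built from one gate type scores zero and the score grows as the gate mix becomes more uniform.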
### QSBench-Core: Quantum Circuit Complexity
**You don't need a PhD in Quantum Physics to use this dataset.** If you are a Data Scientist, ML Engineer, or AI Researcher, think of a quantum circuit as a **Computational Graph (DAG)** or a piece of **Code**. This dataset provides the raw structural blueprints of thousands of quantum algorithms.
### The ML Mission: Unsupervised Learning & Clustering
Because this dataset contains only clean, ideal circuits (no noise), it is well suited to **Unsupervised Learning**.
Can you cluster these circuits into distinct "complexity classes" using K-Means or HDBSCAN? Can you build a Graph Neural Network (GNN) that learns the topology of these circuits?
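
GNN libraries generally expect an edge list rather than a dense matrix. The sketch below converts a square adjacency matrix into source/target index lists. It assumes the `adjacency` column can be read as a nested list over qubits where a nonzero entry marks an interacting pair; the storage format is not documented here, and `adjacency_to_edges` is a hypothetical helper.

```python
def adjacency_to_edges(adj):
    """Convert a square adjacency matrix (nested lists) to a directed edge list.

    Returns (sources, targets), the COO layout most GNN libraries accept.
    Assumes a nonzero entry adj[i][j] means qubits i and j interact.
    """
    n = len(adj)
    if any(len(row) != n for row in adj):
        raise ValueError("adjacency matrix must be square")
    sources, targets = [], []
    for i in range(n):
        for j in range(n):
            # Skip the diagonal: self-loops carry no connectivity information.
            if i != j and adj[i][j]:
                sources.append(i)
                targets.append(j)
    return sources, targets

# Illustrative 3-qubit chain 0-1-2, not a row from the dataset.
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
src, dst = adjacency_to_edges(adj)
print(list(zip(src, dst)))
```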
### Dataset Anatomy (Features)
Think of these columns as your `X` features.
| Group | Column Name | What is it for ML? |
| :--- | :--- | :--- |
| **Meta** | `circuit_hash`, `split` | Unique IDs and train/test splits. |
| **Topology** | `adjacency` | The graph structure! A matrix showing how nodes (qubits) are connected. Perfect for GNNs. |
| **Code** | `qasm_raw` | The raw text of the algorithm. Great for NLP/LLM tasks. |
| **Complexity** | `depth`, `gate_entropy` | Tabular features indicating how "deep" and "random" the graph is. |
| **Weights** | `total_gates`, `cx_count` | Node/edge counts. `cx_count` is the number of CX (CNOT) gates, i.e. two-qubit entangling interactions. |
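
Count features like these can in principle be recomputed from the `qasm_raw` text. The sketch below is a rough regex-based counter, not a full parser: it assumes OpenQASM 2.0 with one statement per `;` and does not handle custom `gate` definitions or `if` statements. The QASM snippet is hand-written for illustration, and `count_gates` is a hypothetical helper.

```python
import re
from collections import Counter

# Keywords that declare, include, or measure rather than apply a gate.
NON_GATES = {"OPENQASM", "include", "qreg", "creg", "measure",
             "barrier", "reset", "gate", "opaque"}

def count_gates(qasm: str) -> Counter:
    """Count gate names in an OpenQASM 2.0 string (rough, regex-based)."""
    counts = Counter()
    qasm = re.sub(r"//[^\n]*", "", qasm)  # drop line comments
    for statement in qasm.split(";"):
        # The gate name is the first identifier in the statement.
        match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_]*)", statement)
        if match and match.group(1) not in NON_GATES:
            counts[match.group(1)] += 1
    return counts

qasm = """
OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
h q[0];
rx(0.5) q[1];
cx q[0],q[1];
cx q[1],q[0];
"""
counts = count_gates(qasm)
print(sum(counts.values()), counts["cx"])
```

Comparing such recomputed counts against the provided columns is a quick sanity check before feature engineering.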
### Quick Start Idea
Try running **PCA** on the numeric features (`depth`, `gate_entropy`, `cx_count`, `adj_density`) to visualize the "DNA" of quantum algorithms in 2D space.
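
A minimal sketch of that projection, using NumPy's SVD on standardized features. The matrix `X` here is random stand-in data, not the dataset; with real data it would hold the numeric columns named above (assuming they exist under those names). The helper `pca_2d` is hypothetical.

```python
import numpy as np

def pca_2d(X: np.ndarray) -> np.ndarray:
    """Project the rows of X onto their first two principal components."""
    # Standardize so that columns on different scales contribute equally.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns (e.g. a fixed depth) stay at zero
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T

# Stand-in data: 200 rows, 4 numeric features (random, not from the dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
coords = pca_2d(X)
print(coords.shape)
```

The two output columns can be passed straight to a scatter plot, colored by circuit family to see whether families separate.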
### Load the Dataset
The dataset is stored in Parquet format inside the `data/shards/` folder. You can load it directly using the Hugging Face `datasets` library:
```python
from datasets import load_dataset
# Load the demo dataset (free)
dataset = load_dataset("QSBench/QSBench-Core-v1.0.0-demo", split="train")
# Inspect the first sample
print(dataset[0])
```
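If you prefer `pandas` and have the repository cloned locally:
```python
import glob
import pandas as pd
# pd.read_parquet does not expand wildcards in local paths, so list the shards first
shards = sorted(glob.glob("data/shards/*.parquet"))
df = pd.concat((pd.read_parquet(path) for path in shards), ignore_index=True)
print(df.head())
```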
### Example: Train a simple model on expectation values
```python
from sklearn.ensemble import RandomForestRegressor
import numpy as np
from datasets import load_dataset
# Load dataset
ds = load_dataset("QSBench/QSBench-Core-v1.0.0-demo")
# Use gate count as a simple feature
X_train = np.array([s["total_gates"] for s in ds["train"]]).reshape(-1, 1)
y_train = np.array([s["ideal_expval_Z_global"] for s in ds["train"]])
model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)
# Evaluate on test set
X_test = np.array([s["total_gates"] for s in ds["test"]]).reshape(-1, 1)
y_test = np.array([s["ideal_expval_Z_global"] for s in ds["test"]])
score = model.score(X_test, y_test)
print(f"R² score: {score:.4f}")
```
For more advanced usage (e.g., working with QASM strings or adjacency matrices), consult the metadata files (`schema.json`, `data_card.md`, etc.) in the `metadata` branch.
### Repository Structure
The dataset is stored in the `main` branch and contains only the data files to ensure the Dataset Viewer works correctly:
```
QSBench-Core-v1.0.0-demo/
├── README.md            # This file
└── data/                # Parquet shards (main data)
    └── shards/
        ├── *.parquet
        └── *.csv
```
All metadata files (coverage.json, schema.json, meta.json, data_card.md, etc.) are located in a separate branch called **`metadata`** to avoid interfering with the Dataset Viewer.
You can browse them here:
👉 [metadata branch](https://huggingface.co/datasets/QSBench/QSBench-Core-v1.0.0-demo/tree/metadata)
### Related QSBench Datasets
- QSBench Lite (20k samples, n=4)
- QSBench Core (75k samples, n=8)
- Depolarizing Noise Pack (150k samples)
- Amplitude Damping Pack (150k samples)
- Transpilation Hardware Pack (200k samples)
### Part of the QSBench Family
This is a small public **demo version**. Full‑scale datasets (20k–150k+ samples), noisy versions (Depolarizing, Amplitude Damping), and custom datasets are available.
[Repository](https://github.com/QSBench/QSBench-Core-v1.0.0-demo)
[Website & Full Catalog](https://qsbench.github.io)
**License**: CC BY‑NC 4.0 (Personal & Research Use)
**Questions or custom requests?** Visit our [website](https://qsbench.github.io) or open an issue on [GitHub](https://github.com/QSBench/QSBench-Core-v1.0.0-demo). To inspect the generation pipeline, see the **QSBench Generator** [repository](https://github.com/QSBench/QSBench-Generator).
### Support QSBench
You can support the project directly on this Giveth page:
**[https://giveth.io/project/qsbench](https://giveth.io/project/qsbench)**
Your donations help us generate larger datasets, cover GPU costs, and continue developing new realistic noise models.
---
*Generated with QSBench Generator v5.0.2*
Provider: QSBench



