alikhan126/loato-bench-artifacts
Source: Hugging Face. Updated 2026-03-21; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/alikhan126/loato-bench-artifacts
Resource description:
---
license: mit
task_categories:
- text-classification
language:
- en
tags:
- prompt-injection
- security
- embeddings
- loato
- capstone
size_categories:
- 10K<n<100K
---
# LOATO-Bench Artifacts
Pre-computed embeddings, experiment results, and dataset files for the **LOATO-Bench** project — studying cross-attack generalization of embedding-based prompt injection classifiers.
**GitHub repo**: [alikhan126/loato-bench](https://github.com/alikhan126/loato-bench)
## What's in this repo
| Path | Size | Description |
|------|------|-------------|
| `embeddings/minilm/` | 107 MB | all-MiniLM-L6-v2 (384d) embeddings for 68,845 samples |
| `embeddings/bge_large/` | 283 MB | BGE-large-en-v1.5 (1024d) embeddings |
| `embeddings/instructor/` | 283 MB | Instructor-large (1024d) embeddings |
| `embeddings/openai_small/` | 424 MB | text-embedding-3-small (1536d) embeddings |
| `embeddings/e5_mistral/` | 1.1 GB | E5-Mistral-7B GGUF Q4 (4096d) embeddings |
| `results/experiments/` | 256 KB | 30 experiment result JSONs (5 models × 3 classifiers × 2 protocols) |
| `data/processed/labeled_v1.parquet` | 11 MB | Final labeled dataset (68,845 samples: 40,017 benign / 28,828 injection) |
| `data/processed/unified_dataset.parquet` | 11 MB | Pre-labeling harmonized dataset |
| `data/splits/` | 6 MB | Train/test split indices for all 4 evaluation protocols |
## Why this exists
The embedding step takes **10+ hours** to run from scratch (E5-Mistral alone takes ~8 hours on an M3 Pro). By hosting pre-computed artifacts here, anyone can reproduce the full experiment pipeline in minutes instead of hours.
## How to use
### Option 1: Download script (recommended)
From the [loato-bench](https://github.com/alikhan126/loato-bench) repo:
```bash
# Set your HF token in .env
echo "HF_TOKEN=your_token_here" >> .env
# Download everything (~2.2 GB)
uv run python scripts/download_artifacts.py
# Or download selectively
uv run python scripts/download_artifacts.py --only embeddings
uv run python scripts/download_artifacts.py --only results
uv run python scripts/download_artifacts.py --only data
```
### Option 2: Python API
```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="alikhan126/loato-bench-artifacts",
    repo_type="dataset",
    local_dir="./artifacts",
    token="your_token_here",  # or set HF_TOKEN env var
)
```
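If only part of the repo is needed, `snapshot_download` also accepts an `allow_patterns` argument. The sketch below builds glob patterns from the paths listed in the table above. The helper names `build_patterns` and `download_subset` are illustrative and are not part of the loato-bench repo; the token is assumed to come from the `HF_TOKEN` environment variable if one is required.

```python
from typing import Iterable, List


def build_patterns(embedding_dirs: Iterable[str] = (),
                   results: bool = False,
                   data: bool = False) -> List[str]:
    """Build glob patterns for the repo paths listed in the table above."""
    patterns = [f"embeddings/{name}/*" for name in embedding_dirs]
    if results:
        patterns.append("results/experiments/*")
    if data:
        patterns.append("data/*")
    return patterns


def download_subset(patterns: List[str], local_dir: str = "./artifacts") -> str:
    """Download only files matching `patterns` (hypothetical helper)."""
    # Imported lazily so build_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="alikhan126/loato-bench-artifacts",
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=patterns,
    )


# Example: MiniLM embeddings plus experiment results only.
patterns = build_patterns(["minilm"], results=True)
# download_subset(patterns)  # uncomment to fetch (requires network access)
```

Skipping `embeddings/e5_mistral/` this way avoids the largest single download (1.1 GB).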
### Option 3: Git clone
```bash
git lfs install
git clone https://huggingface.co/datasets/alikhan126/loato-bench-artifacts
```
## Embedding format
Each model's embeddings are stored as a compressed NumPy file (`.npz`):
```python
import numpy as np

data = np.load("embeddings/minilm/embeddings.npz")
embeddings = data["embeddings"]  # shape: (68845, dim)
sample_ids = data["sample_ids"]  # shape: (68845,)
```
The `meta.json` sidecar contains model name, dimensions, sample count, and a text hash for cache validation.
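A minimal sketch of loading one model's embeddings with a basic consistency check, then reordering rows to match a list of sample IDs (for example, the IDs in a split file). It assumes `meta.json` sits in the same directory as `embeddings.npz`; the function names are hypothetical, and the sidecar is returned as a plain dict because its exact key names are not documented here.

```python
import json
from pathlib import Path

import numpy as np


def load_embeddings(model_dir):
    """Load embeddings.npz and, if present, the meta.json sidecar."""
    model_dir = Path(model_dir)
    with np.load(model_dir / "embeddings.npz") as data:
        embeddings = data["embeddings"]
        sample_ids = data["sample_ids"]

    # Every embedding row should have exactly one sample ID.
    if embeddings.shape[0] != sample_ids.shape[0]:
        raise ValueError("embeddings and sample_ids have different lengths")

    # Assumption: the sidecar lives next to the .npz file.
    meta_path = model_dir / "meta.json"
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else {}
    return embeddings, sample_ids, meta


def align_to(embeddings, sample_ids, wanted_ids):
    """Return embedding rows in the order given by wanted_ids."""
    row_of = {sid: i for i, sid in enumerate(sample_ids.tolist())}
    rows = [row_of[sid] for sid in wanted_ids]
    return embeddings[rows]
```

Looking rows up by `sample_ids` rather than by position guards against the embedding file and the dataset file being ordered differently.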
## Experiment results format
Each JSON file contains per-fold metrics (F1, accuracy, AUC-ROC, precision, recall) for a specific embedding × classifier × experiment combination:
```
results/experiments/{experiment}_{embedding}_{classifier}.json
```
Example: `loato_e5_mistral_mlp.json` = E5-Mistral embeddings + MLP classifier under LOATO protocol.
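Because some embedding keys contain underscores (for example `e5_mistral`), splitting a filename on `_` alone is ambiguous. The sketch below matches against known embedding keys instead. It assumes result filenames use the same keys as the directory names under `embeddings/`, which the example above confirms only for `e5_mistral`; `parse_result_name` is a hypothetical helper.

```python
from pathlib import Path
from typing import NamedTuple

# Embedding keys taken from the directory names under embeddings/.
# Assumption: result filenames use the same keys.
EMBEDDING_KEYS = ("minilm", "bge_large", "instructor", "openai_small", "e5_mistral")


class ResultName(NamedTuple):
    experiment: str
    embedding: str
    classifier: str


def parse_result_name(filename: str) -> ResultName:
    """Split '{experiment}_{embedding}_{classifier}.json' into its parts."""
    stem = Path(filename).stem
    for key in EMBEDDING_KEYS:
        marker = f"_{key}_"
        if marker in stem:
            experiment, classifier = stem.split(marker, 1)
            return ResultName(experiment, key, classifier)
    raise ValueError(f"no known embedding key in {filename!r}")


print(parse_result_name("loato_e5_mistral_mlp.json"))
# ResultName(experiment='loato', embedding='e5_mistral', classifier='mlp')
```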
## Dataset
68,845 samples from 9 public sources:
- **Injection** (28,828): Open-Prompt-Injection, HackAPrompt, PINT/Gandalf, Deepset
- **Benign** (40,017): Dolly 15K, Alpaca (cleaned), OASST1, WildChat (nontoxic)
Labeled with a 3-tier pipeline (source maps → regex → GPT-4o-mini) into 7 attack categories.
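The tiered labeling can be pictured as a cascade in which each sample takes its label from the first tier that produces one. The sketch below only illustrates that control flow: the category names, source key, and regex are placeholders, not the project's actual taxonomy or rules, and the LLM tier is a stub callable rather than a real GPT-4o-mini call.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# All category names, source keys, and patterns below are placeholders
# for illustration; they are NOT the project's actual taxonomy or rules.
SOURCE_MAP: Dict[str, str] = {"example_source": "category_a"}
REGEX_RULES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"ignore (all )?previous instructions", re.IGNORECASE), "category_b"),
]


def label_sample(text: str, source: str,
                 llm_fallback: Optional[Callable[[str], str]] = None) -> Tuple[str, str]:
    """Return (category, tier) using the first tier that yields a label."""
    # Tier 1: the source dataset already implies a category.
    if source in SOURCE_MAP:
        return SOURCE_MAP[source], "source_map"
    # Tier 2: a regex rule matches the text.
    for pattern, category in REGEX_RULES:
        if pattern.search(text):
            return category, "regex"
    # Tier 3: defer to an LLM (GPT-4o-mini in the project); stubbed here.
    if llm_fallback is not None:
        return llm_fallback(text), "llm"
    return "unlabeled", "none"
```

Ordering the tiers from cheapest to most expensive means the LLM is consulted only for samples that the source map and regex rules cannot resolve.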
## License
MIT — Academic use (Pace University MS Data Science Capstone).
Provider:
alikhan126



