intrect/artifactbench

Source: Hugging Face · Updated 2026-04-20 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/intrect/artifactbench
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- audio-classification
tags:
- ai-music-detection
- benchmark
- forensic
- audio
language:
- en
size_categories:
- 1K<n<10K
---

# ArtifactBench v1: AI-Generated Music Detection Benchmark

A multi-generator evaluation benchmark for AI-generated music forensic detection, covering 22 AI generators and 6 real music sources.

## Dataset Description

- **Total tracks**: 8,766 (4,383 AI + 4,383 Real, 1:1 balanced)
- **AI generators**: 22 (MusicGen, Stable Audio, Suno v3/v3.5/v4, Udio, Riffusion, DiffRhythm, Yue, Chirp v2/v3/v3.5, etc.)
- **Real sources**: 6 (SONICS, MoM, FMA, YouTube)
- **Format**: AI tracks as Parquet (audio bytes embedded), Real tracks as CSV (YouTube IDs for user download)

## Motivation

Existing benchmarks (SONICS: 5 generators, MoM: 6 generators) only measure in-distribution performance. Models reporting high F1 on these benchmarks fail catastrophically on out-of-distribution generators:

- CLAM (194M params, F1=0.925 on MoM) → F1=0.824 on ArtifactBench
- SpecTTTra (19M params, F1=0.97 on SONICS) → F1=0.766 on ArtifactBench

ArtifactBench evaluates what matters for deployment: generalization across diverse generators.
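
To put the two F1 figures above side by side, the drop from each model's home benchmark to ArtifactBench can be computed directly. This is a minimal arithmetic sketch using only the numbers quoted in the Motivation section; the helper name `f1_gap` and the dictionary layout are illustrative and not part of the `artifactbench` package.

```python
# Reported F1 scores, copied from the Motivation section.
REPORTED = {
    "CLAM": {"in_dist": 0.925, "artifactbench": 0.824},
    "SpecTTTra": {"in_dist": 0.97, "artifactbench": 0.766},
}


def f1_gap(in_dist: float, out_dist: float) -> tuple[float, float]:
    """Return (absolute drop, drop as a fraction of the in-distribution F1)."""
    absolute = in_dist - out_dist
    return absolute, absolute / in_dist


for name, scores in REPORTED.items():
    absolute, relative = f1_gap(scores["in_dist"], scores["artifactbench"])
    print(f"{name}: -{absolute:.3f} F1 ({relative:.1%} relative)")
```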

## Sanity Check Protocol

Per-source pass/fail thresholds:

- Real source FPR ≤ 5%
- AI source TPR ≥ 90% (Stable Audio: ≥ 60%)
- Codec invariance: mean Δ ≤ 0.15, max Δ ≤ 0.35

## Baseline Results

| Model | Params | F1 | FAIL | Suno v4 TPR | Real FPR |
|---|---|---|---|---|---|
| **ArtifactNet v9.4** | **4.2M** | **0.983** | **4/28** | **98%** | **1.5%** |
| CLAM (MoM) | 194M | 0.824 | 16/28 | 78% | 70.5% |
| SpecTTTra | 19M | 0.766 | 23/28 | 55% | 21.4% |

## Usage

```python
from artifactbench.bench import main
# or
# python -m artifactbench.bench --model artifactnet --manifest artifactbench_v1_manifest.json
```

## Per-Source Breakdown (v1.0.1)

| Source | Class | Tracks | bench_origin: test | Generator |
|---|---|---|---|---|
| aime_musicgen_large | AI | 200 | 30 | MusicGen Large |
| aime_musicgen_medium | AI | 200 | 30 | MusicGen Medium |
| aime_musicgen_small | AI | 200 | 30 | MusicGen Small |
| aime_riffusion | AI | 200 | 30 | Riffusion |
| aime_stable_audio_v1 | AI | 200 | 50 | Stable Audio v1 |
| aime_stable_audio_v2 | AI | 200 | 50 | Stable Audio v2 |
| aime_suno_v3 | AI | 200 | 30 | Suno v3 |
| aime_suno_v35 | AI | 200 | 30 | Suno v3.5 |
| aime_udio | AI | 200 | 30 | Udio (AIME) |
| mom_diffrythm | AI | 200 | 100 | DiffRhythm |
| mom_riffusion | AI | 200 | 100 | Riffusion (MoM) |
| mom_udio | AI | 200 | 100 | Udio (MoM) |
| mom_yue | AI | 200 | 100 | Yue |
| sonics_chirp-v2-xxl-alpha | AI | 200 | 80 | Chirp v2 |
| sonics_chirp-v3 | AI | 200 | 80 | Chirp v3 |
| sonics_chirp-v3.5 | AI | 200 | 80 | Chirp v3.5 |
| sonics_udio-120s | AI | 200 | 80 | Udio 120s |
| sonics_udio-30s | AI | 200 | 80 | Udio 30s |
| suno_cdn_latest | AI | 200 | 100 | Suno CDN (post-freeze) |
| suno_extra | AI | 200 | 80 | Suno extras |
| **udio_cdn_latest** | AI | **200** | 35 | Udio CDN (post-freeze), v1.0.1 balanced |
| udio_extra | AI | 200 | 80 | Udio extras |
| sonics_real | Real | 500 | 300 | SONICS real partition |
| mom_real | Real | 400 | 200 | MoM real (mp3 + wav) |
| fma_hardneg | Real | 300 | 150 | FMA mp3 hard-negatives |
| mom_extra_real | Real | 200 | 110 | MoM extra real |
| mom_real_wav | Real | 200 | 42 | MoM real WAV variants |
| youtube_hardneg | Real | 200 | 73 | YouTube curated hard-negatives |
| **TOTAL** | - | **6,200** | **2,280** | 28 sources, 22 AI generators |

Real sources are intentionally over-represented (1,800 total) to enable rigorous FPR estimation across diverse codec and production conditions.

## Files

- `artifactbench_v1_manifest.json`: Track manifest with bench_origin tags
- `metadata.json`: Dataset statistics and generator list

## Citation

```bibtex
@article{oh2026artifactnet,
  title         = {ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics},
  author        = {Oh, Heewon},
  journal       = {arXiv preprint arXiv:2604.16254},
  year          = {2026},
  eprint        = {2604.16254},
  archivePrefix = {arXiv},
  primaryClass  = {cs.SD},
  doi           = {10.48550/arXiv.2604.16254},
  url           = {https://arxiv.org/abs/2604.16254}
}
```

**arXiv**: [2604.16254](https://arxiv.org/abs/2604.16254) · **DOI**: [10.48550/arXiv.2604.16254](https://doi.org/10.48550/arXiv.2604.16254)

## License

CC BY-NC 4.0
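
The per-source thresholds listed under Sanity Check Protocol can be expressed as a small pass/fail check. The sketch below is an illustration under stated assumptions, not the benchmark's actual implementation: the function names are hypothetical, Stable Audio sources are assumed to be recognizable by `stable_audio` in the source name (as in the breakdown table), and Δ is assumed to be the per-track absolute change in detector score after re-encoding, which the card does not define.

```python
from statistics import mean

# Thresholds as stated in the Sanity Check Protocol section.
REAL_FPR_MAX = 0.05
AI_TPR_MIN = 0.90
STABLE_AUDIO_TPR_MIN = 0.60
CODEC_MEAN_DELTA_MAX = 0.15
CODEC_MAX_DELTA_MAX = 0.35


def source_passes(source: str, label: str, flagged_as_ai: list[bool]) -> bool:
    """Apply the FPR/TPR threshold to one source.

    flagged_as_ai[i] is True when the detector called track i AI-generated.
    """
    if not flagged_as_ai:
        raise ValueError("no predictions for source " + source)
    flagged_rate = sum(flagged_as_ai) / len(flagged_as_ai)
    if label == "Real":
        # Every flagged real track is a false positive.
        return flagged_rate <= REAL_FPR_MAX
    # Assumption: the relaxed threshold applies to sources named *stable_audio*.
    min_tpr = STABLE_AUDIO_TPR_MIN if "stable_audio" in source else AI_TPR_MIN
    return flagged_rate >= min_tpr


def codec_invariant(deltas: list[float]) -> bool:
    """Check both codec-invariance bounds on assumed per-track score deltas."""
    return mean(deltas) <= CODEC_MEAN_DELTA_MAX and max(deltas) <= CODEC_MAX_DELTA_MAX
```

Counting the sources for which `source_passes` returns `False` would give a figure comparable in spirit to the FAIL column of the baseline table, although that column may also include the codec check.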
Provider: intrect