ashleyscruse/noble-ai-evidence-benchmark

Name: ashleyscruse/noble-ai-evidence-benchmark
Creator: ashleyscruse
Published: 2026-04-21 11:43:46
License: cc-by-4.0

Hugging Face · Updated 2026-04-21 · Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/ashleyscruse/noble-ai-evidence-benchmark

Resource description:

---
license: cc-by-4.0
task_categories:
- image-classification
tags:
- ai-detection
- synthetic-image-detection
- law-enforcement
- benchmark
- image-forensics
size_categories:
- 1K<n<10K
---

# NOBLE AI-Generated Evidence Detection Benchmark

A benchmark dataset for evaluating AI-generated image detection tools on law enforcement content (bodycam, surveillance, evidence photos).

## Dataset Description

Existing AI image detection tools are trained on social media content. They have never been tested on the grainy, compressed, low-quality images typical of law enforcement contexts. This dataset fills that gap.

### Sources

| Type | Count | Source |
|------|-------|--------|
| Real images | 422 | Open Images V7 (validation split) |
| Synthetic images | 495 | Generated with 3 diffusion models |
| **Total** | **917** | |

### Real Image Categories

| Category | Count |
|----------|-------|
| People | 93 |
| Vehicles | 81 |
| Indoor scenes | 81 |
| Outdoor scenes | 82 |
| Objects | 85 |

### Synthetic Image Generators

| Generator | Count | Model ID |
|-----------|-------|----------|
| Stable Diffusion 1.5 | 165 | runwayml/stable-diffusion-v1-5 |
| OpenJourney v4 | 165 | prompthero/openjourney-v4 |
| Realistic Vision 5.1 | 165 | SG161222/Realistic_Vision_V5.1_noVAE |

### Degradation Levels

Each image appears at 3 quality levels simulating real-world law enforcement conditions:

| Level | Parameters | Simulates |
|-------|-----------|-----------|
| Clean | None | High-quality digital photos |
| Moderate | JPEG Q50 + blur sigma=1 + contrast 0.8x | Decent surveillance footage |
| Heavy | JPEG Q30 + downscale 50% + noise sigma=25 + blur sigma=2 | Poor bodycam / old CCTV |

## Dataset Structure

```
raw/
  real/{people,vehicles,indoor_scenes,outdoor_scenes,objects}/
  synthetic/{surveillance_security,evidence_style,bodycam_style,documents}/
processed/
  clean/{real,synthetic}/
  moderate/{real,synthetic}/
  heavy/{real,synthetic}/
results/
  metrics/
  figures/
```

## Preliminary Results

Evaluation of the Hugging Face AI image detector (`umm-maybe/AI-image-detector`) on this benchmark:

| Quality Level | Accuracy | F1 | AUC-ROC |
|---------------|----------|----|---------|
| Clean | 44.5% | 37.4% | 0.402 |
| Moderate | 45.1% | 17.1% | 0.381 |
| Heavy | 34.5% | 18.2% | 0.265 |

The detector performs **worse than random chance** on law enforcement content: accuracy stays below 50% and AUC-ROC below 0.5 at every quality level, and both are lowest on heavily degraded images.

## Uses

- Benchmarking AI-generated image detection tools on domain-specific content
- Studying the effect of image degradation on detection accuracy
- Training improved detection models for law enforcement contexts

## Citation

If you use this dataset, please cite:

```
@misc{scruse2026noble,
  title={NOBLE AI-Generated Evidence Detection Benchmark},
  author={Scruse, Ashley and Gosha, Kinnis},
  year={2026},
  publisher={HuggingFace},
}
```

## Funding

This work is funded by NOBLE (National Organization of Black Law Enforcement Executives) through a $25,000 research grant.

## Contact

**PI:** Dr. Ashley Scruse, Morehouse College
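
## Example: Applying the Degradation Levels

The Degradation Levels table lists parameters but not the code that applies them. The following is a minimal sketch using Pillow and NumPy, not the authors' pipeline. Several details are assumptions: the order of operations within each level, that noise sigma is expressed on the 0-255 pixel scale, that the 50% downscale is not followed by upsampling, and the helper names `jpeg_roundtrip` and `degrade`, which are hypothetical.

```python
import io

import numpy as np
from PIL import Image, ImageEnhance, ImageFilter


def jpeg_roundtrip(img, quality):
    """Encode the image as JPEG at the given quality, then decode it again."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


def degrade(img, level, seed=0):
    """Apply one of the three quality levels (operation order is an assumption)."""
    img = img.convert("RGB")
    if level == "clean":
        return img
    if level == "moderate":
        # blur sigma=1, contrast 0.8x, JPEG Q50
        img = img.filter(ImageFilter.GaussianBlur(radius=1))
        img = ImageEnhance.Contrast(img).enhance(0.8)
        return jpeg_roundtrip(img, quality=50)
    if level == "heavy":
        # downscale 50%, blur sigma=2, Gaussian noise sigma=25, JPEG Q30
        width, height = img.size
        img = img.resize((max(1, width // 2), max(1, height // 2)))
        img = img.filter(ImageFilter.GaussianBlur(radius=2))
        rng = np.random.default_rng(seed)
        pixels = np.asarray(img, dtype=np.float32)
        pixels = pixels + rng.normal(0.0, 25.0, size=pixels.shape)
        img = Image.fromarray(np.clip(pixels, 0, 255).astype(np.uint8))
        return jpeg_roundtrip(img, quality=30)
    raise ValueError(f"unknown level: {level!r}")


# Usage on a flat-colour stand-in image (a real photo would be opened with Image.open).
demo = Image.new("RGB", (640, 480), (120, 130, 140))
for level in ("clean", "moderate", "heavy"):
    print(level, degrade(demo, level).size)
```

JPEG compression is applied last here so that the compression artifacts survive in the output. If the released `processed/` images were produced in a different order, results will differ slightly.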
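
## Example: Computing the Reported Metrics

The Preliminary Results table reports accuracy, F1, and AUC-ROC. As an illustration of how such figures can be derived from detector outputs, here is a standard-library sketch. It assumes that label 1 means synthetic, that the detector emits a score in [0, 1] for "synthetic", and that predictions are thresholded at 0.5. None of these conventions are confirmed by the dataset card, and `binary_metrics` and the toy data are hypothetical.

```python
def binary_metrics(labels, scores, threshold=0.5):
    """Return accuracy, F1 (positive class = 1), and AUC-ROC for binary labels."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0

    # AUC-ROC as the probability that a random positive outscores a random
    # negative, counting ties as half (the Mann-Whitney formulation).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "f1": f1, "auc_roc": auc}


# Toy data: three synthetic (1) and three real (0) images with made-up scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.7, 0.4, 0.2, 0.8, 0.6, 0.1]
print(binary_metrics(labels, scores))
```

An AUC-ROC below 0.5, as in the table, means the detector's scores rank real images above synthetic ones more often than the reverse.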

Provided by:

ashleyscruse
