petre-bit/BloodshotNet-Dataset

Source: Hugging Face | Updated: 2026-04-15 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/petre-bit/BloodshotNet-Dataset
Resource description:
---
license: cc-by-4.0
task_categories:
- object-detection
tags:
- blood-detection
- violence-detection
- forensics
- yolo
- nsfw
- bloodshotnet
size_categories:
- 10K<n<100K
---

> ⚠️ **NSFW & GRAPHIC CONTENT WARNING:**
> This dataset contains highly graphic, violent, and sensitive imagery, including simulated and real blood, serious injury, surgical scenes, and gore. Discretion is strongly advised before downloading, viewing, or utilizing this data.

<div align="center">
  <img src="figures/bit-studio-logo-yellow.svg" alt="Bit Studio" height="120">
  <h1>BloodshotNet Dataset</h1>
  <a href="https://wearebit.com/"><img src="https://img.shields.io/badge/Website-Bit_Studio-yellow.svg" alt="Bit Studio Homepage"></a> &nbsp;
  <a href="https://huggingface.co/your-username/BloodshotNet"><img src="https://img.shields.io/badge/🤗_Model-BloodshotNet-ff9d00.svg" alt="Model Weights"></a> &nbsp;
  <a href="https://creativecommons.org/licenses/by/4.0/"><img src="https://img.shields.io/badge/License-CC_BY_4.0-lightgrey.svg" alt="License: CC BY 4.0"></a>
</div>

## Dataset Summary

The `BloodshotNet-Dataset` is the official, large-scale, aggregated computer vision dataset designed to train **BloodshotNet** (a YOLO-based blood detection model). It contains 23,514 images curated from 16 public datasets, refined to include realistic positive samples (e.g., forensics, movie scenes) and challenging "hard negative" samples (e.g., red clothing, flowers, red vehicles) to prevent model overfitting and reduce false positives.

A key feature of this dataset is its **dual-compatibility**: it is structured to work completely out-of-the-box for YOLO (v11/v26) training, while also being fully native to the Hugging Face `datasets` library for general PyTorch/TensorFlow pipelines.

## Dual-Compatibility & How to Use

### 1. For YOLO Users (Plug & Play)

The dataset preserves standard YOLO formatting. You can clone this repository directly and start training immediately using the provided `data.yaml`.

* **Images:** Located in `images/train/`, `images/val/`, `images/test/`
* **Labels:** Located in `labels/train/`, `labels/val/`, `labels/test/` (Standard normalized YOLO `.txt` files)
* **Negative Images:** Background/negative images simply have empty `.txt` files in the `labels/` directory.

```bash
# Example YOLO training command
yolo task=detect mode=train data=path/to/BloodshotNet-Dataset/data.yaml model=yolo11n.pt epochs=100
```

### 2. For Hugging Face Users

You can load this dataset seamlessly into your Python environment with a single line of code. The dataset contains `metadata.jsonl` files in the image directories that automatically map the YOLO annotations into standard absolute pixel bounding boxes `[x_min, y_min, width, height]`.

```python
from datasets import load_dataset

# Loads the dataset with absolute bounding box coordinates
dataset = load_dataset("petre-bit/BloodshotNet-Dataset")
```

## Dataset Structure

**Data Splits**

- **Train**: 80% (18,809 images)
- **Validation**: 15% (3,528 images)
- **Test**: 5% (1,177 images)

**Composition & Classes**

- **60% Positive Images (Blood)**: Forensic blood spatter, UFC fights, gore/horror movie scenes, bloody prints, surgery scenes, etc.
- **40% Negative Images (Non-Blood)**: Visually confusing red objects and contexts (red dresses, red flowers, brake lights, crowded scenes, kitchen objects).
- **Classes**: 1 Class (`0: blood`)

## Preprocessing & Annotation Adjustments

- **Label Filtering**: All non-blood labels present in the original source datasets were strictly removed to isolate the `blood` class.
- **Bounding Box Conversion**: The source datasets contained a mix of bounding boxes and segmentation masks. All segmentation polygons were converted into tight bounding boxes by extracting the extreme `(x, y)` coordinate limits to ensure a standardized object detection format.
- **No Resizing Applied**: Images retain their original, diverse resolutions. Any resizing (e.g., to 640x640) should be handled dynamically by the model during the training/inference pipeline.

## Intended Use & Limitations

**Intended Use**: Research and development in forensics, automated content moderation (flagging violent media), and safety monitoring systems.

**Limitations**: The dataset relies heavily on cinematic representations of blood and specific forensic datasets, which may not encompass all real-world lighting conditions, surface textures, or scenarios.

## About the Creators & Attribution

This dataset was assembled, preprocessed, and formatted by the team at [Bit](https://wearebit.com/) to support the development of robust content moderation and detection systems.

### Data Sources

This dataset is an aggregation of 16 datasets originally hosted on Roboflow Universe. All original datasets are licensed under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license. Huge thanks to the creators:

- blood_segmentation by [blood-3pyjx](https://universe.roboflow.com/blood-3pyjx/blood_segmentation/dataset/1)
- blooddetection-j2wid by [orkun-lpkdc](https://universe.roboflow.com/orkun-lpkdc/blooddetection-j2wid/dataset/3)
- cars-cars-cars-gh8ga by [yasins-workspace-qbvyv](https://universe.roboflow.com/yasins-workspace-qbvyv/cars-cars-cars-gh8ga/dataset/1)
- crime-data-st by [harsha-cujyl](https://universe.roboflow.com/harsha-cujyl/crime-data-st/dataset/1)
- danger-place-detection by [md-hasibul-islam](https://universe.roboflow.com/md-hasibul-islam/danger-place-detection/dataset/4)
- dress-jhov8 by [animal-detection-q3wq1](https://universe.roboflow.com/animal-detection-q3wq1/dress-jhov8/dataset/1)
- flowers-pciqg by [my-workspace-ebwf4](https://universe.roboflow.com/my-workspace-ebwf4/flowers-pciqg/dataset/1)
- forensicvision-du5uz by [forensicvision](https://universe.roboflow.com/forensicvision/forensicvision-du5uz/dataset/1)
- horror-content-detector-0rv9z by [myproject-zxfbp](https://universe.roboflow.com/myproject-zxfbp/horror-content-detector-0rv9z/dataset/1)
- kitchen-gt6wi by [abinavn](https://universe.roboflow.com/abinavn/kitchen-gt6wi/dataset/2)
- movie-ywprp by [quanle-shsvi](https://universe.roboflow.com/quanle-shsvi/movie-ywprp/dataset/5)
- passive-and-transfer-stains by [thesis-epiei](https://universe.roboflow.com/thesis-epiei/passive-and-transfer-stains/dataset/2)
- rbrelabel by [tasfagvasd](https://universe.roboflow.com/tasfagvasd/rbrelabel/dataset/4)
- sexual_content-za0gn by [helmiworkshop-6o1xm](https://universe.roboflow.com/helmiworkshop-6o1xm/sexual_content-za0gn)
- two-guo-2 by [ownfallprincess](https://universe.roboflow.com/ownfallprincess/two-guo-2/dataset/2)
- video_modera by [videomoderation](https://universe.roboflow.com/videomoderation/video_modera/dataset/1)

**License:** Released under CC BY 4.0.
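
For readers who want to reproduce the box handling described under "Preprocessing & Annotation Adjustments" and in the Hugging Face section, the following is a minimal sketch, not the creators' actual preprocessing script. It assumes segmentation polygons are given as normalized `(x, y)` points (as in YOLO-style segmentation labels) and that the target YOLO box is `(x_center, y_center, width, height)` in normalized units. The function names `polygon_to_yolo_bbox` and `yolo_to_absolute_xywh` are hypothetical and do not come from the dataset repository.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
YoloBox = Tuple[float, float, float, float]


def polygon_to_yolo_bbox(polygon: Sequence[Point]) -> YoloBox:
    """Collapse a normalized polygon into a tight normalized YOLO box.

    Assumption: every point is (x, y) with both values in [0, 1].
    The box is built from the extreme x and y coordinates of the polygon.
    """
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (
        (x_min + x_max) / 2.0,  # x_center
        (y_min + y_max) / 2.0,  # y_center
        x_max - x_min,          # width
        y_max - y_min,          # height
    )


def yolo_to_absolute_xywh(box: YoloBox, img_w: int, img_h: int) -> List[float]:
    """Convert a normalized YOLO box to absolute [x_min, y_min, width, height]."""
    x_center, y_center, width, height = box
    abs_w = width * img_w
    abs_h = height * img_h
    x_min = x_center * img_w - abs_w / 2.0
    y_min = y_center * img_h - abs_h / 2.0
    return [x_min, y_min, abs_w, abs_h]


if __name__ == "__main__":
    # Hypothetical rectangle-shaped polygon on a hypothetical 100x50 image.
    example_polygon = [(0.2, 0.2), (0.6, 0.2), (0.6, 0.4), (0.2, 0.4)]
    yolo_box = polygon_to_yolo_bbox(example_polygon)
    print(yolo_box)
    print(yolo_to_absolute_xywh(yolo_box, img_w=100, img_h=50))
```

Because images keep their original resolutions, the pixel conversion needs each image's own width and height rather than a fixed size such as 640x640.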
Provider: petre-bit