
Nilanjan-2002/fashion-second-hand-front-only-rgb

Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/Nilanjan-2002/fashion-second-hand-front-only-rgb
Description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: brand
    dtype: string
  - name: usage
    dtype: string
  - name: condition
    dtype: int64
  - name: type
    dtype: string
  - name: category
    dtype: string
  - name: price
    dtype: string
  - name: trend
    dtype: string
  - name: colors
    dtype: string
  - name: cut
    dtype: string
  - name: pattern
    dtype: string
  - name: season
    dtype: string
  - name: text
    dtype: string
  - name: pilling
    dtype: int64
  - name: damage
    dtype: string
  - name: stains
    dtype: string
  - name: holes
    dtype: string
  - name: smell
    dtype: string
  - name: material
    dtype: string
  splits:
  - name: train
    num_bytes: 6264676808.824
    num_examples: 28248
  - name: test
    num_bytes: 281469483.32
    num_examples: 3390
  download_size: 2843670317
  dataset_size: 6546146292.144
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
license: cc-by-4.0
language:
- en
- sv
pretty_name: second-ha
---

# Clothing Dataset for Second-Hand Fashion

This dataset contains only the front image and labels from **version 3** of the following dataset released on Zenodo: [Clothing Dataset for Second-Hand Fashion](https://zenodo.org/records/13788681)

Three changes were made:

- **Front image**: Only the front image is included here. The back and brand-label images are not.
- **Background removal**: The background of the front image was removed using [BiRefNet](https://huggingface.co/ZhengPeng7/BiRefNet), which only supports images up to 1024x1024, so larger images were resized. The background removal is not perfect and some artifacts remain.
- **Rotation**: The images were rotated to a vertical orientation. Our internal experiments showed that re-orienting the images alone can boost zero-shot performance.

The following contains most of the details copied from the original source at Zenodo.
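The resize and rotation steps described above can be sketched with plain size arithmetic. This is an illustration only: the helper names `fit_within` and `to_vertical`, the aspect-ratio-preserving resize, and the 90-degree turn for landscape frames are assumptions, not the preprocessing code actually used for this dataset.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so neither side exceeds max_side.

    Assumption: the aspect ratio is preserved; the card does not say how
    the resize was actually performed.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already within the limit, leave untouched
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def to_vertical(width: int, height: int) -> tuple[int, int]:
    """Size after turning a landscape frame by 90 degrees (assumed rotation)."""
    return (height, width) if width > height else (width, height)


# The two source resolutions mentioned for the original images.
for source_size in [(1280, 720), (1920, 1080)]:
    resized = fit_within(*source_size)
    print(source_size, "->", resized, "->", to_vertical(*resized))
```

With an imaging library, the same sizes could be passed to its resize and rotate calls; the arithmetic is kept separate here so it can be checked without loading any images.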
## Code

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("fnauman/fashion-second-hand-front-only-rgb")

# Access the training split
train_dataset = dataset["train"]

# Print basic information
print(f"Dataset size: {len(train_dataset)} images")  # 28248
print(f"Features: {train_dataset.features}")  # 19 features

# Access an example
example = train_dataset[0]
image = example["image"]

# # Display the image - notebook
# from IPython.display import display
# display(example["image"])

print(f"Brand: {example['brand']}, Category: {example['category']}")
# Output: Brand: Soc (stadium), Category: Ladies
```

## Dataset Details

### Dataset Description

The dataset originates from projects focused on sorting used clothes in a sorting facility. The primary objective is to classify each garment into one of several categories that determine its ultimate destination: reuse, reuse outside Sweden (export), recycling, repair, remake, or thermal waste.

The dataset has **31,638** clothing items, up from the 3,000 items in version 1. Data collection started under the Vinnova-funded project "AI for resource-efficient circular fashion" in spring 2022 and involves collaboration among three institutions: RISE Research Institutes of Sweden AB, Wargön Innovation AB, and Myrorna AB. The dataset has received further support through the EU project CISUTAC (cisutac.eu).

- **Data collected by:** [Wargön Innovation AB](https://wargoninnovation.se/), [Myrorna AB](https://www.myrorna.se/)
- **Curation, cleaning and release by**: [RISE Research Institutes of Sweden AB](https://www.ri.se/en)
- **Funded by:** [Vinnova](https://www.vinnova.se/en/p/ai-for-resource-efficient-circular-fashion/), [CISUTAC - EU Horizon](https://www.cisutac.eu/)
- **License:** CC-BY 4.0

### Dataset Sources

- **Repository:** [Clothing Dataset for Second-Hand Fashion](https://zenodo.org/records/13788681)

## Uses

- Usage prediction in sorting facilities.
- Detection of attributes that are specific to used or second-hand garments: condition scale (1-5), stains, holes, etc.

## Dataset Structure

- The dataset contains 31,638 clothing items, each with a unique item ID in a datetime format. The items are divided into three stations: `station1`, `station2`, and `station3`. The `station1` and `station2` folders contain images and annotations from Wargön Innovation AB, while the `station3` folder contains data from Myrorna AB. Each clothing item has three images and a JSON file containing annotations.
- Three images are provided for each clothing item:
  1. Front view.
  2. Back view.
  3. Brand label close-up. About 4,000-5,000 brand images are missing because of privacy concerns: people's hands, faces, etc. Some clothing items did not have a brand label to begin with.
- Image resolutions are primarily in two sizes: `1280x720` and `1920x1080`. The background is a table with a measuring tape in images taken before January 2023; later images show a square grid pattern with each square measuring `10x10` cm.
- Each JSON file contains a list of annotations, some of which require nuanced interpretation (see `labels.py` for the options):
  - `usage`: Arguably the most critical label, usage indicates the garment's intended pathway.
    Options include 'Reuse,' 'Repair,' 'Remake,' 'Recycle,' 'Export' (reuse outside Sweden), and 'Energy recovery' (thermal waste). About 99% of the garments fall into the 'Reuse,' 'Export,' or 'Recycle' categories.
  - `trend`: This field refers to the general style of the garment, not a time-dependent trend as in some other datasets (e.g., Visuelle 2.0). It might be more accurately labeled as 'style.'
  - `material`: Material annotations are mostly based on the readings from a Near Infrared (NIR) scanner and in some cases from the garment's brand label.
  - Damage-related attributes include:
    - `condition` (1-5 scale, 5 being the best)
    - `pilling` (1-5 scale, 5 meaning no pilling)
    - `stains`, `holes`, `smell` (each with options 'None,' 'Minor,' 'Major'). Note: 'holes' and 'smell' were introduced after November 17th, 2022, and stains previously only had 'Yes'/'No' options.

    For `station1` and `station2`, we introduced additional damage location labels to assist in damage detection:

        "damageimage": "back",
        "damageloc": "bottom left",
        "damage": "stain ",
        "damage2image": "front",
        "damage2loc": "None",
        "damage2": "",
        "damage3image": "back",
        "damage3loc": "bottom right",
        "damage3": "stain"

    Taken from `labels_2024_04_05_08_47_35.json` file. Additionally, we annotated a few hundred images with bounding box annotations that we aim to release at a later date.
  - `comments`: The comments field is mostly empty, but sometimes contains important information about the garment, such as a detailed text description of the damage.
- Whenever possible, ISO standards have been followed to define these attributes on a 1-5 scale (e.g., `pilling`).
- Gold dataset: 100 garments were annotated multiple times by different annotators for **annotator agreement comparisons**. These 100 garments are placed inside a separate folder `test100`.
- The data has been annotated by a group of expert second-hand sorters at Wargön Innovation AB and Myrorna AB.
- Some attributes, such as `price`, should be treated with caution. Many distinct pricing models exist in the second-hand industry:
  - Price by weight
  - Price by brand and demand (similar to first-hand fashion)
  - Generic pricing at a fixed value (e.g., 1 Euro or 10 SEK)

  Wargön Innovation AB does not set prices in practice, so its prices are suggestive only (`station1` and `station2`). Myrorna AB (`station3`), in contrast, resells garments and sets its own prices.

## Citation

Nauman, F. (2024). Clothing Dataset for Second-Hand Fashion (Version 3) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13788681
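Returning to the numbered damage location labels quoted under Dataset Structure (`damage`, `damage2`, `damage3` with their `...image` and `...loc` companions), one way to work with them is to fold the flat keys into a list of records. The sketch below is hypothetical: `collect_damages` is an illustrative helper, and it assumes the flat key layout shown in the quoted excerpt from the original Zenodo JSON files, which may differ from the single `damage` string column in this Hugging Face version.

```python
def collect_damages(annotation: dict) -> list[dict]:
    """Gather the numbered damage fields into one list, skipping empty slots."""
    damages = []
    for suffix in ("", "2", "3"):  # keys: damage, damage2, damage3
        # Values can carry stray whitespace (e.g. "stain ") or be empty.
        kind = (annotation.get(f"damage{suffix}") or "").strip()
        if not kind:
            continue
        damages.append({
            "kind": kind,
            "image": annotation.get(f"damage{suffix}image"),
            "location": annotation.get(f"damage{suffix}loc"),
        })
    return damages


# Values copied from the excerpt quoted in the Dataset Structure section.
example = {
    "damageimage": "back", "damageloc": "bottom left", "damage": "stain ",
    "damage2image": "front", "damage2loc": "None", "damage2": "",
    "damage3image": "back", "damage3loc": "bottom right", "damage3": "stain",
}

for entry in collect_damages(example):
    print(entry)
```

Skipping slots whose damage type is empty avoids counting placeholders such as the second entry in the excerpt, which names an image side but records no damage.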
Provided by:
Nilanjan-2002