Thermostatic/sat-captchas-v1
Source: Hugging Face; updated 2026-03-25; indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/Thermostatic/sat-captchas-v1
Description:
---
license: mit
task_categories:
- image-classification
language:
- en
pretty_name: SAT Captchas V1
size_categories:
- 100K<n<1M
---
# SAT Captchas V1
This dataset archive contains CAPTCHA images from the SAT CFDI verification flow targeted by the `sat-captcha-solver` project.
## Release Context
As of the morning of March 25, 2026, the specific CAPTCHA targeted by this project had been deprecated. That deprecation is the reason the related code and this dataset archive were made public.
## Contents
- `images.tar`: archive containing the CAPTCHA image files under `images/`
- `labels.csv`: filename-to-label manifest
## Format
`labels.csv` has the schema:
```text
filename,label
10010_5c195ffcb437.jpg,10010
```
Each `filename` entry maps to a `.jpg` inside the `images/` directory contained in `images.tar`.
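As a minimal sketch of how the manifest could be consumed after extraction, the Python snippet below reads `labels.csv` and pairs each label with the path of its image under `images/`. The function name `load_labels` and its parameters are illustrative and not part of the dataset or of `sat-captcha-solver`. Labels are kept as strings on the assumption that any leading zeros should be preserved.

```python
import csv
from pathlib import Path


def load_labels(csv_path, images_dir="images"):
    """Return a list of (image_path, label) pairs from the manifest.

    Hypothetical helper: assumes the `filename,label` header shown above.
    """
    pairs = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Keep the label as text so no digits are lost to int conversion.
            pairs.append((Path(images_dir) / row["filename"], row["label"]))
    return pairs
```

Calling `load_labels("labels.csv")` from the directory where `images.tar` was extracted would then give paths that can be opened directly.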
## Extraction
```bash
tar -xf images.tar
```
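When unpacking every file onto disk is undesirable, the images can also be read straight from the archive. The sketch below uses Python's standard `tarfile` module. `iter_images` is an illustrative name, and the `.jpg` suffix filter follows the layout described in this card.

```python
import tarfile


def iter_images(tar_path="images.tar"):
    """Yield (member_name, raw_bytes) for each .jpg in the archive.

    Hypothetical helper: nothing is written to disk.
    """
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(".jpg"):
                # extractfile returns a file-like object for regular files.
                yield member.name, tar.extractfile(member).read()
```

The yielded bytes could be decoded with an image library of your choice. The member name (for example `images/10010_5c195ffcb437.jpg`) can be matched against the `filename` column of `labels.csv` by taking its base name.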
## Notes
- This repository stores the dataset as an archive to keep the Hugging Face upload practical and to avoid the operational overhead of publishing more than 135k individual files.
- The corresponding code repository is the public `sat-captcha-solver` project.
Provider:
Thermostatic



