sdcfsdsfsdds/synthdog-en
Source: Hugging Face. Updated 2025-12-06. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/sdcfsdsfsdds/synthdog-en
Resource description:
## Donut 🍩 : OCR-Free Document Understanding Transformer (ECCV 2022) -- SynthDoG datasets
For more information, please visit https://github.com/clovaai/donut

The links to the SynthDoG-generated datasets are here:
- [`synthdog-en`](https://huggingface.co/datasets/naver-clova-ix/synthdog-en): English, 0.5M.
- [`synthdog-zh`](https://huggingface.co/datasets/naver-clova-ix/synthdog-zh): Chinese, 0.5M.
- [`synthdog-ja`](https://huggingface.co/datasets/naver-clova-ix/synthdog-ja): Japanese, 0.5M.
- [`synthdog-ko`](https://huggingface.co/datasets/naver-clova-ix/synthdog-ko): Korean, 0.5M.

To generate synthetic datasets with SynthDoG, please see `./synthdog/README.md` and [our paper](#how-to-cite) for details.
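
The datasets listed above can be pulled from the Hugging Face Hub. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` package is installed, that each repository exposes a `train` split, and that every sample carries an `image` field plus a `ground_truth` JSON string shaped like `{"gt_parse": {"text_sequence": "..."}}`. The helper names (`synthdog_repo_id`, `extract_text`, `peek`) are hypothetical and exist only for illustration.

```python
import json

# Language codes taken from the dataset list above.
SYNTHDOG_LANGUAGES = ("en", "zh", "ja", "ko")


def synthdog_repo_id(lang: str) -> str:
    """Build the Hub repository ID for one of the listed SynthDoG datasets."""
    if lang not in SYNTHDOG_LANGUAGES:
        raise ValueError(f"unknown SynthDoG language: {lang!r}")
    return f"naver-clova-ix/synthdog-{lang}"


def extract_text(ground_truth: str) -> str:
    """Return the text from a ground-truth JSON string.

    Assumed layout (verify against the actual data):
    {"gt_parse": {"text_sequence": "..."}}
    """
    parsed = json.loads(ground_truth)
    return parsed["gt_parse"]["text_sequence"]


def peek(lang: str = "en", n: int = 3) -> None:
    """Stream the first n samples without downloading the full dataset."""
    # Imported here so the helpers above work without the package installed.
    from datasets import load_dataset

    # The split name "train" is an assumption about the repository layout.
    ds = load_dataset(synthdog_repo_id(lang), split="train", streaming=True)
    for sample in ds.take(n):
        # The "image" and "ground_truth" field names are assumptions.
        print(sample["image"].size, extract_text(sample["ground_truth"])[:80])


# peek("en")  # uncomment to stream a few samples (requires network access)
```

Streaming is used here because each dataset holds roughly 0.5M samples, so inspecting a few records does not require a full download.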
## How to Cite
If you find this work useful to you, please cite:
```bibtex
@inproceedings{kim2022donut,
title = {OCR-Free Document Understanding Transformer},
author = {Kim, Geewook and Hong, Teakgyu and Yim, Moonbin and Nam, JeongYeon and Park, Jinyoung and Yim, Jinyeong and Hwang, Wonseok and Yun, Sangdoo and Han, Dongyoon and Park, Seunghyun},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2022}
}
```
Provided by:
sdcfsdsfsdds



