
sdcfsdsfsdds/synthdog-en

Source: Hugging Face. Updated 2025-12-06; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/sdcfsdsfsdds/synthdog-en
Description:
## Donut 🍩 : OCR-Free Document Understanding Transformer (ECCV 2022) -- SynthDoG datasets

For more information, please visit https://github.com/clovaai/donut

![image](https://github.com/clovaai/donut/blob/master/misc/sample_synthdog.png?raw=true)

The links to the SynthDoG-generated datasets are here:

- [`synthdog-en`](https://huggingface.co/datasets/naver-clova-ix/synthdog-en): English, 0.5M.
- [`synthdog-zh`](https://huggingface.co/datasets/naver-clova-ix/synthdog-zh): Chinese, 0.5M.
- [`synthdog-ja`](https://huggingface.co/datasets/naver-clova-ix/synthdog-ja): Japanese, 0.5M.
- [`synthdog-ko`](https://huggingface.co/datasets/naver-clova-ix/synthdog-ko): Korean, 0.5M.

To generate synthetic datasets with our SynthDoG, please see `./synthdog/README.md` and [our paper](#how-to-cite) for details.

## How to Cite

If you find this work useful to you, please cite:

```bibtex
@inproceedings{kim2022donut,
  title     = {OCR-Free Document Understanding Transformer},
  author    = {Kim, Geewook and Hong, Teakgyu and Yim, Moonbin and Nam, JeongYeon and Park, Jinyoung and Yim, Jinyeong and Hwang, Wonseok and Yun, Sangdoo and Han, Dongyoon and Park, Seunghyun},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2022}
}
```
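The dataset links listed above all follow the pattern `naver-clova-ix/synthdog-<lang>`, so a small helper can map a language code to its repository ID. The following is a minimal sketch, not part of the original card: the helper names `synthdog_repo_id` and `peek` are hypothetical, and it assumes the Hugging Face `datasets` package is installed, that network access is available, and that a split called `train` exists.

```python
"""Sketch: stream a few SynthDoG samples from the Hugging Face Hub."""

# Language codes taken from the dataset list in the card.
SYNTHDOG_LANGUAGES = ("en", "zh", "ja", "ko")


def synthdog_repo_id(lang: str) -> str:
    """Return the Hub repository ID for one SynthDoG language subset."""
    if lang not in SYNTHDOG_LANGUAGES:
        raise ValueError(f"unknown SynthDoG language: {lang!r}")
    return f"naver-clova-ix/synthdog-{lang}"


def peek(lang: str = "en", n: int = 2, split: str = "train"):
    """Stream the first n records without downloading the full dataset.

    Assumption: the `datasets` package is installed and a split named
    `split` exists; neither is stated in the dataset card.
    """
    # Imported lazily so the helper above works without the dependency.
    from datasets import load_dataset

    stream = load_dataset(synthdog_repo_id(lang), split=split, streaming=True)
    return list(stream.take(n))
```

Calling `peek("en")` should return a short list of records as dictionaries; inspecting `record.keys()` is safer than assuming a particular column layout. Streaming is used here because each subset is described as holding about 0.5M samples, so a full download may be large.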

Provider:
sdcfsdsfsdds