Nexdata/5147_Images_Japanese_Handwriting_OCR_data
Source: Hugging Face. Updated 2024-04-16; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/Nexdata/5147_Images_Japanese_Handwriting_OCR_data
Resource description:
---
license: cc-by-nc-nd-4.0
---
## Description
5,147 images of Japanese handwriting for OCR. The text carriers are A4 paper, lined paper, quadrille paper, etc. The images were captured with cellphones at an eye-level angle. The dataset content includes Japanese compositions, poetry, prose, news, stories, etc. The annotations consist of line-level quadrilateral bounding boxes and transcriptions of the text. The dataset can be used for tasks such as Japanese handwriting OCR.
For more details, please refer to the link: https://www.nexdata.ai/dataset/1296?source=Huggingface
## Data size
5,147 images
## Population distribution
gender distribution: 244 males, 304 females; age distribution: 2 people under 18 years old, 494 people aged from 18 to 45 years old, 50 people aged from 46 to 60, 2 people over 60 years old; nationality distribution: Japan
## Collecting environment
A4 paper, lined paper, quadrille paper, etc.
## Device
cellphone
## Photographic angle
eye-level angle
## Data format
the image data format is .jpg; the annotation file format is .json
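
The card gives only the two file extensions, not the directory layout. As a minimal sketch, assuming each `.jpg` has a `.json` annotation with the same file stem somewhere under one root folder (an assumption, not something the card confirms), the pairs could be collected like this. The function name `pair_images_with_annotations` is hypothetical.

```python
from pathlib import Path


def pair_images_with_annotations(root):
    """Return (image_path, annotation_path) pairs that share a file stem.

    Assumption: every .jpg is annotated by a .json with the same stem
    located somewhere under `root`. Images without a match are skipped.
    """
    root = Path(root)
    # Index annotation files by stem for constant-time lookup.
    annotations = {p.stem: p for p in root.rglob("*.json")}
    pairs = []
    for image in sorted(root.rglob("*.jpg")):
        annotation = annotations.get(image.stem)
        if annotation is not None:
            pairs.append((image, annotation))
    return pairs
```

If the delivered data uses a different naming scheme, only the stem-matching rule needs to change.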
## Data content
including Japanese composition, poetry, prose, news, stories, etc.
## Annotation content
line-level quadrilateral bounding box annotation and transcription for the texts
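
The card does not publish the JSON schema, so the following is only a sketch of how line-level quadrilateral annotations might be consumed. It assumes a hypothetical layout `{"lines": [{"points": [[x, y], ...4 vertices], "text": "..."}]}`; the keys `lines`, `points`, and `text` are illustrative and will likely differ in the real files.

```python
import json


def load_lines(annotation_json):
    """Parse annotation text into a list of (points, text) tuples.

    Hypothetical schema (not confirmed by the dataset card):
    {"lines": [{"points": [[x, y], [x, y], [x, y], [x, y]], "text": str}]}
    """
    record = json.loads(annotation_json)
    lines = []
    for entry in record["lines"]:
        points = [(float(x), float(y)) for x, y in entry["points"]]
        if len(points) != 4:
            raise ValueError("expected a quadrilateral with 4 vertices")
        lines.append((points, entry["text"]))
    return lines


def quad_area(points):
    """Area of a simple polygon, computed with the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def quad_to_bbox(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing the quadrilateral."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)
```

Keeping the four vertices rather than collapsing them to an axis-aligned box preserves the slant of handwritten lines; `quad_to_bbox` is useful only when a downstream recognizer expects rectangular crops.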
## Accuracy
the collection content accuracy is not less than 97%; the text transcription accuracy is not less than 97%
# Licensing Information
Commercial License
Provider:
Nexdata
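Summary of original information
Dataset overview
Basic information
- Dataset name: Japanese Handwriting OCR Dataset
- Data size: 5,147 images
- License: CC-BY-NC-ND-4.0
Data content
- Content types: Japanese composition, poetry, prose, news, stories, etc.
- Text carrier: A4 paper, lined paper, quadrille paper, etc.
- Device: cellphone
- Photographic angle: eye-level angle
Data format
- Image format: .jpg
- Annotation file format: .json
Annotation content
- Annotation type: line-level quadrilateral bounding boxes and text transcription
- Accuracy: collection content accuracy not less than 97%; text transcription accuracy not less than 97%
Population distribution
- Gender distribution: 244 males, 304 females
- Age distribution: 2 people under 18, 494 aged 18 to 45, 50 aged 46 to 60, 2 over 60
- Nationality: Japan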



