BDRC/KhyentseWangpo
Source: Hugging Face. Updated: 2025-10-09. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/BDRC/KhyentseWangpo
Description:
A line-to-text dataset for Tibetan OCR. The dataset consists of 13,527 rows with three columns: an `id` (string) as a unique identifier for each line, an `image` (image) containing a line of Tibetan text, and a `transcription` (string) of the Tibetan text in Unicode format. Curated by the Buddhist Digital Resource Center, this dataset supports the development of OCR systems for Tibetan texts, particularly for digitizing the corpus of Tibetan Buddhist literature.
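
The three-column layout described above can be sanity-checked with a short script. The following is a minimal sketch, not an official BDRC tool: the helper names (`tibetan_ratio`, `check_row`, `load_lines`), the 0.5 threshold, and the `train` split name are assumptions made for illustration. Only the dataset ID and the column names `id`, `image`, and `transcription` come from the description. Loading requires the third-party `datasets` package and network access, so that import is kept inside `load_lines`.

```python
# Columns named in the dataset description.
EXPECTED_COLUMNS = ("id", "image", "transcription")

# Unicode Tibetan block: U+0F00 through U+0FFF.
TIBETAN_BLOCK = range(0x0F00, 0x1000)


def tibetan_ratio(text):
    """Return the fraction of non-whitespace characters in the Tibetan block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(ord(c) in TIBETAN_BLOCK for c in chars) / len(chars)


def check_row(row, min_ratio=0.5):
    """Return a list of problems found in one row (empty list if none).

    min_ratio is a hypothetical threshold chosen for this sketch.
    The image column is only checked for presence, not decoded.
    """
    problems = [f"missing column: {col}" for col in EXPECTED_COLUMNS if col not in row]
    if problems:
        return problems
    if not isinstance(row["id"], str) or not row["id"]:
        problems.append("id is not a non-empty string")
    text = row["transcription"]
    if not isinstance(text, str):
        problems.append("transcription is not a string")
    elif tibetan_ratio(text) < min_ratio:
        problems.append("transcription has little Tibetan text")
    return problems


def load_lines(split="train"):
    """Load the dataset from the Hugging Face Hub.

    Assumption: the split is called "train"; check the dataset page if not.
    """
    from datasets import load_dataset  # third-party: pip install datasets
    return load_dataset("BDRC/KhyentseWangpo", split=split)


if __name__ == "__main__":
    sample = {"id": "line-1", "image": None, "transcription": "བཀྲ་ཤིས།"}
    print(check_row(sample))
```

Once the dataset is downloaded, the same check could be applied to every row, for example `bad = [r["id"] for r in load_lines() if check_row(r)]`, to list lines whose transcription looks suspicious before training an OCR model.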
Provider:
BDRC



