Automatic Collation for Diversifying Corpora (ACDC) Data
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10546444
Resource description:
ACDC Project, Data:
One of OpenITI’s major deliverables in our most recent round of work is the Automatic Collation for Diversifying Corpora (ACDC) project and the resulting tool, which we are now making available for wider use and experimentation. The relevant code and instructions for installation and use are available on GitHub, and additional data and documentation will be forthcoming. ACDC is one component of our multi-pronged strategy to significantly boost handwritten text recognition for Arabic script as manifested in the manuscript tradition, a tradition of text production that dates from around the time of the Qur’an’s codification down to the present (in some parts of the Islamicate world, manuscript production for various purposes remains very much alive!).
For more information about the project, please use the following links:
Automatic Collation for Diversifying Corpora (ACDC) Tutorial: a video introduction to the project and the ensuing tool, which explains the logic of the project and gives a detailed walk-through of its use
Automatic Collation for Diversifying Corpora: Commonly Copied Texts as Distant Supervision for Handwritten Text Recognition
Introducing the ACDC Project, Part I: Training Data Production and the Diversity of the Islamicate Manuscript Tradition
Created:
2024-01-23



