
Automatic Collation for Diversifying Corpora (ACDC) Data

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/10546444
Description:
ACDC Project, Data: One of OpenITI’s major deliverables in our most recent round of work is the Automatic Collation for Diversifying Corpora (ACDC) project and the ensuing tool, which we are now making available for wider use and experimentation. The relevant code and instructions for installation and use are available on GitHub, and additional data and documentation will be forthcoming.

ACDC is one component in our multi-pronged strategy to significantly boost handwritten text recognition for Arabic script as manifest within the manuscript tradition, a tradition of text production that dates from around the time of the Qur’an’s codification all the way down to the present (in some parts of the Islamicate world, manuscript production for various purposes remains very much alive!).

For more information about the project, see the following resources:

Automatic Collation for Diversifying Corpora (ACDC) Tutorial: a video introduction to the project and the ensuing tool, which explains the logic of the project and gives a detailed walk-through of its use.
Automatic Collation for Diversifying Corpora: Commonly Copied Texts as Distant Supervision for Handwritten Text Recognition
Introducing the ACDC Project, Part I: Training Data Production and the Diversity of the Islamicate Manuscript Tradition
Created:
2024-01-23