Beyond Digital Boundaries: Examining Record Keeping Practices Through Handwritten Annotations and Strikeouts, 2006-2023

Source: DataCite Commons. Updated: 2025-09-12. Indexed: 2026-05-06.
Download link:
http://reshare.ukdataservice.ac.uk/id/eprint/857993
Description:
The Malayalam Handwritten Script Dataset with Strikeouts is a collection of handwritten Malayalam scripts covering diverse handwriting styles, strikeout marks, corrections, and editorial annotations. The strikeout marks include single and double strikethroughs, underlined corrections, and other forms of annotation; the corrections are made by overwriting, insertion, and deletion. The dataset supports three kinds of work: developing Optical Character Recognition (OCR) systems that recognize and interpret handwritten Malayalam text containing corrections and annotations; improving Natural Language Processing (NLP) models for noisy or annotated text; and manuscript analysis, including the study of manuscript variations and of editorial practices in the writing and editing of Malayalam texts. Researchers can use it both to develop and to evaluate such systems.
Provider:
UK Data Service
Created:
2025-09-12