NIL
Source: ieee-dataport.org (indexed 2025-03-25)
Download link:
https://ieee-dataport.org/documents/nil
Description:
Videos contain a high volume of text and are broadcast via different sources, such as television and the internet. Since optical character recognition (OCR) engines are script-dependent, script identification must precede them. Depending on the video source, identifying video scripts is not trivial because of issues such as low resolution, complex backgrounds, noise, and blur. In this work, a deep learning-based system named LWSINet (LightWeight Script Identification Network, a 6-layer CNN) is proposed to identify video scripts. For validation, we used the publicly available CVSI-15 dataset. In addition, the effects of three common noise types (salt-and-pepper, Gaussian, and Poisson), along with their hybrid combinations, were considered on the scripts. In our test results, we observed that the proposed CNN is consistent and robust enough to identify scripts both with and without noise. We also evaluated other well-known handcrafted feature-based and deep learning techniques, and the proposed framework obtained better results than these.
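The description mentions salt-and-pepper, Gaussian, and Poisson noise and their hybrid combinations, but gives no parameters or procedure. The following is a minimal NumPy sketch of how such noise could be applied to a grayscale image scaled to [0, 1]. The function names, the default values of amount, sigma, and peak, and the order in which the hybrid chains the three noises are all assumptions made for illustration; they are not taken from the dataset or from LWSINet.

```python
import numpy as np


def add_salt_and_pepper(img, amount=0.05, rng=None):
    """Set roughly `amount` of the pixels to 0 (pepper) or 1 (salt)."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(img, dtype=np.float64)  # float copy of the input
    mask = rng.random(out.shape)
    out[mask < amount / 2] = 0.0           # pepper
    out[mask > 1 - amount / 2] = 1.0       # salt
    return out


def add_gaussian(img, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = np.asarray(img, dtype=np.float64) + rng.normal(0.0, sigma, np.shape(img))
    return np.clip(noisy, 0.0, 1.0)


def add_poisson(img, peak=255.0, rng=None):
    """Resample each pixel from a Poisson distribution (shot noise).

    `peak` is an assumed photon count for a pixel of value 1.0.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = rng.poisson(np.asarray(img, dtype=np.float64) * peak)
    return np.clip(counts / peak, 0.0, 1.0)


def add_hybrid(img, rng=None):
    """Chain all three noises; the order here is an arbitrary assumption."""
    rng = np.random.default_rng() if rng is None else rng
    out = add_gaussian(img, rng=rng)
    out = add_poisson(out, rng=rng)
    return add_salt_and_pepper(out, rng=rng)
```

Passing a seeded generator, for example np.random.default_rng(0), makes the corrupted images reproducible, which helps when comparing classifiers on the same noisy test set.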
Provider:
ieee-dataport.org



