Hindi Verse Dataset
Source: Mendeley Data. Indexed 2026-04-09.
Download link:
https://data.mendeley.com/datasets/cp6htsbbpp
Description:
This dataset was constructed by collecting numerous Hindi verses. Between December 2017 and January 2021, 3330 text entries encoded in UTF-8 (Unicode Transformation Format) were collected and stored in tab-separated value (TSV) files. The dataset is divided into two sections: the first holds the raw data, and the second holds the analyzed data, which is categorized by an automatic metadata generator based on Hindi verse-writing norms. The raw data and the analyzed data are stored in separate folders. The readme.txt file contains additional information on file naming conventions.
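Because the files are described as UTF-8 encoded TSV stored in separate folders, a minimal Python sketch for loading them might look like the following. The `.tsv` file extension, the folder path passed in, and the function names `read_tsv_rows` and `read_folder` are assumptions for illustration; the column layout is not stated here, so rows are returned as plain lists of strings. The readme.txt in the dataset should be consulted for the actual file naming conventions.

```python
import csv
from pathlib import Path


def read_tsv_rows(path):
    """Return every row of one UTF-8 TSV file as a list of string fields."""
    with open(path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps any quote characters in the verse text as literal characters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]


def read_folder(folder):
    """Map each .tsv file under `folder` (searched recursively) to its rows.

    Keys are paths relative to `folder`, so files with the same name in the
    raw and analyzed folders stay distinct. The .tsv extension is assumed.
    """
    root = Path(folder)
    return {
        p.relative_to(root).as_posix(): read_tsv_rows(p)
        for p in sorted(root.rglob("*.tsv"))
    }
```

Calling `read_folder` on the directory where the archive was extracted would return a dictionary keyed by relative file path, which keeps the raw and analyzed sections separate without assuming their folder names.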
Provided by:
Dr Milind Audichya



