vrclc/english-malayalam-names-clean
Hugging Face | Updated: 2024-12-17 | Indexed: 2024-12-21
Download link:
https://hf-mirror.com/datasets/vrclc/english-malayalam-names-clean
Description:
This dataset contains 27,787,044 person names in both English and Malayalam. The names were sourced from various electoral rolls published by the Government. Main uses include English-Malayalam name transliteration, named entity recognition, and person name recognition. Each record has two features: ml (the name in Malayalam) and en (the name in English). The dataset is licensed under CC-BY-SA-4.0.
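Usage sketch:
The two-feature layout (ml, en) maps directly onto source/target pairs for transliteration. The following is a minimal sketch, not an official loader: the helper names `to_pairs` and `load_records` are hypothetical, the split name "train" is an assumption that should be checked against the dataset card, and the sample rows are made up for illustration rather than taken from the dataset.

```python
from typing import Iterable, Iterator, Mapping, Tuple


def to_pairs(records: Iterable[Mapping[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield (English, Malayalam) name pairs, skipping incomplete rows."""
    for row in records:
        en = (row.get("en") or "").strip()
        ml = (row.get("ml") or "").strip()
        if en and ml:
            yield en, ml


def load_records(split: str = "train"):
    """Load the dataset from the Hugging Face Hub (needs network access).

    Assumption: the split is called "train"; verify on the dataset card.
    """
    # Imported lazily so the rest of the sketch runs without the package.
    from datasets import load_dataset
    return load_dataset("vrclc/english-malayalam-names-clean", split=split)


# Illustrative rows only: invented for this sketch, not drawn from the dataset.
sample = [
    {"en": "Anil", "ml": "അനിൽ"},
    {"en": "Latha ", "ml": "ലത"},
    {"en": "", "ml": "രാജു"},  # dropped: the English side is empty
]
pairs = list(to_pairs(sample))
```

With network access and the `datasets` package installed, `to_pairs(load_records())` would stream pairs from the real data in the same way, provided the assumed split name is correct.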
Provider:
vrclc



