HausaNLP/Naija-Lex
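Naija-Lexicons Dataset Overview
Dataset Description
- Project affiliation: Naija-Lexicons is part of the Naija-Senti project.
- Content: stopword lists collected from the four most widely spoken languages in Nigeria: Hausa, Igbo, Nigerian Pidgin, and Yoruba.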
Dataset Details
Languages
- Hausa (hau)
- Igbo (ibo)
- Yoruba (yor)
Data Structure
Data Instances
- Format: for each language, a list of lexical entries with their sentiment labels.
- Example structure:

```json
{ "word": "string", "label": "string" }
```
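To make the instance structure concrete, the following is a minimal sketch of how a record with the two string fields `word` and `label` might be validated before use. The names `LexiconEntry` and `parse_entry` are hypothetical helpers introduced here, and the sample record is a placeholder rather than real dataset content; the card does not state which label values occur, so `"positive"` is only an assumed example.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class LexiconEntry:
    """One lexicon record: a word and its sentiment label."""
    word: str
    label: str


def parse_entry(record: Mapping[str, Any]) -> LexiconEntry:
    """Validate a raw record against the {"word", "label"} structure."""
    for field in ("word", "label"):
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"record needs a non-empty string {field!r}: {record!r}")
    return LexiconEntry(word=record["word"].strip(), label=record["label"].strip())


# Hypothetical record for illustration only; not taken from the dataset.
example = parse_entry({"word": "example_word", "label": "positive"})
print(example)
```

Rejecting malformed records early keeps downstream lookups from failing on missing keys.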
Usage
- Loading the dataset:

```python
from datasets import load_dataset

# "hau" selects the Hausa configuration.
ds = load_dataset("HausaNLP/Naija-Lexicons", "hau")
```
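Once a configuration is loaded, a common next step is to turn the records into a word-to-label lookup. The sketch below assumes each row carries the `word` and `label` fields described in this card; `build_lexicon`, `label_counts`, and `load_lexicon` are hypothetical helper names, lowercasing words is a design assumption rather than something the dataset prescribes, and the sample rows are placeholders, not real entries.

```python
from collections import Counter
from typing import Dict, Iterable, List, Mapping


def build_lexicon(records: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map each lowercased word to its label; the first occurrence wins."""
    lexicon: Dict[str, str] = {}
    for record in records:
        lexicon.setdefault(record["word"].lower(), record["label"])
    return lexicon


def label_counts(tokens: List[str], lexicon: Mapping[str, str]) -> Counter:
    """Count how many tokens fall under each label in the lexicon."""
    return Counter(lexicon[t.lower()] for t in tokens if t.lower() in lexicon)


def load_lexicon(config: str = "hau") -> Dict[str, str]:
    """Download one language configuration and flatten all its splits."""
    from datasets import load_dataset  # imported lazily; requires network access

    ds = load_dataset("HausaNLP/Naija-Lexicons", config)
    # load_dataset without `split=` returns a DatasetDict of named splits.
    return build_lexicon(row for split in ds.values() for row in split)


# Hypothetical rows standing in for real dataset records.
rows = [
    {"word": "alpha", "label": "positive"},
    {"word": "beta", "label": "negative"},
]
lexicon = build_lexicon(rows)
counts = label_counts(["Alpha", "gamma", "beta", "alpha"], lexicon)
print(counts)
```

Calling `load_lexicon("hau")` would apply the same logic to the downloaded Hausa data; iterating over every split avoids assuming a particular split name.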
Additional Information
Dataset License
- License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)
Citation

```bibtex
@inproceedings{muhammad-etal-2022-naijasenti,
    title = "{N}aija{S}enti: A {N}igerian {T}witter Sentiment Corpus for Multilingual Sentiment Analysis",
    author = "Muhammad, Shamsuddeen Hassan and
      Adelani, David Ifeoluwa and
      Ruder, Sebastian and
      Ahmad, Ibrahim Sa{'}id and
      Abdulmumin, Idris and
      Bello, Bello Shehu and
      Choudhury, Monojit and
      Emezue, Chris Chinenye and
      Abdullahi, Saheed Salahudeen and
      Aremu, Anuoluwapo and
      Jorge, Al{\'\i}pio and
      Brazdil, Pavel",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.63",
    pages = "590--602",
}
```




