
uhhlt/amharic-stopwords

Hugging Face | Updated: 2024-08-01 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/uhhlt/amharic-stopwords
Description:

---
dataset_info:
  features:
  - name: text
    dtype: string
license: apache-2.0
language:
- am
pretty_name: Amharic Stopwords
size_categories:
- 1K<n<10K
---

# Amharic Stopwords

The stopwords were built by [Yimam et al. (2021)](https://www.mdpi.com/1999-5903/13/11/275) at [LT Group](https://huggingface.co/uhhlt), University of Hamburg, Germany. Initially, they were generated using an automated approach based on frequency, followed by manual validation.

### Source

- **GitHub:** https://github.com/uhh-lt/ethiopicmodels
- **Dataset:** https://github.com/uhh-lt/ethiopicmodels/blob/master/am/normalization/amstopwords.txt
- **Paper:** https://www.mdpi.com/1999-5903/13/11/275

To cite these stopwords, please use the following:

```
@Article{fi13110275,
  AUTHOR = {Yimam, Seid Muhie and Ayele, Abinew Ali and Venkatesh, Gopalakrishnan and Gashaw, Ibrahim and Biemann, Chris},
  TITLE = {Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets},
  JOURNAL = {Future Internet},
  VOLUME = {13},
  YEAR = {2021},
  NUMBER = {11},
  ARTICLE-NUMBER = {275},
  URL = {https://www.mdpi.com/1999-5903/13/11/275},
  ISSN = {1999-5903},
  DOI = {10.3390/fi13110275}
}
```
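
### Example usage

The card does not include usage code, so the following is only a minimal sketch of how a stopword list like this one might be applied to tokenized Amharic text. The helper names (`build_stopword_set`, `remove_stopwords`, `load_stopwords_from_hub`) are hypothetical, the sample entries are illustrative stand-ins rather than entries verified against the dataset, and the `train` split name is an assumption. Only the `text` feature name comes from the dataset metadata.

```python
from typing import Iterable, List, Set


def build_stopword_set(entries: Iterable[str]) -> Set[str]:
    """Strip surrounding whitespace and drop empty entries."""
    return {e.strip() for e in entries if e.strip()}


def remove_stopwords(tokens: Iterable[str], stopwords: Set[str]) -> List[str]:
    """Return the tokens that are not in the stopword set, in order."""
    return [t for t in tokens if t not in stopwords]


def load_stopwords_from_hub() -> Set[str]:
    """Load the list from the Hugging Face Hub (not called below).

    Assumption: the dataset exposes a single "train" split; the "text"
    column name is taken from the dataset card metadata.
    """
    from datasets import load_dataset  # imported lazily; needs `pip install datasets`

    ds = load_dataset("uhhlt/amharic-stopwords", split="train")
    return build_stopword_set(ds["text"])


# Illustrative stand-in entries, NOT verified against the actual dataset.
sample_stopwords = build_stopword_set(["እና", "ነው ", "", "ወደ"])

# Whitespace tokenization is a simplification chosen for this sketch.
tokens = "እሱ ወደ ትምህርት ቤት ሄደ".split()
filtered = remove_stopwords(tokens, sample_stopwords)
print(filtered)
```

A set is used for the lookup so that each membership check stays constant-time regardless of the list size. For real text, a proper Amharic tokenizer and normalizer would likely be needed before filtering.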
Provider:
uhhlt