AI4D - African Language Dataset Challenge

Source: arXiv | Updated: 2020-07-23 | Indexed: 2024-06-21
Download link:
https://zindi.africa/competitions/ai4d-african-language-dataset-challenge
Resource overview:
The AI4D - African Language Dataset Challenge is a dataset creation competition launched by organizations including the International Development Research Centre to promote the creation and curation of African language datasets. The data consists of text in multiple African languages, drawn from digital sources such as news websites and social media, with a large overall volume and token count. The challenge encourages multidisciplinary teams to collaborate so that the resulting data is high in quality and representative. The data is intended primarily to support natural language processing (NLP) research, especially the training of machine learning models for low-resource languages, with the goal of narrowing the gap in how well digital technologies serve these languages.
Provider:
International Development Research Centre
Created:
2020-07-23