AI4D - African Language Dataset Challenge
Source: arXiv. Updated: 2020-07-23. Indexed: 2024-06-21.
Download link:
https://zindi.africa/competitions/ai4d-african-language-dataset-challenge
Resource overview:
The AI4D - African Language Dataset Challenge is a dataset creation competition launched by the International Development Research Centre and other institutions to promote the creation and curation of African language datasets. The data consists of text in multiple African languages, drawn from digital sources such as news websites and social media, and is described as large in both volume and token count. The challenge encourages multidisciplinary teams to collaborate so that the data is high in quality and representative. The dataset is intended mainly to support natural language processing (NLP) research, in particular the training of machine learning models for low-resource languages, with the goal of narrowing the gap in how digital technologies serve these languages.
Provided by:
International Development Research Centre
Created:
2020-07-23



