maleselalegodi/South-Africa-Presidential-Speeches-Text-and-NLP-Dataset

Source: Hugging Face. Updated: 2024-12-20. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/maleselalegodi/South-Africa-Presidential-Speeches-Text-and-NLP-Dataset
Description:
This dataset contains South African presidential statements in multiple South African languages, spanning February 9, 2012 to October 9, 2023. It is intended for Natural Language Processing (NLP) and machine translation tasks, especially for low-resource languages. The dataset is split into unorganized and organized folders, which hold the statements and their translations in different languages. It also includes annotation files describing the nature of the data, along with the scripts used for data collection and processing. The data is stored as JSON: each file is an object whose keys are unique statement IDs, with translations aligned across languages under each ID, which makes it straightforward to use in NLP tasks.
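As a rough illustration of the JSON layout described above, the following Python sketch pulls aligned sentence pairs for one language pair out of an object keyed by statement ID. The sample data, the language codes (`en`, `zu`), the flat `{language: text}` nesting, and the helper name `parallel_pairs` are all assumptions made for illustration; the actual files may be organized differently, so check the dataset's annotation files before relying on this shape.

```python
import json
from typing import Iterator, Tuple

# Hypothetical sample mirroring the described layout:
# statement ID -> {language code: text}. Placeholder strings, not real data.
SAMPLE = """
{
  "statement_001": {"en": "English text of statement 1",
                    "zu": "isiZulu text of statement 1"},
  "statement_002": {"en": "English text of statement 2"}
}
"""


def parallel_pairs(data: dict, src: str, tgt: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (statement_id, source_text, target_text) for each statement
    that has text in both the source and the target language."""
    for statement_id, translations in data.items():
        if src in translations and tgt in translations:
            yield statement_id, translations[src], translations[tgt]


data = json.loads(SAMPLE)
pairs = list(parallel_pairs(data, "en", "zu"))
print(pairs)
```

Statements missing either language are skipped rather than padded, which keeps the resulting pairs usable as machine translation training examples without further filtering.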
Provider:
maleselalegodi