Dhiadev-tn/tunisian-darija-english
Source: Hugging Face. Updated 2026-03-27; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Dhiadev-tn/tunisian-darija-english
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- translation
language:
- ar
- en
tags:
- tunisian-darija
- arabizi
- nlp
- low-resource
- translation
- dialect
pretty_name: 'Tunisian Darija-English Dataset'
size_categories:
- n<1K
---
# Tunisian Darija-to-English Dataset
The first hand-crafted Tunisian Darija parallel dataset.
Built by a native Tunisian speaker.
## About
- 120 sentence pairs
- 12 categories: greetings, farewells, family,
  food & drinks, shopping & money, time & directions,
  emotions & feelings, compliments & insults,
  school & studying, health & illness,
  Tunisian slang, Tunisian proverbs
- Written and validated by a native speaker
- Arabizi format: Latin script, with the digits 3, 7, 9, and 5 standing in for Arabic sounds that have no Latin letter
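
To make the Arabizi convention concrete, here is a minimal Python sketch that counts the digit markers in a sentence. The digit-to-letter mapping follows common Arabizi usage and is an assumption, not something this card specifies; the dataset's own convention may differ. The function name `digit_markers` and the sample words are illustrative and are not taken from the dataset.

```python
from collections import Counter

# Assumed mapping, based on common Arabizi usage (not confirmed by this card).
ARABIZI_DIGITS = {
    "3": "ع",  # ayn
    "7": "ح",  # ha
    "9": "ق",  # qaf
    "5": "خ",  # kha
}


def digit_markers(sentence: str) -> Counter:
    """Count digit markers that occur inside words.

    A token with no letters (for example a bare "5") is treated as an
    ordinary number and skipped.
    """
    counts = Counter()
    for token in sentence.split():
        if not any(ch.isalpha() for ch in token):
            continue  # plain quantity, not a letter substitute
        counts.update(ch for ch in token if ch in ARABIZI_DIGITS)
    return counts


if __name__ == "__main__":
    # Illustrative words only, not drawn from the dataset.
    for digit, count in digit_markers("sa7a 9ahwa").items():
        print(digit, ARABIZI_DIGITS[digit], count)
```

A check like this could help when validating new pairs, for instance to flag sentences that mix digit markers with a different transliteration style. Skipping letter-free tokens is a simple heuristic and will miss edge cases.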
## Why This Exists
No clean Tunisian Darija dataset existed.
This is the first step toward changing that.
Phase 2 will expand to 3,000-5,000 pairs
through field collection across Tunisia.
## Author
Dhia Azizi (@dhiadev-tn)
GitHub: https://github.com/Dhiadev-tn/darija-translator
Provider:
Dhiadev-tn



