five

Zyroxx66/Somali-Somlish-Instruct-2K-Dataset

收藏
Hugging Face2026-04-06 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/Zyroxx66/Somali-Somlish-Instruct-2K-Dataset
下载链接
链接失效反馈
官方服务:
资源简介:
--- language: - so - en license: apache-2.0 size_categories: - 1K<n<10K task_categories: - text-generation - question-answering tags: - somali - somlish - instruct - tech - discord --- # Somlish-Tech-Instruct-2K This is the first-of-its-kind **Somlish** (Somali + English) instruction-tuning dataset. It contains 2,312 rows of high-quality synthetic data generated to teach AI models how to speak like a modern Somali tech enthusiast. ## 🌟 Why this exists Standard Somali datasets are often too formal. This dataset uses natural "Discord-style" slang (Niyo, Sxb, Bro) while maintaining English technical terms (API, GPU, React) to ensure the AI stays smart and logical. ## 📊 Dataset Structure Each row follows the standard Instruction/Output format: - **Instruction:** A technical question or request in Somlish or English. - **Output:** A helpful, brotherly response in Somlish. ## 🚀 Recommended Use Perfect for fine-tuning small models (2B - 7B) like Qwen, Gemma, and Phi-3 to create Somali-language chatbots. Created by: **Zyroxx66**
提供机构:
Zyroxx66
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作