
Jianshu001/arabic-conversation-v2

Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Jianshu001/arabic-conversation-v2
Description:
---
language:
- ar
task_categories:
- text-generation
tags:
- arabic
- synthetic
- multi-turn
- islamic-finance
- healthcare
- education
- energy
- real-estate
- government-services
size_categories:
- n<1K
---

# Arabic Multi-Domain Conversations v2

20 synthetic multi-turn Arabic conversations across 6 UAE/Middle East domains.

## Domains

| Domain | Arabic | Count |
|--------|--------|-------|
| Islamic Finance | التمويل الإسلامي | 4 |
| Real Estate | العقارات | 4 |
| Energy | الطاقة | 4 |
| Education | التعليم | 3 |
| Government Services | الخدمات الحكومية | 3 |
| Healthcare | الرعاية الصحية | 2 |

## Generation Method

- **User side**: Claude Sonnet 4.6 role-playing diverse personas with natural, human-like prompts
- **Assistant side**: Claude Sonnet 4.6 with controlled tone (neutral, informational, no heavy markdown)
- **Factuality check**: o3
- **Quality check**: truncation detection

## Quality Controls (v2 improvements)

- **No markdown formatting**: 0% of assistant messages contain ## headings (vs ~100% in v1)
- **Neutral tone**: No authoritative/decisive language ("you must", "let me be clear")
- **No emotional persuasion**: No phrases like "your words touched my heart"
- **Hallucination mitigation**: Prompted to acknowledge uncertainty instead of inventing facts
- **Professional disclaimer**: Medical/legal/financial topics include one-time disclaimer
- **No repetitive openers**: No "great question" or repeated "let me be honest"
- **Natural user prompts**: Avg 184 chars (vs 400+ in v1), with colloquial touches

## Stats

- 20 conversations, 81 user messages, 81 assistant messages
- Average user message: 184 chars
- 3-5 turns per conversation

## Format

JSONL with fields: id, domain, domain_ar, topic, topic_ar, subtopic_ar, persona, conversation, metadata, factuality
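
The JSONL layout described under Format can be read with the Python standard library alone. The sketch below is illustrative, not the author's tooling: it parses one JSON object per line, reports which of the listed top-level fields a record lacks, and tallies records per `domain` so the counts can be compared against the Domains table. The helper names (`parse_jsonl`, `missing_fields`, `count_by_domain`, `load_file`) and the file name in the usage note are hypothetical. The card does not document the inner structure of `conversation`, `metadata`, or `factuality`, so the code does not touch them.

```python
import json
from collections import Counter
from typing import Iterable

# Top-level field names exactly as listed in the card's Format section.
EXPECTED_FIELDS = [
    "id", "domain", "domain_ar", "topic", "topic_ar",
    "subtopic_ar", "persona", "conversation", "metadata", "factuality",
]


def parse_jsonl(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def missing_fields(record: dict) -> list[str]:
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def count_by_domain(records: Iterable[dict]) -> Counter:
    """Tally records by their 'domain' value."""
    return Counter(record.get("domain", "<missing>") for record in records)


def load_file(path: str) -> list[dict]:
    """Read a local JSONL file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return parse_jsonl(handle)
```

Assuming the data file has been downloaded locally under a name such as `data.jsonl` (the actual file name in the repository is not stated in the card), `count_by_domain(load_file("data.jsonl"))` would give per-domain totals to check against the table, and `missing_fields(record)` flags any record that deviates from the documented schema.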
Provider:
Jianshu001