Jianshu001/arabic-conversation-v2
收藏Hugging Face2026-04-07 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/Jianshu001/arabic-conversation-v2
下载链接
链接失效反馈官方服务:
资源简介:
---
language:
- ar
task_categories:
- text-generation
tags:
- arabic
- synthetic
- multi-turn
- islamic-finance
- healthcare
- education
- energy
- real-estate
- government-services
size_categories:
- n<1K
---
# Arabic Multi-Domain Conversations v2
20 synthetic multi-turn Arabic conversations across 6 UAE/Middle East domains.
## Domains
| Domain | Arabic | Count |
|--------|--------|-------|
| Islamic Finance | التمويل الإسلامي | 4 |
| Real Estate | العقارات | 4 |
| Energy | الطاقة | 4 |
| Education | التعليم | 3 |
| Government Services | الخدمات الحكومية | 3 |
| Healthcare | الرعاية الصحية | 2 |
## Generation Method
- **User side**: Claude Sonnet 4.6 role-playing diverse personas with natural, human-like prompts
- **Assistant side**: Claude Sonnet 4.6 with controlled tone (neutral, informational, no heavy markdown)
- **Factuality check**: o3
- **Quality check**: truncation detection
## Quality Controls (v2 improvements)
- **No markdown formatting**: 0% of assistant messages contain ## headings (vs ~100% in v1)
- **Neutral tone**: No authoritative/decisive language ("you must", "let me be clear")
- **No emotional persuasion**: No phrases like "your words touched my heart"
- **Hallucination mitigation**: Prompted to acknowledge uncertainty instead of inventing facts
- **Professional disclaimer**: Medical/legal/financial topics include one-time disclaimer
- **No repetitive openers**: No "great question" or repeated "let me be honest"
- **Natural user prompts**: Avg 184 chars (vs 400+ in v1), with colloquial touches
## Stats
- 20 conversations, 81 user messages, 81 assistant messages
- Average user message: 184 chars
- 3-5 turns per conversation
## Format
JSONL with fields: id, domain, domain_ar, topic, topic_ar, subtopic_ar, persona, conversation, metadata, factuality
提供机构:
Jianshu001



