Jianshu001/arabic-daily-smoke-v3-50
Source: Hugging Face. Updated: 2026-04-24. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Jianshu001/arabic-daily-smoke-v3-50
Description:
The "Smoke 50 — v3 pipeline (chat-style + thinking cleanup)" dataset contains 50 records generated by a new pipeline with three stages: generation with Gemma using the v3 chat-style system prompts, a Gemma-as-rewriter pass that cleans up every assistant thinking segment, and basic cleanup (no emojis in user text, no audit fields). The dataset is in Arabic (ar), is licensed under MIT, carries the tags arabic, chat-style, and gemma-4, and falls in the n<1K size category.
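The basic cleanup rules in the description (no emojis in user text, no audit fields) can be spot-checked with a small validator. The following is a minimal sketch, not the dataset author's code: the record layout (a `messages` list of `role`/`content` dicts) and the audit key names are assumptions, since the listing does not document the schema, and the emoji pattern covers only the main emoji blocks.

```python
import re

# Rough emoji matcher: covers the main emoji blocks only, not every pictograph.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]"
)

# Hypothetical audit-style key names; the listing does not name the real ones.
AUDIT_KEYS = {"audit", "audit_log", "audit_score"}


def check_record(record):
    """Return a list of problems found in one record (empty list if clean).

    Assumes a hypothetical layout: {"messages": [{"role": ..., "content": ...}]}.
    """
    problems = []
    # Rule 1: no audit fields at the top level of the record.
    for key in record:
        if key in AUDIT_KEYS:
            problems.append(f"audit field present: {key}")
    # Rule 2: no emojis in user text (assistant text is not checked).
    for i, msg in enumerate(record.get("messages", [])):
        if msg.get("role") == "user" and EMOJI_RE.search(msg.get("content", "")):
            problems.append(f"emoji in user message {i}")
    return problems
```

Running `check_record` over all 50 records and collecting the non-empty results would give a quick smoke test of the cleanup stage, once the field names are adjusted to the actual schema.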
Provider:
Jianshu001



