AfnanSD/saudi_eou_dataset
Source: Hugging Face. Updated 2025-12-12. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/AfnanSD/saudi_eou_dataset
Description:
These datasets were generated with ChatGPT for end-of-utterance (EOU) detection. The repository contains three datasets; the final version used is saudi_eou_dataset_flat. Each row holds a Saudi Arabic utterance, its dialect (Najdi, Hijazi, Qassimi, Jizani, Asiri, Haili, etc.), and an EOU label (1 = end of utterance, 0 = continuation). The dataset is designed for training turn-taking models for real-time conversational AI. It has 500 rows, no timestamps, and no duplicated utterances, is automatically labeled, and includes 3 Saudi dialects.
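The properties stated in the description (binary EOU labels, no duplicated utterances) can be checked with a few lines of Python. The following is a minimal sketch, not the author's tooling: the column names `utterance` and `eou` are assumptions, since the description does not give the actual schema, and the rows shown are placeholders rather than real dataset content.

```python
from collections import Counter


def summarize_eou_rows(rows, text_key="utterance", label_key="eou"):
    """Count EOU labels and collect duplicated utterances.

    `text_key` and `label_key` are hypothetical column names; adjust them
    to match the real schema of saudi_eou_dataset_flat.
    """
    label_counts = Counter()
    seen = set()
    duplicates = []
    total = 0
    for row in rows:
        total += 1
        label = int(row[label_key])
        if label not in (0, 1):
            # The description only defines 1 (end of utterance) and 0 (continuation).
            raise ValueError(f"unexpected EOU label: {label!r}")
        label_counts[label] += 1
        text = row[text_key].strip()
        if text in seen:
            duplicates.append(text)
        seen.add(text)
    return {
        "rows": total,
        "end_of_utterance": label_counts[1],
        "continuation": label_counts[0],
        "duplicates": duplicates,
    }


# Placeholder rows for illustration only (not taken from the dataset).
example_rows = [
    {"utterance": "placeholder utterance A", "dialect": "Najdi", "eou": 1},
    {"utterance": "placeholder utterance B", "dialect": "Hijazi", "eou": 0},
    {"utterance": "placeholder utterance C", "dialect": "Najdi", "eou": 1},
]
summary = summarize_eou_rows(example_rows)
print(summary)
```

Rows loaded from the actual repository could be passed in the same way once the real column names are known. For the full dataset, the description implies the summary should report 500 rows and an empty duplicates list.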
Provider:
AfnanSD



