cazonai/autoradio-destilado-v2
收藏Hugging Face2026-04-23 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/cazonai/autoradio-destilado-v2
下载链接
链接失效反馈官方服务:
资源简介:
Elyra 0.2 — Dataset Destilado是一个用于文本生成和对话任务的数据集,包含原始样本、多教师合成样本、自玩对话样本和单教师样本,总计21,314个唯一样本。数据来源于364个巴西电台的真实转录,每个转录生成了约20个问答对。数据格式为ChatML,包含系统提示、用户内容和助手内容。数据集用于训练Gemma 3 4B模型。
Elyra 0.2 — Dataset Destilado is a dataset for text generation and conversational tasks, containing original samples, multi-teacher synthetic samples, self-play conversation samples, and single-teacher samples, totaling 21,314 unique samples. The data is derived from 364 real transcriptions of Brazilian radio stations, with each transcription generating approximately 20 Q/A pairs. The data format is ChatML, including system prompts, user content, and assistant content. The dataset is used to train the Gemma 3 4B model.
提供机构:
cazonai



