ThatsGroes/syntetisk-dialog-opsummering-raw
Source: Hugging Face. Updated: 2025-01-03. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/ThatsGroes/syntetisk-dialog-opsummering-raw
Description:
This dataset consists of 1,000,000 synthetic Danish dialogues, each paired with a summary. It is intended for fine-tuning small language models to generate dialogue summaries. With minor adjustments it can also be used to train an LLM to restore or improve speaker diarization, to train a classifier that assigns dialogues to topics, or as part of the training data for a Danish embedding model. The dialogues cover nearly 21,000 different topics drawn from two datasets, plus a number of hand-crafted customer service topics.
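As a rough illustration of the summarization use case, the sketch below turns one row into a prompt/target pair for fine-tuning. The column names `dialogue` and `summary`, the split name `train`, and the English prompt wording are all assumptions made for this example and are not confirmed by the listing; check the actual schema before using it. Streaming via the Hugging Face `datasets` library is one possible way to avoid downloading all 1,000,000 rows at once.

```python
from typing import Dict, Iterable, Iterator

# Repository id taken from the page title.
REPO_ID = "ThatsGroes/syntetisk-dialog-opsummering-raw"


def to_training_pair(row: Dict[str, str],
                     dialogue_key: str = "dialogue",
                     summary_key: str = "summary") -> Dict[str, str]:
    """Turn one row into a prompt/target pair for summarization fine-tuning.

    The default column names are hypothetical; pass the real ones if they differ.
    """
    dialogue = row[dialogue_key].strip()
    summary = row[summary_key].strip()
    # Illustrative prompt wording only; a Danish instruction may work better.
    prompt = "Summarize the following dialogue:\n\n" + dialogue + "\n\nSummary:"
    return {"prompt": prompt, "target": summary}


def stream_pairs(rows: Iterable[Dict[str, str]], limit: int) -> Iterator[Dict[str, str]]:
    """Yield at most `limit` training pairs from an iterable of rows."""
    for i, row in enumerate(rows):
        if i >= limit:
            break
        yield to_training_pair(row)


def load_rows(split: str = "train"):
    """Stream rows from the Hub (requires the `datasets` package and network access).

    The split name "train" is an assumption.
    """
    # Imported lazily so the helpers above can be used without the library installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split=split, streaming=True)
```

A typical call would be `list(stream_pairs(load_rows(), limit=100))` to inspect a small sample before committing to a full training run.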
Provided by:
ThatsGroes



