
ThatsGroes/syntetisk-dialog-opsummering-raw

Source: Hugging Face. Updated 2025-01-03; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/ThatsGroes/syntetisk-dialog-opsummering-raw
Description:
This dataset contains 1,000,000 synthetic Danish dialogues, each paired with a summary. It is intended for fine-tuning small language models to generate dialogue summaries. With minor adjustments it can also be used to train an LLM to restore or improve speaker diarization, to train a classifier that assigns dialogues to topics, or as part of the training data for a Danish embedding model. The dialogues cover nearly 21,000 different topics drawn from two datasets, plus a number of hand-crafted customer service topics.
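
As a rough illustration of the summarization use case described above, the sketch below turns one record into a prompt/completion pair and streams the dataset with the Hugging Face `datasets` library. The column names `dialogue` and `summary`, the `train` split, and the Danish prompt wording are assumptions made for illustration; they are not confirmed by this listing, so check the actual dataset schema before relying on them.

```python
from typing import Dict, Iterator

DATASET_ID = "ThatsGroes/syntetisk-dialog-opsummering-raw"


def to_training_pair(
    record: Dict[str, str],
    dialogue_key: str = "dialogue",  # hypothetical column name
    summary_key: str = "summary",    # hypothetical column name
) -> Dict[str, str]:
    """Build a prompt/completion pair for summarization fine-tuning."""
    dialogue = record[dialogue_key].strip()
    summary = record[summary_key].strip()
    # Illustrative Danish instruction: "Summarize the following dialogue".
    prompt = f"Opsummer følgende dialog:\n\n{dialogue}\n\nOpsummering:"
    return {"prompt": prompt, "completion": " " + summary}


def stream_pairs(split: str = "train", **keys: str) -> Iterator[Dict[str, str]]:
    """Stream the dataset and yield training pairs one at a time.

    The split name is an assumption. Streaming avoids downloading
    all 1,000,000 dialogues before the first record is available.
    """
    # Imported lazily so to_training_pair works without the library installed.
    from datasets import load_dataset

    ds = load_dataset(DATASET_ID, split=split, streaming=True)
    for record in ds:
        yield to_training_pair(record, **keys)
```

If the real column names differ, they can be passed through, for example `stream_pairs(dialogue_key="...", summary_key="...")` with the names taken from the dataset card.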
Provided by:
ThatsGroes