French Call Center Transcripts - 140K Words
收藏Snowflake2026-03-23 更新2026-03-27 收录
下载链接:
https://app.snowflake.com/marketplace/listing/GZSYZVBO4S
下载链接
链接失效反馈官方服务:
资源简介:
**Dataset Overview**
Nearly 140K words (750K Chars) of high-quality, human-validated French Agent/Customer conversations captured from real-world, multi-domain call center interactions. The dataset reflects authentic customer service dialogue across industries, featuring natural speech patterns, real problem-solving flows, emotional variability, and domain-specific language that cannot be replicated through synthetic or scripted data.
It is specifically designed for training and fine-tuning large language models, conversational agents, and enterprise automation solutions.
**Our datasets include:**
- Structured multi-turn dialogues
- Speaker-separated conversations (agent and customer)
- Domain-specific business interactions
- Cleaned and normalized text for direct model ingestion
- Compliance-ready data sourcing
**Key AI use cases**
- LLM fine-tuning for customer support copilots
- Conversational AI and voice/chat agents
- Automated ticket classification and routing
- Sentiment and churn prediction models
- Compliance monitoring and quality assurance automation
This is a free, content-limited trial. For access to the full dataset, please reach out to our team.
提供机构:
DATAmundi
创建时间:
2025-12-15
原始信息汇总
French Call Center Transcripts - 140K Words 数据集概述
数据集基本信息
- 数据集名称: French Call Center Transcripts - 140K Words
- 提供商: DATAmundi
- 访问模式: 免费试用 (Free to try)
- 数据产品类型: 静态数据产品 (Static data product)
数据集内容与规模
- 数据量: 近14万单词(75万字符)的高质量、经过人工验证的法语客服/客户对话。
- 数据来源: 采集自真实世界、多领域的呼叫中心互动。
- 数据特征: 包含真实的客户服务对话,涵盖多个行业,具有自然的语音模式、真实的问题解决流程、情感变化以及特定领域的语言,这些无法通过合成或脚本数据复制。
- 数据格式: 结构化多轮对话、说话人分离的对话(客服和客户)、特定领域的业务互动、经过清理和归一化的文本,可直接供模型使用。
- 合规性: 符合合规要求的数据源。
数据表结构 (FR_FR_CALL_CENTRE_TRANSCRIPTIONS)
列名:
- CONTENT
- LANGUAGE
- LOUDNESSLEVEL
- PRIMARYTYPE
- SEGMENTID
- SEGMENTLANGUAGES
- SPEAKERID
- TRANSCRIPT_DOMAIN
- TRANSCRIPT_ID
- TRANSCRIPT_LANGUAGES
- TRANSCRIPT_NAME
- TRANSCRIPT_TOPIC
- START_T (Float 类型)
- END_T (Float 类型)
数据预览示例 (部分字段):
- 语言 (LANGUAGE): fr_FR
- 响度级别 (LOUDNESSLEVEL): Normal
- 主要类型 (PRIMARYTYPE): Speech
- 转录领域 (TRANSCRIPT_DOMAIN): Call-center
- 转录名称 (TRANSCRIPT_NAME): MULTI_SPEAKER_LONG_FORM_TRANSCRIPTION
- 转录主题 (TRANSCRIPT_TOPIC): 例如 Others, Study, Retail
主要AI应用场景
- 用于客户支持副驾驶的LLM微调。
- 对话式AI和语音/聊天代理。
- 自动工单分类和路由。
- 情感和客户流失预测模型。
- 合规监控和质量保证自动化。
设计目的
该数据集专为训练和微调大语言模型、对话代理和企业自动化解决方案而设计。
试用说明
这是一个免费的、内容有限的试用版。要访问完整数据集,请联系DATAmundi团队。
提供商其他相关数据集 (DATAmundi)
- French Canadian Call Center Transcripts - 20K Words
- English AU Call Center Human-Validated Transcripts - 570K words
- Italian Call Center Transcripts - 170K Words
- Portuguese Brazilian Call Center Transcripts - 320K words
- Korean Call Center Transcripts - 170K Chars
- Hindi Call Center Transcripts - 400K Words
类别
- AI & ML
- Data Engineering
- Machine Learning
云区域可用性 (AWS)
- Africa (Cape Town)
- Asia Pacific (Jakarta)
- Asia Pacific (Mumbai)
- Asia Pacific (Osaka)
- (及其他49个区域)
法律条款
- Standard
提供商联系方式
- 销售: sales@datamundi.ai
- 支持: https://datamundi.ai/contact/



