mangesh-ux/logistics-cx-transcript-analysis-chatml
Source: Hugging Face. Updated 2026-03-28; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/mangesh-ux/logistics-cx-transcript-analysis-chatml
Resource description:
---
pretty_name: OmniCX Logistics CX Dataset
language:
- en
license: cc-by-4.0
task_categories:
- text-generation
- text-classification
task_ids:
- text2text-generation
size_categories:
- n<1K
---
# OmniCX Logistics CX Dataset (Research Preview)
## Table of Contents
- [Dataset Summary](#dataset-summary)
- [Supported Tasks](#supported-tasks)
- [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Schema Highlights](#schema-highlights)
- [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Source Data](#source-data)
  - [Knowledge-Source Derivation (Important)](#knowledge-source-derivation-important)
  - [Annotation Process](#annotation-process)
  - [Quality Control](#quality-control)
- [Limitations](#limitations)
- [Bias, Risks, and Safety](#bias-risks-and-safety)
- [Recommended Uses](#recommended-uses)
- [Out-of-Scope Uses](#out-of-scope-uses)
- [Licensing](#licensing)
- [Citation](#citation)
## Dataset Summary
This dataset is designed for structured extraction of logistics and customer-experience (CX) signals from multi-turn support conversations.
Each record uses ChatML-style messages with:
- a fixed `system` instruction
- a `user` transcript
- an `assistant` JSON payload matching `LogisticsCXMetrics`
This release is a **research preview** and should not be treated as a production-certified benchmark.
Project repository: [OmniCX-Extractor](https://github.com/mangesh-ux/OmniCX-Extractor)
### Taxonomy Summary
`LogisticsCXMetrics` contains three top-level groups:
- **`behavioral_analytics`**: intent, effort (`1-5` CES-like rubric), sentiment trajectory, rework frequency, and friction evidence quotes.
- **`operational_analytics`**: exception diagnosis, controlled exception/root-cause categories, deterministic boolean flags, and resolution tracking.
- **`diagnostic_reasoning`**: auditable reasoning fields (`intent_reasoning`, `exception_reasoning`, `effort_reasoning`) plus routing recommendation.
Core controlled vocabularies include customer intent families, rework bands (`0`, `1`, `2+`), sentiment trajectory (`Improved`, `Worsened`, `Unchanged`), and root-cause families.
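The controlled vocabularies above map naturally onto enums. The following is a minimal sketch, not the project's actual schema: the class names `SentimentTrajectory` and `ReworkBand` and the helper `is_valid_effort` are hypothetical, and treating effort as an integer is an assumption. The authoritative definitions live in `src/schema.py` and may differ.

```python
from enum import Enum


class SentimentTrajectory(str, Enum):
    """Hypothetical enum mirroring the sentiment trajectory values listed in the card."""
    IMPROVED = "Improved"
    WORSENED = "Worsened"
    UNCHANGED = "Unchanged"


class ReworkBand(str, Enum):
    """Hypothetical enum mirroring the rework bands listed in the card."""
    NONE = "0"
    ONCE = "1"
    TWO_OR_MORE = "2+"


def is_valid_effort(score) -> bool:
    """Accept only integers 1..5 (assumes the CES-like rubric is integer-valued)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    return isinstance(score, int) and not isinstance(score, bool) and 1 <= score <= 5
```

Constructing an enum from a label, for example `SentimentTrajectory("Improved")`, raises `ValueError` for any value outside the vocabulary, which makes out-of-vocabulary labels easy to catch during validation.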
## Supported Tasks
- Structured information extraction from support transcripts
- Multi-label analytics extraction
- Schema-constrained generation
## Languages
- English (`en`)
## Dataset Structure
### Data Instances
Each row is one JSON object:
```json
{
  "messages": [
    {"role": "system", "content": "You are a SOTA Logistics AI. Extract the exact logistics and CX metrics from the following transcript."},
    {"role": "user", "content": "Agent: ... Customer: ..."},
    {"role": "assistant", "content": "{\"behavioral_analytics\": {...}, \"operational_analytics\": {...}, \"diagnostic_reasoning\": {...}}"}
  ]
}
```
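Because the assistant payload is itself a JSON string nested inside the record, reading a row takes two decoding steps. The sketch below shows one way to do this with the standard library; the helper name `split_record` is hypothetical, and the example line uses empty objects as placeholders for the three sections rather than real labels.

```python
import json


def split_record(line: str) -> tuple[str, str, dict]:
    """Decode one JSONL line into (system prompt, transcript, label dict)."""
    record = json.loads(line)  # first decode: the row itself
    by_role = {m["role"]: m["content"] for m in record["messages"]}
    labels = json.loads(by_role["assistant"])  # second decode: the nested payload
    return by_role["system"], by_role["user"], labels


# Placeholder row: section contents are left empty for illustration only.
example_line = json.dumps({
    "messages": [
        {"role": "system", "content": "You are a SOTA Logistics AI."},
        {"role": "user", "content": "Agent: ... Customer: ..."},
        {"role": "assistant", "content": json.dumps({
            "behavioral_analytics": {},
            "operational_analytics": {},
            "diagnostic_reasoning": {},
        })},
    ]
})

system_prompt, transcript, labels = split_record(example_line)
print(sorted(labels))
```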
### Schema Highlights
The assistant JSON contains three required sections:
- `behavioral_analytics`
- `operational_analytics`
- `diagnostic_reasoning`
Field definitions and enums are implemented in `src/schema.py`.
Detailed taxonomy/rubric reference:
- [`docs/taxonomy.md`](https://github.com/mangesh-ux/OmniCX-Extractor/blob/main/docs/taxonomy.md)
## Data Splits
Current repository artifacts include:
- training JSONL in `data/processed/`
- evaluation JSONL in `data/eval/`
For Hugging Face release, publish explicit split files:
- `train.jsonl`
- `validation.jsonl` (optional but recommended)
- `test.jsonl`
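One way to produce these split files reproducibly is to hash each transcript and assign it to a bucket, so a given conversation always lands in the same split. This is a sketch under stated assumptions: the 80/10/10 ratio and the helper names `assign_split`, `split_records`, and `write_jsonl` are hypothetical and are not taken from the repository.

```python
import hashlib
import json


def assign_split(record: dict, val_pct: int = 10, test_pct: int = 10) -> str:
    """Map a record to a split name based on a stable hash of its transcript."""
    transcript = next(m["content"] for m in record["messages"] if m["role"] == "user")
    digest = hashlib.sha256(transcript.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


def split_records(records: list) -> dict:
    """Group records by assigned split."""
    splits = {"train": [], "validation": [], "test": []}
    for record in records:
        splits[assign_split(record)].append(record)
    return splits


def write_jsonl(path: str, records: list) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Hashing the transcript instead of shuffling by index also keeps exact duplicate conversations inside a single split, which avoids one simple form of train/test leakage.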
## Dataset Creation
### Source Data
This project primarily uses synthetic logistics support conversations and synthetic labels generated through controlled prompting and schema validation.
Synthetic generation models used in this repository:
- Transcript generation: `gpt-4o-mini` (`src/data_factory.py`)
- Schema-constrained label extraction: `gpt-4o-mini` (`src/extractor.py`)
### Knowledge-Source Derivation (Important)
The output structure and taxonomy are derived from curated reference material in `docs/knowledge/`, including:
- `Transcript-Only CX Difficulty Score_ Standards, Methods, and a Rigorous MVP Design.pdf`
  Deep-research document (ChatGPT-generated) focused on transcript-only CX friction and effort signals, including rework, escalation cues, sentiment volatility, unresolved follow-up markers, and rubric design for difficulty/effort estimation.
- `Logistics CX Data Schema Development.docx`
  NotebookLM-assisted research and design artifact focused on logistics intent taxonomy and schema structuring, used to refine intent families, enum boundaries, and extraction-ready field definitions.
Field definitions, enum choices, and diagnostic categories in assistant JSON are grounded in these source documents and enforced through `LogisticsCXMetrics` validation (`src/schema.py`).
### Annotation Process
Labels are represented as structured JSON targeting the `LogisticsCXMetrics` schema and include behavioral, operational, and reasoning components.
### Quality Control
- format validation for JSONL integrity
- required-key checks for schema completeness
- parseability checks for assistant JSON content
- iterative cleanup scripts for malformed examples
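The first three checks can be combined into a single per-line validator. The sketch below is illustrative only and is not the repository's cleanup script: the function name `check_line` and the error messages are hypothetical, and it assumes each row holds exactly one `system`, one `user`, and one `assistant` message in that order.

```python
import json

REQUIRED_SECTIONS = ("behavioral_analytics", "operational_analytics", "diagnostic_reasoning")
EXPECTED_ROLES = ["system", "user", "assistant"]


def check_line(line: str) -> list:
    """Return a list of problems for one JSONL line; an empty list means it passed."""
    # 1. JSONL integrity: the line must be a JSON object with a messages list.
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return ["line is not valid JSON"]
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list):
        return ["missing 'messages' list"]
    roles = [m.get("role") for m in messages if isinstance(m, dict)]
    if roles != EXPECTED_ROLES:
        return [f"unexpected roles: {roles}"]
    # 2. Parseability: the assistant content must itself decode as JSON.
    try:
        payload = json.loads(messages[2]["content"])
    except (KeyError, TypeError, json.JSONDecodeError):
        return ["assistant content is not parseable JSON"]
    if not isinstance(payload, dict):
        return ["assistant payload is not a JSON object"]
    # 3. Schema completeness: all three top-level sections must be present.
    return [f"missing section: {key}" for key in REQUIRED_SECTIONS if key not in payload]
```

Returning a list of problems instead of raising lets a cleanup pass log every malformed row before deciding whether to repair or drop it.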
## Limitations
- Small dataset size in current iteration
- Distribution mismatch risk versus real support logs
- Strict exact-match scoring may understate semantically correct outputs
- Not calibrated for legal/compliance decisions
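The exact-match limitation can be made concrete by comparing whole-payload equality with a per-field score. This is a sketch only: the helpers `flatten` and `field_accuracy` are hypothetical, and the keys `effort` and `sentiment` in the example are illustrative stand-ins, not confirmed schema field names.

```python
_MISSING = object()  # sentinel so a missing field never equals a gold value of None


def flatten(obj, prefix=""):
    """Flatten nested dicts into {dotted.path: leaf_value}."""
    if isinstance(obj, dict):
        flat = {}
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
        return flat
    return {prefix.rstrip("."): obj}


def exact_match(pred: dict, gold: dict) -> bool:
    """Strict scoring: the whole payload must be identical."""
    return pred == gold


def field_accuracy(pred: dict, gold: dict) -> float:
    """Fraction of gold leaf fields that the prediction reproduces exactly."""
    gold_flat, pred_flat = flatten(gold), flatten(pred)
    if not gold_flat:
        return 1.0
    hits = sum(1 for path, value in gold_flat.items()
               if pred_flat.get(path, _MISSING) == value)
    return hits / len(gold_flat)


gold = {
    "behavioral_analytics": {"effort": 4, "sentiment": "Worsened"},
    "diagnostic_reasoning": {"effort_reasoning": "Customer repeated the tracking number twice."},
}
pred = {
    "behavioral_analytics": {"effort": 4, "sentiment": "Worsened"},
    "diagnostic_reasoning": {"effort_reasoning": "The customer had to restate the tracking number."},
}
print(exact_match(pred, gold), field_accuracy(pred, gold))
```

Here the prediction agrees on both categorical fields and differs only in the wording of a free-text rationale, so strict scoring counts it as a complete miss while the per-field score credits two of three fields. Free-text reasoning fields would still need a softer comparison than string equality.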
## Bias, Risks, and Safety
- Synthetic generation may encode stylistic bias from prompting models
- Root-cause and effort labels can reflect rubric bias
- Outputs should be human-reviewed for operational actions
- Not intended for automated denial/escalation adjudication
## Recommended Uses
- Research on schema-constrained extraction
- Prototyping CX analytics pipelines
- Error analysis and model behavior studies
## Out-of-Scope Uses
- Fully autonomous customer adjudication
- Legal/regulatory decisions without human oversight
- Claims/payment decision automation
## Licensing
This card is written assuming `CC-BY-4.0` for dataset artifacts. Confirm and publish your final legal choice in both:
- dataset repo license metadata
- repository `LICENSE` file
## Citation
```bibtex
@dataset{omnicx_logistics_cx_preview,
title = {OmniCX Logistics CX Dataset (Research Preview)},
author = {Mangesh Gupta},
year = {2026},
publisher = {Hugging Face},
note = {Synthetic logistics CX extraction dataset}
}
```
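Provider:
mangesh-ux
Collected Summary
Dataset Introduction

Construction Method
In logistics customer-experience analysis, the OmniCX Logistics CX Dataset was built by combining synthetic data generation with structured annotation. The GPT-4o-mini model generates text that simulates logistics customer-service conversations. The same model then performs schema-constrained information extraction, following the taxonomy defined in knowledge-source files such as `Logistics CX Data Schema Development.docx`, to produce JSON labels that conform to the LogisticsCXMetrics schema. Format validation, required-key checks, and parseability checks safeguard data quality throughout the pipeline, yielding structured output across three dimensions: behavioral analytics, operational analytics, and diagnostic reasoning.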
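Features
The dataset's core features are its highly structured output schema and its fine-grained domain taxonomy. It follows the ChatML message format, pairing each conversation with a JSON payload that strictly conforms to the LogisticsCXMetrics schema. The schema covers three modules (behavioral analytics, operational analytics, and diagnostic reasoning) and embeds controlled vocabularies for customer intent, rework frequency, sentiment trajectory, and exception root cause. This design supports extracting complex customer-experience signals from multi-turn conversations, and its auditable reasoning fields make results more interpretable, providing a standardized evaluation benchmark for research on schema-constrained generation and multi-label analytics.
Usage
The dataset suits research and prototyping on structured information extraction from support-conversation transcripts. Users can load the JSONL training and evaluation files and use the standard triple of system instruction, user transcript, and assistant JSON response to train or evaluate a model that, given a conversation, generates JSON output conforming to the predefined schema. Typical applications include prototyping customer-experience analytics pipelines, studying the behavior of generative models under schema constraints, and error analysis. Because the dataset is small and synthetically generated, it should not be used directly for production decisions or automated adjudication.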
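Background and Challenges
Background Overview
In the intelligent analysis of customer experience (CX) and logistics operations, accurately extracting structured signals from multi-turn conversations is key to improving service efficiency and insight quality. The OmniCX Logistics CX Dataset was built and released in 2026 by researcher Mangesh Gupta to provide a fine-grained information-extraction benchmark for logistics support conversations. It focuses on systematically parsing three core dimensions (behavioral analytics, operational analytics, and diagnostic reasoning) from synthetically generated customer-service conversations. Its design rests on rigorous knowledge-source derivation and schema validation, offering a new research template for structured generation and multi-label classification tasks in natural language processing.
Current Challenges
The dataset addresses the structured extraction of complex signals in logistics customer-experience analysis. The core challenge is accurately capturing multidimensional interaction features that are implicit in a conversation, such as customer intent, effort level, sentiment trajectory, and exception root cause. During construction, the researcher faced the risk that synthetic data may not match the distribution of real conversations, had to ensure that generated labels strictly follow the predefined schema and enum boundaries, and had to work around the generalization limits of a small-scale data iteration. In addition, strict structural constraints combined with exact-match evaluation may fail to credit semantically correct outputs, which complicates the assessment of model robustness.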
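Common Scenarios
Classic Use Cases
In logistics and customer-experience research, the dataset provides a standardized benchmark for structured information extraction. Its classic use case is the automated extraction of behavioral-analytics, operational-analytics, and diagnostic-reasoning signals from multi-turn customer-service conversations. For example, given conversation records in ChatML format, a model can identify customer intent, effort scores, and exception diagnosis categories. This supports in-depth analysis of customer-service interactions and lays a data foundation for building intelligent analytics pipelines.
Academic Problems Addressed
The dataset targets the problem of structured extraction in natural language processing. By defining fine-grained logistics customer-experience metrics, such as intent classification, sentiment trajectory, and root-cause analysis, it helps researchers tackle the challenges of multi-label classification and schema-constrained generation. Its significance lies in moving conversation understanding toward interpretable, auditable reasoning and in providing a reproducible experimental framework for academic work.
Derived Related Work
According to this summary, a number of research efforts have grown out of the dataset, concentrating on schema-constrained generation and multi-task learning frameworks. For example, building on its open-source project OmniCX-Extractor, researchers have developed more efficient extraction models and extended the boundaries of logistics intent classification. These efforts further advance the deployment of conversation-analysis techniques in real-world settings and offer a reference for later academic and industrial applications.
The above content was collected and summarized by 遇见数据集.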



