Help Desk Tickets

Mendeley Data, indexed 2026-04-18

Download link:

https://data.mendeley.com/datasets/btm76zndnt


Resource description:

These datasets were created as part of a study involving an experiment with a helpdesk team at an international software company. The goal was to implement an automated performance appraisal model that evaluates the team based on issue reports and key features derived from classifying the messages exchanged with customers using Dialog Acts. The data was extracted from a PostgreSQL database and curated to present aggregated views of helpdesk tickets reported between January 2016 and March 2023. Certain fields have been anonymized (masked) to protect the data owner’s privacy while preserving the overall meaning of the information.

The datasets are:

- issues.csv: Holds information for all reported tickets: category, priority, who reported the issue, related project, who was assigned to resolve the ticket, the start time, the resolution time, and how many seconds the ticket spent in each resolution step.
- issues_change_history.csv: Shows when the ticket assignee and status were changed. This dataset helps calculate the time spent on each step.
- issues_snapshots.csv: Contains the same records as issues.csv but duplicates the tickets that were handled by multiple assignees; each record is the processing cycle for one assignee.
- scored_issues_snapshot_sample.xlsx: A stratified, representative sample extracted from the tickets and handed to an annotator (the helpdesk manager) to appraise the resolution performance against three targets, where 5 is the highest score and 1 is the lowest.
- sample_utterances.csv: Contains the messages (comments) exchanged between the reporters and the helpdesk team. This dataset only contains the curated messages for the issues listed in scored_issues_snapshot_sample.xlsx, as those were the focus of the initial study.

The following files are guidelines on how to work with and interpret the datasets:

- FEATURES.md: Describes the datasets' features (fields).
- EXAMPLE.md: Shows an example of an issue across all datasets so the reader can understand the relations between them.
- process-flow.png: Illustrates the steps followed by the helpdesk team to resolve an issue.

These datasets are valuable for many other experiments, such as:

- Count Predictions
- Regression
- Association rules mining
- Natural Language Processing
- Classification
- Clustering
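The description notes that issues_change_history.csv helps calculate the time spent on each step. A minimal sketch of that calculation follows, using only the Python standard library. The column names (`issue_id`, `changed_at`, `new_status`), the timestamp format, and the sample rows are all hypothetical; the real field names are documented in FEATURES.md and may differ.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows standing in for issues_change_history.csv.
# Column names and the timestamp format are assumptions, not the real schema.
SAMPLE_CSV = """issue_id,changed_at,new_status
101,2023-01-02 09:00:00,Open
101,2023-01-02 09:30:00,In Progress
101,2023-01-02 11:30:00,Resolved
102,2023-01-03 10:00:00,Open
102,2023-01-03 10:45:00,Resolved
"""


def seconds_per_status(rows):
    """Return {issue_id: {status: seconds}} from status-change events.

    Each event marks the moment a ticket entered `new_status`. The time
    spent in a status is the gap until the ticket's next event, so the
    final status of each ticket gets no duration.
    """
    events = defaultdict(list)
    for row in rows:
        ts = datetime.strptime(row["changed_at"], "%Y-%m-%d %H:%M:%S")
        events[row["issue_id"]].append((ts, row["new_status"]))

    result = {}
    for issue_id, changes in events.items():
        changes.sort(key=lambda change: change[0])  # chronological order
        durations = defaultdict(float)
        # Pair each event with the one that follows it.
        for (start, status), (end, _next_status) in zip(changes, changes[1:]):
            durations[status] += (end - start).total_seconds()
        result[issue_id] = dict(durations)
    return result


durations = seconds_per_status(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(durations)
```

To try this on the real file, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with an opened file handle, after adjusting the assumed column names and timestamp format to match FEATURES.md.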


Created:

2025-05-30
