Event Graph of BPI Challenge 2017
Source: Mendeley Data · Updated: 2024-06-25 · Indexed: 2024-06-29
Download link:
https://data.4tu.nl/articles/dataset/Event_Graph_of_BPI_Challenge_2017/14169584/1
Description:
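This dataset models business process event data as a **labeled property graph**.
### Data Format
The dataset comprises one labeled property graph in two file formats:
1. **Neo4j .dump format**
A Neo4j (https://neo4j.com) database dump that contains the entire graph. It can be imported into a fresh Neo4j database instance with the following command (see also the Neo4j documentation: https://neo4j.com/docs/):
`bin/neo4j-admin(.bat|.sh) load --database=graph.db --from=`
The .dump file was created with Neo4j v3.5.
2. **.graphml format**
A .zip archive containing a .graphml file of the entire graph.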
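To get a first impression of the .graphml variant without a Neo4j installation, the archive can be streamed with the Python standard library. The sketch below is only an illustration: it builds a tiny in-memory stand-in archive, and the file name `sample.graphml`, the node ids, and the `Activity` key layout are assumptions, not the layout of the actual export. Only the `node` and `edge` element names and the namespace come from the GraphML standard.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace defined by the GraphML standard.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"

# Tiny stand-in for the real export. The ids, the key name and the
# activity values are illustrative assumptions.
SAMPLE_GRAPHML = """<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="Activity" for="node" attr.name="Activity" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="Activity">example activity 1</data></node>
    <node id="n1"><data key="Activity">example activity 2</data></node>
    <edge id="e0" source="n0" target="n1"/>
  </graph>
</graphml>
"""


def make_sample_zip() -> bytes:
    """Pack the sample GraphML into an in-memory .zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("sample.graphml", SAMPLE_GRAPHML)
    return buf.getvalue()


def count_graphml_elements(zip_bytes: bytes) -> tuple[int, int]:
    """Return (node count, edge count) of the first .graphml file in the archive."""
    nodes = edges = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".graphml"))
        with zf.open(name) as fh:
            # iterparse streams the file, so a large export is never
            # held in memory as one complete tree.
            for _event, elem in ET.iterparse(fh, events=("end",)):
                if elem.tag == GRAPHML_NS + "node":
                    nodes += 1
                elif elem.tag == GRAPHML_NS + "edge":
                    edges += 1
                else:
                    continue
                elem.clear()  # drop the children of the counted element
    return nodes, edges


if __name__ == "__main__":
    print(count_graphml_elements(make_sample_zip()))
```

For the real file, the bytes would come from reading the downloaded archive instead of `make_sample_zip()`. Streaming is likely the safer choice given the size of the graph.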
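### Data Schema
The graph is a labeled property graph over business process event data. It uses the following concepts:
- **:Event nodes**: each event node describes a discrete event, i.e., an atomic observation described by the attribute `"Activity"` that occurred at the given `"timestamp"`.
- **:Entity nodes**: each entity node describes an entity (e.g., an object or a user); it has an `"EntityType"` and an identifier (attribute `"ID"`).
- **:Log nodes**: each log node describes a collection of events that were recorded together; most graphs contain only one log node.
- **:Class nodes**: each class node describes a type of observation that has been recorded, e.g., the different types of activities that can be observed; class nodes group events into sets of identical observations.
- **:CORR relationships**: from an :Event node to an :Entity node; describes that an event is correlated to a specific entity. An event can be correlated to multiple entities.
- **:DF relationships**: "directly-followed by" between two :Event nodes; describes which event is directly followed by which other event. Both events in a :DF relationship must be correlated to the same entity node. All :DF relationships together form a directed acyclic graph.
- **:HAS relationships**: from a :Log node to an :Event node; describes which events were recorded in which event log.
- **:OBSERVES relationships**: from an :Event node to a :Class node; describes to which event class an event belongs, i.e., which activity was observed.
- **:REL relationships**: placeholder for any structural relationship between two :Entity nodes.

The concepts are further defined in: Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases. CoRR abs/2005.14552 (2020) https://arxiv.org/abs/2005.14552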
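The interplay of :CORR and :DF can be illustrated with a few lines of Python. The sketch below rests on an assumption that the description above does not state: :DF edges are obtained by ordering the events correlated to one entity by timestamp and linking consecutive events. The event ids, activities, and entity ids are hypothetical. The second function checks the stated property that all :DF relationships form a directed acyclic graph.

```python
from collections import defaultdict
from datetime import datetime
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Hypothetical events: event id -> (Activity, timestamp).
EVENTS = {
    "e1": ("create", datetime(2016, 1, 1, 9, 0)),
    "e2": ("offer", datetime(2016, 1, 1, 10, 0)),
    "e3": ("accept", datetime(2016, 1, 2, 8, 30)),
}

# Hypothetical :CORR relationships as (event id, entity id) pairs.
CORR = [
    ("e1", "app1"),
    ("e2", "app1"),
    ("e2", "offer1"),
    ("e3", "app1"),
    ("e3", "offer1"),
]


def derive_df(events, corr):
    """Link consecutive events of each entity (assumed derivation of :DF)."""
    by_entity = defaultdict(list)
    for event_id, entity_id in corr:
        by_entity[entity_id].append(event_id)
    df = []
    for entity_id, event_ids in sorted(by_entity.items()):
        ordered = sorted(event_ids, key=lambda e: events[e][1])
        # Each pair of neighbours in the time-ordered list becomes one edge.
        df.extend((a, b, entity_id) for a, b in zip(ordered, ordered[1:]))
    return df


def is_acyclic(df):
    """Return True if the (source, target, entity) edges contain no cycle."""
    sorter = TopologicalSorter()
    for source, target, _entity in df:
        sorter.add(target, source)  # target comes after source
    try:
        sorter.prepare()
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    df_edges = derive_df(EVENTS, CORR)
    print(df_edges)
    print(is_acyclic(df_edges))
```

Because event `e2` is correlated to both `app1` and `offer1`, it takes part in the :DF paths of two entities, which is the multi-entity behavior the schema allows.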
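### Data Contents
`neo4j-bpic17-2021-02-17` (.dump|.graphml.zip)
An integrated graph describing the raw event data of the entire BPI Challenge 2017 dataset.
Data source: van Dongen, B.F. (Boudewijn) (2017): BPI Challenge 2017. 4TU.ResearchData. Collection. https://doi.org/10.4121/uuid:5f3067df-f10b-45da-b98b-86ae4c7a310b
This event log pertains to a loan application process of a Dutch financial institute. The data contains all applications filed through an online system in 2016 and their subsequent events until February 1st 2017, 15:11.
The company providing the data and the process under consideration are the same as in doi:10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f. However, the system supporting the process has changed in the meantime. In particular, the system now allows multiple offers per application. These offers can be tracked through their IDs in the log.
The data contains the following entities and their events:
- **Application**: a credit application document submitted by a customer to a Dutch financial institute
- **Offer**: a loan offer document created by the institute and sent to the customer
- **Workflow**: a logical grouping of activities by the case management system that supports workers at the financial institute in handling applications and offers
- **Case_R**: a user or worker of the financial institute
- **Case_AO**: a derived entity describing the reified relation between an offer and its related application
- **Case_AW**: a derived entity describing the reified relation between the workflow and its related application
- **Case_WO**: a derived entity describing the reified relation between an offer and its related workflow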
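What "reified relation" means for derived entities such as Case_AO can be sketched as follows. This is an illustration under assumptions, not the procedure used to build the dataset: each related pair becomes one new entity, and every event correlated to either end of the pair is also correlated to that new entity. The ID scheme (`app1_offer1`) and all sample ids are hypothetical.

```python
# Hypothetical relation: offer id -> id of the application it belongs to.
OFFER_TO_APPLICATION = {"offer1": "app1", "offer2": "app1"}

# Hypothetical :CORR relationships as (event id, entity id) pairs.
CORR = [("e1", "app1"), ("e2", "offer1"), ("e3", "offer2")]


def reify_relation(relation, corr, entity_type):
    """Turn each related pair into a derived entity with its own :CORR pairs.

    Returns (entities, derived_corr), where entities are dicts carrying the
    "EntityType" and "ID" attributes used by :Entity nodes.
    """
    members = {}  # derived entity id -> ids of the two related entities
    for source_id, target_id in sorted(relation.items()):
        members[f"{target_id}_{source_id}"] = {source_id, target_id}
    entities = [{"EntityType": entity_type, "ID": d} for d in members]
    # An event joins the derived entity if it belongs to either end.
    derived_corr = [
        (event_id, derived_id)
        for derived_id, ends in members.items()
        for event_id, entity_id in corr
        if entity_id in ends
    ]
    return entities, derived_corr


if __name__ == "__main__":
    entities, derived_corr = reify_relation(OFFER_TO_APPLICATION, CORR, "Case_AO")
    print(entities)
    print(derived_corr)
```

In this sample, the application event `e1` ends up correlated to both derived entities, because the application has two offers.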
### Data Size
BPIC17: 1425995 nodes, 10300197 relationships.
Created:
2023-06-28



