Data for article "A Machine Learning Approach to Rank Pricing Problems in Branch-and-Price"

Mendeley Data (indexed 2026-04-18)

Download link:

https://data.mendeley.com/datasets/4wgx2mprks


Description:

This dataset accompanies the research paper "A Machine Learning Approach to Rank Pricing Problems in Branch-and-Price." It features a machine learning-based ranker designed to enhance column generation by guiding the search for new columns. The ranker is evaluated in the context of operating room scheduling. The dataset is split into two directories: "train" and "evaluate."

The "train" directory contains the data used to train the machine learning models described in the publication. It includes a dataset description and a CSV file of the recorded features. It also houses an 'instance' directory with four subdirectories (r1_s2_d10, r1_s2_d5, r1_s4_d10, r1_s4_d5), each containing 30 instances. These instances are characterized by the number of patients (10, 20, 30), the number of days (5, 10), and the number of surgeons (2, 4), for a total of 120 instances across all subdirectories.

The "evaluate" directory is used to validate the methods developed in the research. It includes an 'input' folder with two subfolders (r1_s4_d5, r1_s4_d10), each containing 9 instances. These instances are characterized by the number of patients (30, 40, 50), the number of days (5, 10), and the number of surgeons (4), for a total of 18 instances across both subfolders. The 'output' folder in the same directory documents the results for each strategy (best and random branch-and-price, ML strategy, and ILP strategy) with 1, 3, 5, 10, 15, or 20 patterns added. Results for each input instance are aggregated there, including CSV files of results, PDFs of branching trees, and output log files.

All instances are stored as JSON files and represent synthetically generated data that simulate real-life hospital environments. The methodology used to generate the data is described in the corresponding paper.
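The directory layout described above can be sanity-checked after download with a short script. The following is a minimal Python sketch, not part of the dataset: the helper names `parse_config` and `count_instances` are hypothetical, the reading of `s` as surgeons and `d` as days is inferred from the description, and the meaning of the `r` field is not stated there, so it is left uninterpreted.

```python
import re
from pathlib import Path

# Subdirectory names look like "r1_s4_d5". Reading "s" as surgeons and "d" as
# days is an inference from the dataset description; "r" is not explained
# there, so it is returned under its raw key.
_CONFIG_RE = re.compile(r"^r(\d+)_s(\d+)_d(\d+)$")


def parse_config(name):
    """Split a configuration directory name into its three integer fields."""
    match = _CONFIG_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected configuration name: {name!r}")
    r, surgeons, days = (int(g) for g in match.groups())
    return {"r": r, "surgeons": surgeons, "days": days}


def count_instances(instances_root):
    """Count the *.json files in each configuration subdirectory."""
    root = Path(instances_root)
    counts = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[sub.name] = sum(1 for _ in sub.glob("*.json"))
    return counts
```

If the extracted archive matches the description, calling `count_instances` on the training 'instance' directory should report 30 files for each of the four subdirectories, and calling it on the evaluation 'input' folder should report 9 for each of the two subfolders. The exact paths depend on where the archive is unpacked.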

Created:

2024-04-19
