[Problem] • Symptom: the download link for this dataset no longer works [Related information] • Similar files may be available at this link: https://www.selectdataset.com/dataset/3688356173feccbcf1f1e490ddc6bc72
3W Dataset
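3W Dataset Overview
Dataset Description
The 3W dataset is the first public, realistic dataset with rare undesirable real events in oil wells. It can serve as a benchmark for developing machine learning techniques that deal with the difficulties inherent in real data. The dataset consists of instances of eight types of undesirable events characterized by eight process variables. It includes historical instances validated by experts, as well as simulated and hand-drawn instances.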
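Dataset Structure
The 3W dataset contains 1,984 CSV files, packed in 7z archives under the data directory. Each file represents one instance, and its file name indicates the instance's source. Each row is one observation and each column is one series; columns are separated by commas and the decimal separator is a period. The first column is the timestamp, the last column is the observation label, and the remaining columns hold the multivariate time series.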
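The layout described above can be read with Python's standard csv module. The following is a minimal sketch, not the dataset's official loader: the sample rows, the column names var_a and var_b, the timestamp format, and the helper name load_instance are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Synthetic stand-in for one instance file. The column names, values, and
# timestamp format are illustrative assumptions, not taken from the dataset.
SAMPLE_CSV = """timestamp,var_a,var_b,label
2017-01-01 00:00:00,1.5,20.25,0
2017-01-01 00:00:01,1.6,20.5,0
2017-01-01 00:00:02,1.7,,1
"""


def load_instance(stream):
    """Split one instance into variable names, timestamps, values, and labels.

    Assumes the layout described in the text: comma-separated columns,
    a period as decimal separator, timestamp first, label last.
    """
    reader = csv.reader(stream, delimiter=",")
    header = next(reader)
    timestamps, rows, labels = [], [], []
    for record in reader:
        if not record:
            continue  # skip blank lines
        timestamps.append(datetime.fromisoformat(record[0]))
        # Empty cells become None so that missing readings stay visible.
        rows.append([float(v) if v else None for v in record[1:-1]])
        labels.append(record[-1])  # kept as text; a label may be empty
    return header[1:-1], timestamps, rows, labels


variables, timestamps, rows, labels = load_instance(io.StringIO(SAMPLE_CSV))
print(variables, len(rows), labels)
```

To read a real instance, pass an open file handle in place of the io.StringIO object, for example open(path, newline=""), after extracting the 7z archive. Missing values are kept as None rather than dropped, on the assumption that gaps in the sensor readings are themselves informative.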
Citation
When using the 3W dataset, cite the following article:
@article{VARGAS2019106223,
  title = "A realistic and public dataset with rare undesirable real events in oil wells",
  journal = "Journal of Petroleum Science and Engineering",
  volume = "181",
  pages = "106223",
  year = "2019",
  issn = "0920-4105",
  doi = "https://doi.org/10.1016/j.petrol.2019.106223",
  url = "http://www.sciencedirect.com/science/article/pii/S0920410519306357",
  author = "Ricardo Emanuel Vaz Vargas and Celso José Munaro and Patrick Marques Ciarelli and André Gonçalves Medeiros and Bruno Guberfain do Amaral and Daniel Centurion Barrionuevo and Jean Carlos Dias de Araújo and Jorge Lins Ribeiro and Lucas Pierezan Magalhães",
  keywords = "Fault detection and diagnosis, Oil well monitoring, Abnormal event management, Multivariate time series classification",
  abstract = "Detection of undesirable events in oil and gas wells can help prevent production losses, environmental accidents, and human casualties and reduce maintenance costs. The scarcity of measurements in such processes is a drawback due to the low reliability of instrumentation in such hostile environments. Another issue is the absence of adequately structured data related to events that should be detected. To contribute to providing a priori knowledge about undesirable events for diagnostic algorithms in offshore naturally flowing wells, this work presents an original and valuable dataset with instances of eight types of undesirable events characterized by eight process variables. Many hours of expert work were required to validate historical instances and to produce simulated and hand-drawn instances that can be useful to distinguish normal and abnormal actual events under different operating conditions. The choices made during this dataset's preparation are described and justified, and specific benchmarks that practitioners and researchers can use together with the published dataset are defined. This work has resulted in two relevant contributions. A challenging public dataset that can be used as a benchmark for the development of (i) machine learning techniques related to inherent difficulties of actual data, and (ii) methods for specific tasks associated with detecting and diagnosing undesirable events in offshore naturally flowing oil and gas wells. The other contribution is the proposal of the defined benchmarks."
}
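Dataset Usage
The dataset provides results for several benchmark experiments, including:
- Benchmark 1: impact of using simulated and hand-drawn instances (links to code and results)
- Benchmark 2: anomaly detection (links to code and results)
These results can serve as a baseline reference for researchers and practitioners.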




