Dataset for Mid-Price Forecasting of Limit Order Book Data
Source: OpenDataLab | Updated: 2026-05-24 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/Dataset_for_Mid-Price_etc
Resource description:
Here, we provide the normalized datasets as .txt files. The datasets fall into two major categories: datasets with auction periods and datasets without auction periods. For each of these two categories, we provide three normalization settings: z-score, min-max, and decimal scaling.
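The three normalization settings named above follow standard textbook definitions, which can be sketched as below. This is a minimal illustration, not the dataset authors' code: the function names, the sample `prices` list, and the choice to compute statistics over a single list of values are assumptions, and the description does not say which window or statistics were used to normalize the published files.

```python
from statistics import mean, pstdev

def z_score(values):
    # Subtract the mean and divide by the (population) standard deviation.
    # Assumes the values are not all identical, otherwise sigma is zero.
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    # Rescale linearly so the minimum maps to 0 and the maximum to 1.
    # Assumes max > min.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def decimal_scaling(values):
    # Divide by 10**k, where k is the smallest non-negative integer
    # such that every scaled absolute value is below 1.
    max_abs = max(abs(v) for v in values)
    k = 0
    while max_abs / 10 ** k >= 1:
        k += 1
    return [v / 10 ** k for v in values]

# Hypothetical sample values, for illustration only.
prices = [100.0, 102.0, 98.0, 104.0, 96.0]
print(z_score(prices))
print(min_max(prices))
print(decimal_scaling(prices))
```

In this sketch, z-score output has zero mean and unit standard deviation, min-max output lies in [0, 1], and decimal scaling only shifts the decimal point, so relative magnitudes are preserved.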
Because we use a 10-day anchored cross-validation scheme over five stocks, each normalization setting comes with nine folds, each consisting of one training dataset and one test dataset. Every training and test dataset contains information for all five stocks. For example, Fold 1 includes one day of training data and one day of test data for all five stocks. Fold 2 includes two days of training data and one day of test data, where the two training days are the training day and the test day of Fold 1, and so forth.
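The fold layout described above can be enumerated with a short sketch. The function name `anchored_folds` and the numbering of days from 1 to 10 are assumptions made for illustration; the description only states that the training window always starts at the first day and grows by one day per fold.

```python
def anchored_folds(n_days=10):
    """Return (train_days, test_day) pairs for anchored (expanding-window) splits."""
    folds = []
    for k in range(1, n_days):
        train_days = list(range(1, k + 1))  # always starts at day 1 (the anchor)
        test_day = k + 1                    # the day right after the training window
        folds.append((train_days, test_day))
    return folds

for fold_number, (train_days, test_day) in enumerate(anchored_folds(), start=1):
    print(f"Fold {fold_number}: train on days {train_days}, test on day {test_day}")
```

With ten days this yields nine folds, and the final fold trains on days 1 to 9 and tests on day 10, so no test day ever precedes its training data.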
The filenames of the .txt files contain information in the following order:
1. Training or test set
2. With or without auction periods
3. Type of normalization setting
4. Fold number (from 1 to 9) based on the aforementioned cross-validation method
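Given that ordering, a filename can be split back into its four fields. The sketch below is hypothetical: the description does not give the exact separator or token spellings, so the underscore separator, the example name `Train_NoAuction_ZScore_3.txt`, and the names `DatasetFile` and `parse_dataset_filename` are all assumptions. The real files may contain additional tokens, in which case the pattern needs adjusting.

```python
from pathlib import Path
from typing import NamedTuple

class DatasetFile(NamedTuple):
    split: str          # training or test set
    auction: str        # with or without auction periods
    normalization: str  # type of normalization setting
    fold: int           # fold number, 1 to 9

def parse_dataset_filename(filename, sep="_"):
    # Assumes exactly four tokens joined by `sep` (hypothetical layout).
    parts = Path(filename).stem.split(sep)
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {filename!r}")
    split, auction, normalization, fold_text = parts
    fold = int(fold_text)
    if not 1 <= fold <= 9:
        raise ValueError(f"fold number out of range 1..9: {fold}")
    return DatasetFile(split, auction, normalization, fold)

# Hypothetical filename, for illustration only.
info = parse_dataset_filename("Train_NoAuction_ZScore_3.txt")
print(info)
```

Returning a named tuple keeps the four fields in the same order as the list above, which makes it easy to group files by normalization setting or fold when loading them.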
Provider:
OpenDataLab
Created:
2022-08-11
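Collected summary
Dataset introduction

Background and challenges
Background overview
This dataset targets mid-price forecasting for limit order books. It comes in two categories, with and without auction periods, each offered under three normalization settings, and a 10-day anchored cross-validation scheme yields nine training and test folds. The data are provided as .txt files covering five stocks and are suited to machine learning methods.
The summary above was collected and generated by 遇见数据集.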



