Brain Machine Learning Stock Ranking - Live Feed
Source: Snowflake. Updated: 2022-07-11. Indexed: 2024-05-01.
Download link:
https://app.snowflake.com/marketplace/listing/GZSVZD5BWT
Description:
The Brain Machine Learning proprietary platform is used to generate a daily stock ranking based on the predicted future returns of a universe of the 1,000 largest U.S. stocks over five time horizons: 2, 3, 5, 10, and 21 trading days. The universe is updated yearly.
The model implements a voting scheme of machine learning classifiers that nonlinearly combine a variety of features. It applies a series of techniques aimed at mitigating overfitting, a well-known problem for financial data with a low signal-to-noise ratio.
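As a rough illustration of what a voting scheme can look like, the sketch below averages the class-1 probabilities of several classifiers into one score per stock. The source does not say whether Brain uses hard or soft voting, how many classifiers are involved, or how they are weighted, so the function `soft_vote`, the equal weights, and all numbers are assumptions made for illustration only.

```python
from typing import Dict, List


def soft_vote(probabilities: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-classifier class-1 probabilities into one score per ticker.

    Assumption: equal-weight soft voting. This is not Brain's documented method.
    """
    if not probabilities:
        raise ValueError("need at least one classifier output")
    tickers = probabilities[0].keys()
    return {t: sum(p[t] for p in probabilities) / len(probabilities) for t in tickers}


# Made-up outputs of three hypothetical classifiers for two hypothetical tickers.
votes = [
    {"AAA": 0.70, "BBB": 0.40},
    {"AAA": 0.60, "BBB": 0.50},
    {"AAA": 0.50, "BBB": 0.30},
]
combined = soft_vote(votes)
```

Only the ordering of the combined scores matters for ranking purposes: here the hypothetical ticker AAA ranks above BBB.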
Model inputs include stock-specific features such as fundamentals and price-volume metrics, market data such as volatility and other financial stress indicators, and calendar-related signals such as day or month anomalies.
This data set contains historical data from August 2016 onward and live data updated daily by 4:00 UTC.
Note that the ranking score is meaningful only when used to compare different stocks in a ranking. A typical use case is to download the stock ranking for a large stock universe on a given day (e.g. 500 stocks or the full universe of 1,000 stocks), order the stocks by ranking score (field "ML_ALPHA", see the field descriptions below or the data dictionary), and go long the top K stocks, or build a long-short strategy that goes long the top K and short the bottom K stocks.
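The use case above can be sketched as a simple top-K / bottom-K selection. This is a minimal example, not a trading strategy: the function name `long_short_baskets`, the tie-breaking rule, and the tickers and scores are all made up for illustration.

```python
from typing import Dict, List, Tuple


def long_short_baskets(scores: Dict[str, float], k: int) -> Tuple[List[str], List[str]]:
    """Return (long, short) ticker lists: the top-k and bottom-k stocks by ML_ALPHA."""
    if k <= 0 or 2 * k > len(scores):
        raise ValueError("k must be positive and at most half the universe size")
    # Sort by score, highest first; ties are broken by ticker so the result is deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k], ranked[-k:]


# Made-up ML_ALPHA values for a single CALCULATION_DATE (illustrative only).
example_scores = {
    "AAA": 0.61, "BBB": 0.48, "CCC": 0.55,
    "DDD": 0.40, "EEE": 0.52, "FFF": 0.45,
}
longs, shorts = long_short_baskets(example_scores, k=2)
```

With these made-up scores the long basket is AAA and CCC and the short basket is FFF and DDD. The absolute values of the scores play no role; only their order does.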
The main schema is called "STOCK_RANKING". It contains five ranking tables, one for each prediction time horizon, plus a stock universe table:
- "STOCK_RANKING_NEXT_DAYS_2" contains the stock rankings based on the predicted future returns for the next 2 trading days
- "STOCK_RANKING_NEXT_DAYS_3" contains the stock rankings based on the predicted future returns for the next 3 trading days
- "STOCK_RANKING_NEXT_DAYS_5" contains the stock rankings based on the predicted future returns for the next 5 trading days
- "STOCK_RANKING_NEXT_DAYS_10" contains the stock rankings based on the predicted future returns for the next 10 trading days
- "STOCK_RANKING_NEXT_DAYS_21" contains the stock rankings based on the predicted future returns for the next 21 trading days
- "STOCK_UNIVERSE" contains the stock universe, that is, the set of stocks for which the system provides a prediction on a given date. The stock universe is updated annually. In general, approximately 98% of the stock universe is covered each day.
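Because the ranking tables follow a regular naming pattern, the table for a given horizon can be derived programmatically. The sketch below builds the table name and a query string for one day's ranking. The helper names are hypothetical, the database prefix is left out because it depends on how the listing is mounted in your Snowflake account, and the `%s` placeholder style is an assumption that depends on the client library.

```python
HORIZONS = (2, 3, 5, 10, 21)  # prediction horizons in trading days, as listed above


def ranking_table(horizon: int, schema: str = "STOCK_RANKING") -> str:
    """Return the schema-qualified ranking table name for a horizon."""
    if horizon not in HORIZONS:
        raise ValueError(f"unsupported horizon {horizon}; expected one of {HORIZONS}")
    return f"{schema}.STOCK_RANKING_NEXT_DAYS_{horizon}"


def daily_ranking_query(horizon: int) -> str:
    """Build a SQL string that selects one day's ranking, best score first.

    The date is left as a bind placeholder (%s) to be supplied by the client.
    """
    return (
        "SELECT CALCULATION_DATE, COMPOSITE_FIGI, TICKER, ML_ALPHA "
        f"FROM {ranking_table(horizon)} "
        "WHERE CALCULATION_DATE = %s "
        "ORDER BY ML_ALPHA DESC"
    )


query_10d = daily_ranking_query(10)
```

Validating the horizon before interpolating it keeps arbitrary input out of the SQL text, while the date itself is passed as a bound parameter rather than formatted into the string.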
The key fields of the stock ranking tables are:
- CALCULATION_DATE: The calculation date for the stock ranking score in format YYYY-MM-DD.
- COMPOSITE_FIGI: The composite FIGI code (https://www.openfigi.com) that uniquely identifies the company stock across related exchanges in the U.S.
- TICKER: The stock ticker.
- ML_ALPHA: Score related to the predicted return over the next N trading days, where N = 2, 3, 5, 10, or 21 depending on the selected table. More specifically, ML_ALPHA is related to the confidence of a machine learning classifier in assigning the stock to class 0 (underperforming the median of the universe over the next N days) or class 1 (outperforming the median of the universe over the next N days). The score is meaningful only when used to compare different stocks in a ranking. A typical use case is to download the stock ranking for a large stock universe on a given day (e.g. 500 stocks or the full universe of 1,000 stocks), order the stocks by ML_ALPHA, and go long the top K stocks, or build a long-short strategy that goes long the top K and short the bottom K stocks.
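The class definition behind ML_ALPHA (class 1 if a stock beats the universe median over the next N days, class 0 otherwise) can be made concrete with a small sketch. It shows how such labels could be derived from realized N-day returns, for example when checking the ranking after the fact. The function name, the handling of ties, and the return values are assumptions for illustration and do not describe Brain's actual pipeline.

```python
from statistics import median
from typing import Dict


def median_labels(forward_returns: Dict[str, float]) -> Dict[str, int]:
    """Label each stock 1 if its N-day return beats the universe median, else 0."""
    med = median(forward_returns.values())
    # Assumption: a return exactly equal to the median goes to class 0;
    # the source does not say how ties are handled.
    return {ticker: int(ret > med) for ticker, ret in forward_returns.items()}


# Made-up N-day forward returns for four hypothetical tickers.
example_returns = {"AAA": 0.03, "BBB": -0.01, "CCC": 0.00, "DDD": 0.02}
labels = median_labels(example_returns)
```

Because the labels are defined relative to the universe median, roughly half the universe falls in each class on any given day, which is consistent with the score being useful only for comparing stocks with one another.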
DISCLAIMER
The content of this dataset is not intended as investment advice. The material is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Brain. Brain makes no guarantees regarding the accuracy and completeness of the information expressed in the dataset.
Provider:
Brain
Created:
2022-07-11



