DJIA 30 Stock Time Series
Source: www.kaggle.com. Updated: 2018-01-03. Indexed: 2025-01-16.
Download link:
https://www.kaggle.com/szrlee/stock-time-series-20050101-to-20171231
Description:
### Context
The script used to acquire all of the following data can be found [in this GitHub repository][1]. The repository also contains the modeling code and will be updated continually, so feel free to star or watch it.
Stock market data can be interesting to analyze and, as a further incentive, strong predictive models can have a large financial payoff. The amount of financial data on the web is seemingly endless, yet a large, well-structured dataset covering a wide array of companies can be hard to come by. Provided here is a dataset of historical stock prices (the last 12 years) for 29 of the 30 DJIA companies (excluding 'V' because it does not have the full 12 years of data).
['MMM', 'AXP', 'AAPL', 'BA', 'CAT', 'CVX', 'CSCO', 'KO', 'DIS', 'XOM', 'GE',
'GS', 'HD', 'IBM', 'INTC', 'JNJ', 'JPM', 'MCD', 'MRK', 'MSFT', 'NKE', 'PFE',
'PG', 'TRV', 'UTX', 'UNH', 'VZ', 'WMT', 'GOOGL', 'AMZN', 'AABA']
If you want a more up-to-date dataset in the future, the script can be used to acquire new versions of the .csv files.
### Content
The data is presented in a couple of formats to suit different users' needs or computational limitations.
I have included files containing 12 years of stock data (all_stocks_2006-01-01_to_2018-01-01.csv and the corresponding folder) and
a smaller version of the dataset (all_stocks_2017-01-01_to_2018-01-01.csv) containing only the past year's stock data, for those who want something more manageable in size.
The folder individual_stocks_2006-01-01_to_2018-01-01 contains files of data for individual stocks, labelled by their stock ticker name.
The all_stocks_2006-01-01_to_2018-01-01.csv and all_stocks_2017-01-01_to_2018-01-01.csv contain this same data, presented in merged .csv files.
Depending on the intended use (graphing, modelling etc.) the user may prefer one of these given formats.
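Since the merged files hold the same rows as the per-ticker files, one format can be rebuilt from the other. The following is a minimal sketch, assuming every file in the folder is a .csv with a `Date` column; the helper name `merge_individual_files` and the `<ticker>.csv` naming used in the demo are hypothetical, not taken from the dataset.

```python
import tempfile
from pathlib import Path

import pandas as pd


def merge_individual_files(folder):
    """Concatenate every .csv file in `folder` into one DataFrame."""
    paths = sorted(Path(folder).glob("*.csv"))
    if not paths:
        raise FileNotFoundError(f"no .csv files found in {folder}")
    frames = [pd.read_csv(path, parse_dates=["Date"]) for path in paths]
    return pd.concat(frames, ignore_index=True)


# Demo with two made-up tickers written to a temporary folder.
# The file names and values are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    for ticker, close in [("AAA", 10.5), ("BBB", 20.5)]:
        pd.DataFrame(
            {
                "Date": ["2017-01-03"],
                "Open": [close - 0.5],
                "High": [close + 0.5],
                "Low": [close - 1.0],
                "Close": [close],
                "Volume": [1000],
                "Name": [ticker],
            }
        ).to_csv(Path(tmp) / f"{ticker}.csv", index=False)
    merged = merge_individual_files(tmp)
```

With the real data, the call would presumably be `merge_individual_files("individual_stocks_2006-01-01_to_2018-01-01")`, assuming the folder sits in the working directory.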
All the files have the following columns:
Date - in format: yy-mm-dd
Open - price of the stock at market open (this is NYSE data, so all prices are in USD)
High - Highest price reached in the day
Low - Lowest price reached in the day
Close - price of the stock at market close
Volume - Number of shares traded
Name - the stock's ticker name
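As a rough illustration of working with these columns, the sketch below reads a merged file with pandas, checks that the listed columns are present, and selects one ticker. The helper name `load_stocks`, the tickers `AAA` and `BBB`, and all sample values are made up for the example; the sample also assumes dates that pandas can parse, such as `2017-01-03`.

```python
import io

import pandas as pd

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = ["Date", "Open", "High", "Low", "Close", "Volume", "Name"]


def load_stocks(source):
    """Read a merged stock file from a path or file-like object."""
    df = pd.read_csv(source, parse_dates=["Date"])
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Sort so that each ticker's rows are contiguous and chronological.
    return df.sort_values(["Name", "Date"]).reset_index(drop=True)


# Tiny in-memory sample standing in for the real file (illustrative values).
SAMPLE = io.StringIO(
    "Date,Open,High,Low,Close,Volume,Name\n"
    "2017-01-04,11.0,12.0,10.5,11.5,1200,AAA\n"
    "2017-01-03,10.0,11.0,9.5,10.5,1000,AAA\n"
    "2017-01-03,20.0,21.0,19.5,20.5,500,BBB\n"
)
stocks = load_stocks(SAMPLE)
aaa = stocks[stocks["Name"] == "AAA"]
```

For the real dataset, the in-memory sample would be replaced by a path such as `all_stocks_2017-01-01_to_2018-01-01.csv`.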
### Inspiration
This dataset lends itself to some very interesting visualizations. One can look at simple things like how prices change over time, graph and compare multiple stocks at once, or generate and graph new metrics from the data provided.
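One possible way to compare several stocks on a single chart is to reshape the long table into one column per ticker and rescale each series to start at 1.0. The sketch below assumes the `Date`, `Close` and `Name` columns described above; the helper name `normalized_closes` and the demo values are hypothetical.

```python
import pandas as pd


def normalized_closes(df, tickers):
    """Return closing prices, one column per ticker, rescaled to start at 1.0."""
    subset = df[df["Name"].isin(tickers)]
    # pivot raises ValueError if a (Date, Name) pair appears more than once.
    wide = subset.pivot(index="Date", columns="Name", values="Close").sort_index()
    # Dividing by the first row makes stocks with different price levels comparable.
    # A ticker with no price on the first date ends up as all-NaN.
    return wide / wide.iloc[0]


# Made-up prices for two hypothetical tickers.
demo = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2017-01-03", "2017-01-04", "2017-01-03", "2017-01-04"]
        ),
        "Close": [10.0, 11.0, 20.0, 19.0],
        "Name": ["AAA", "AAA", "BBB", "BBB"],
    }
)
relative = normalized_closes(demo, ["AAA", "BBB"])
```

If matplotlib is installed, `relative.plot()` draws one line per ticker on shared axes.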
From these data, informative stock statistics such as volatility and moving averages can be easily calculated.
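A minimal sketch of both statistics, computed per ticker with pandas rolling windows, is shown below. The 20-row window and the 252 trading days per year used for annualization are common conventions assumed here, not values from the dataset description; the helper name `add_indicators` and the demo prices are hypothetical.

```python
import numpy as np
import pandas as pd


def add_indicators(df, window=20, periods_per_year=252):
    """Add a simple moving average, daily returns and rolling volatility."""
    out = df.sort_values(["Name", "Date"]).reset_index(drop=True)
    closes = out.groupby("Name")["Close"]
    # Simple moving average of the closing price, computed within each ticker.
    out["SMA"] = closes.transform(lambda s: s.rolling(window).mean())
    # Daily return relative to the same ticker's previous close.
    out["Return"] = out["Close"] / closes.shift(1) - 1
    # Rolling standard deviation of returns, scaled to an annual figure.
    rolling_std = out.groupby("Name")["Return"].transform(
        lambda s: s.rolling(window).std()
    )
    out["Volatility"] = rolling_std * np.sqrt(periods_per_year)
    return out


# Made-up prices for two hypothetical tickers; a window of 2 keeps the demo small.
demo = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2017-01-03", "2017-01-04", "2017-01-05", "2017-01-03", "2017-01-04"]
        ),
        "Close": [10.0, 11.0, 12.1, 20.0, 19.0],
        "Name": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    }
)
stats = add_indicators(demo, window=2)
```

The first `window - 1` rows of `SMA` for each ticker are NaN because the window is not yet full, and `Volatility` needs one further row since the first return of each ticker is undefined.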
The million-dollar question is: can you develop a model that beats the market and allows you to make statistically informed trades?
### Acknowledgement
This data description is adapted from the dataset named 'S&P 500 Stock data'.
This data was scraped from Google Finance using the Python library 'pandas_datareader'. Special thanks to Kaggle, GitHub and the Market.
[1]: https://github.com/szrlee/Stock-Time-Series-Analysis/blob/master/data_collection.ipynb
Provider:
Kaggle



