nguyenbase/poseidon
Source: Hugging Face | Updated: 2025-12-12 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/nguyenbase/poseidon
Description:
---
dataset_info:
  features:
  - name: timestamp
    dtype: timestamp[ns, tz=UTC]
  - name: open
    dtype: float64
  - name: high
    dtype: float64
  - name: low
    dtype: float64
  - name: close
    dtype: float64
  - name: volume
    dtype: float64
  - name: ticker
    dtype: string
  splits:
  - name: train
    num_bytes: 195863846
    num_examples: 3538138
  download_size: 52097595
  dataset_size: 195863846
configs:
- config_name: default
  data_files:
  - split: train
    path: data/ohlcv_*.parquet
---
# 📈 OHLCV-1m: US Stock Market Minute-Level Candlestick Data (1992–2025)
This dataset provides minute-level OHLCV (Open, High, Low, Close, Volume) candlestick data for thousands of U.S. stocks across multiple decades (1992 to 2025). The data was originally sourced from [Finnhub.io](https://finnhub.io), a real-time market data provider.
The monthly `.tar` archives have been aggregated and reformatted into unified Parquet files, one per month, and uploaded to the Hugging Face Hub for easy access.
## 🧾 Dataset Structure
Each row in the dataset represents **one minute** of trading for a given stock ticker, and includes the following columns:
| Column | Type | Description |
|------------|------------------------------|-------------------------------------|
| `timestamp`| `datetime64[ns, UTC]` | Start time of the minute |
| `open` | `float64` | Opening price |
| `high` | `float64` | Highest price within the minute |
| `low` | `float64` | Lowest price within the minute |
| `close` | `float64` | Closing price |
| `volume` | `float64` | Volume traded within the minute |
| `ticker` | `string` | Stock ticker symbol |
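As a rough illustration of how these columns relate to one another, the sketch below models a single row and checks the invariants that a well-formed candlestick would normally satisfy (the low is no higher than the open and close, the high is no lower than either, and the volume is non-negative). The `Bar` type, the `is_consistent` helper, and the sample values are hypothetical and not part of the dataset or any library. Whether every row in the dataset actually satisfies these checks is not stated in the card.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Bar(NamedTuple):
    """One row of the dataset: a single minute of trading for one ticker."""
    timestamp: datetime  # start time of the minute, in UTC
    open: float
    high: float
    low: float
    close: float
    volume: float
    ticker: str


def is_consistent(bar: Bar) -> bool:
    """Return True if the bar satisfies basic OHLCV sanity checks (assumed, not documented)."""
    return (
        bar.low <= min(bar.open, bar.close)
        and bar.high >= max(bar.open, bar.close)
        and bar.volume >= 0
    )


# Hypothetical values for illustration only; they are not taken from the dataset.
good = Bar(datetime(2020, 1, 2, 14, 30, tzinfo=timezone.utc),
           10.0, 10.5, 9.9, 10.2, 1200.0, "XYZ")
bad = good._replace(high=10.1)  # high is below the close, so the bar is inconsistent

print(is_consistent(good))
print(is_consistent(bad))
```

A check like this could be run over a sample of rows before any downstream analysis, to flag bars that need cleaning.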
The data is split by month into files like:
data/ohlcv_1992-01.parquet
data/ohlcv_1992-02.parquet
...
data/ohlcv_2025-05.parquet
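The naming pattern above makes it easy to build the list of file paths for a given range of months. The following is a minimal sketch; `monthly_paths` is a hypothetical helper, and it assumes that a file exists for every month in the requested range, which the card does not confirm.

```python
def monthly_paths(start: str, end: str) -> list[str]:
    """List data/ohlcv_YYYY-MM.parquet paths for each month from start to end, inclusive.

    Both arguments use the "YYYY-MM" form seen in the file names.
    """
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    paths = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        paths.append(f"data/ohlcv_{year:04d}-{month:02d}.parquet")
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return paths


paths = monthly_paths("2024-11", "2025-02")
print(paths)
```

A list like this could be passed as the `data_files` argument of `datasets.load_dataset` to load only a subset of months rather than the full split.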
## 📚 Usage
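```python
from datasets import load_dataset

# Load the full train split. Without streaming=True, all monthly Parquet
# files are downloaded and cached rather than streamed.
# The repo id below is the one given in the original card; this copy is
# published as nguyenbase/poseidon.
ds = load_dataset("mito0o852/OHLCV-1m", split="train")

# View one row
print(ds[0])

# Convert to a pandas DataFrame (requires pandas; loads the whole split into memory)
df = ds.to_pandas()
print(df.head())
```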
Provider: nguyenbase



