edarchimbaud/short-interest-stocks
Hugging Face · Updated 2023-11-11 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/edarchimbaud/short-interest-stocks
Resource description:
---
language:
- en
license: mit
task_categories:
- tabular-regression
dataset_info:
  features:
  - name: symbol
    dtype: string
  - name: date
    dtype: string
  - name: id
    dtype: int64
  - name: settlement_date
    dtype: timestamp[ns]
  - name: interest
    dtype: float64
  - name: avg_daily_share_volume
    dtype: float64
  - name: days_to_cover
    dtype: float64
  splits:
  - name: train
    num_bytes: 8920052
    num_examples: 143902
  download_size: 1015695
  dataset_size: 8920052
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Dataset Card for "short-interest-sp500"
## Table of Contents
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)
## Dataset Description
- **Homepage:** https://edarchimbaud.substack.com
- **Repository:** https://github.com/edarchimbaud
- **Point of Contact:** contact@edarchimbaud.com
### Dataset Summary
The short-interest-sp500 dataset provides short interest data for companies in the S&P 500 index. Short interest is the number of shares that have been sold short but not yet covered or closed out.
### Supported Tasks and Leaderboards
[N/A]
### Languages
[N/A]
## Dataset Structure
### Data Instances
[N/A]
### Data Fields
- `symbol` (string): The ticker symbol used to identify the company.
- `date` (string): The date on which the data was collected.
- `id` (int64): A unique integer identifier for each data instance.
- `settlement_date` (timestamp[ns]): The settlement date, i.e. the date by which a buyer must pay for the securities delivered by the seller.
- `interest` (float64): The short interest of the company on the specified date, i.e. the number of shares sold short and not yet covered.
- `avg_daily_share_volume` (float64): The company's average daily trading volume, in shares.
- `days_to_cover` (float64): The days-to-cover metric, which expresses short interest as a number of days of average daily trading volume.
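
To make the relationship between the last three fields concrete, here is a minimal sketch of the conventional days-to-cover calculation (short interest divided by average daily share volume). Whether the `days_to_cover` column in this dataset was computed in exactly this way is an assumption; the function name and the sample values are hypothetical and not taken from the data.

```python
from typing import Optional


def days_to_cover(interest: float, avg_daily_share_volume: float) -> Optional[float]:
    """Express short interest as a number of days of average daily volume.

    Assumes the conventional definition: interest / avg_daily_share_volume.
    Returns None when the volume is zero or negative, where the ratio is undefined.
    """
    if avg_daily_share_volume <= 0:
        return None
    return interest / avg_daily_share_volume


# Made-up values for illustration only; not taken from the dataset.
example_row = {
    "symbol": "XYZ",
    "interest": 3_000_000.0,
    "avg_daily_share_volume": 1_500_000.0,
}
ratio = days_to_cover(example_row["interest"], example_row["avg_daily_share_volume"])
print(ratio)  # 2.0
```

A value of 2.0 here would mean that covering all short positions would take about two days of average trading volume.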
### Data Splits
The dataset has a single `train` split with 143,902 examples (8,920,052 bytes).
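
The card metadata declares one `train` split stored under `data/train-*`. A minimal sketch of loading it with the Hugging Face `datasets` library might look like the following. The helper name `load_short_interest` is hypothetical, and the repository ID is taken from the page header rather than from the card title, so treat it as an assumption.

```python
# Repository ID as shown in the page header (assumed to be the Hub location).
REPO_ID = "edarchimbaud/short-interest-stocks"


def load_short_interest(split: str = "train"):
    """Load one split of the dataset from the Hugging Face Hub.

    Requires network access and the `datasets` package.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_short_interest()` should return a `Dataset` object whose columns correspond to the fields listed under Data Fields.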
## Dataset Creation
### Curation Rationale
The short-interest-sp500 dataset was created to facilitate the study of market dynamics, particularly the role of short selling.
### Source Data
#### Initial Data Collection and Normalization
The dataset was compiled from publicly available sources.
### Annotations
#### Annotation process
[N/A]
#### Who are the annotators?
[N/A]
### Personal and Sensitive Information
[N/A]
## Considerations for Using the Data
### Social Impact of Dataset
[N/A]
### Discussion of Biases
[N/A]
### Other Known Limitations
[N/A]
## Additional Information
### Dataset Curators
The short-interest-sp500 dataset was collected by https://edarchimbaud.substack.com.
### Licensing Information
The short-interest-sp500 dataset is licensed under the MIT License.
### Citation Information
> https://edarchimbaud.substack.com, short-interest-sp500 dataset, GitHub repository, https://github.com/edarchimbaud
### Contributions
Thanks to [@edarchimbaud](https://github.com/edarchimbaud) for adding this dataset.
Provider:
edarchimbaud
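Summary of original information
Dataset overview
Dataset name
"short-interest-sp500"
Dataset description
This dataset provides short interest data for companies in the S&P 500 index, including the number of shares that have been sold short but not yet covered or closed out.
Dataset structure
Data fields
- symbol (string): The company's ticker symbol or abbreviation.
- date (string): The date on which the data was collected.
- id (integer): A unique integer identifier for each data instance.
- settlement_date (timestamp[ns]): The date by which a buyer must pay for the securities delivered by the seller.
- interest (float): The short interest of the company on the specified date.
- avg_daily_share_volume (float): The company's average daily trading volume.
- days_to_cover (float): The days-to-cover metric for the short interest.
Data splits
- train (training set): Contains 143,902 instances, with a total size of 8,920,052 bytes.
Dataset creation
Curation rationale
The dataset is intended to facilitate the study of market dynamics, particularly the role of short selling.
Source data
The dataset was compiled from publicly available sources.
Licensing information
The dataset is licensed under the MIT License.
Citation information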
https://edarchimbaud.substack.com, short-interest-sp500 dataset, GitHub repository, https://github.com/edarchimbaud



