koen430/random_selected_stock_mix

Name: koen430/random_selected_stock_mix
Creator: koen430
Published: 2024-05-20 10:43:26
License: No description available

Hugging Face · Updated 2024-05-20 · Indexed 2024-06-12

Download link:

https://hf-mirror.com/datasets/koen430/random_selected_stock_mix

Resource description:

---
dataset_info:
  features:
  - name: ticker
    dtype: string
  - name: prompt
    dtype: string
  - name: text
    dtype: string
  - name: url
    dtype: string
  - name: result_1
    dtype: string
  - name: result_1_bin
    dtype: int64
  - name: relevance
    dtype: string
  - name: token_count
    dtype: int64
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 12167785
    num_examples: 3600
  - name: val
    num_bytes: 708256
    num_examples: 200
  - name: test
    num_bytes: 698513
    num_examples: 200
  download_size: 7170454
  dataset_size: 13574554
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: val
    path: data/val-*
  - split: test
    path: data/test-*
---

## INFO

A random selection of news articles and tweets for fine-tuning an LLM to predict stock price movement on the day after the news publication or tweet.

Sources: koen430/preprocessed_stock_news and koen430/preprocessed_stock_twitter

More info will follow soon.
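The card above names three splits under a single default configuration. A minimal sketch of loading them with the Hugging Face `datasets` library might look like the following. The helper name `load_splits` is an assumption introduced here for illustration, and the call needs `pip install datasets` plus network access, so treat it as a starting point rather than a verified recipe.

```python
REPO_ID = "koen430/random_selected_stock_mix"
SPLITS = ("train", "val", "test")  # split names as listed in the card


def load_splits(repo_id: str = REPO_ID) -> dict:
    """Return a dict mapping each split name to its loaded Dataset.

    Hypothetical helper: requires the `datasets` package and network access.
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return {name: load_dataset(repo_id, split=name) for name in SPLITS}
```

Calling `load_splits()["train"].column_names` should then list the nine feature names from the card, assuming the hosted files still match this metadata.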

Provided by:

koen430

Summary of original information

Dataset overview

Dataset features

ticker: string
prompt: string
text: string
url: string
result_1: string
result_1_bin: integer (int64)
relevance: string
token_count: integer (int64)
__index_level_0__: integer (int64)
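The feature list above can be turned into a small schema check, which may help when inspecting rows after download. This is a sketch only: `EXPECTED_SCHEMA` and `schema_errors` are names introduced here, the mapping of `string` to `str` and `int64` to `int` is an assumption about how rows appear in Python, and any record passed in is the caller's own data.

```python
# Column names and dtypes as listed in the dataset card.
EXPECTED_SCHEMA = {
    "ticker": str,
    "prompt": str,
    "text": str,
    "url": str,
    "result_1": str,
    "result_1_bin": int,
    "relevance": str,
    "token_count": int,
    "__index_level_0__": int,
}


def schema_errors(record: dict) -> list[str]:
    """Return a list of human-readable schema problems (empty if none)."""
    errors = []
    for column, expected in EXPECTED_SCHEMA.items():
        if column not in record:
            errors.append(f"missing column: {column}")
            continue
        value = record[column]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
    for column in sorted(record.keys() - EXPECTED_SCHEMA.keys()):
        errors.append(f"unexpected column: {column}")
    return errors
```

An empty return value means the record has exactly the nine listed columns with the expected Python types.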

Dataset splits

train: 3600 examples, 12167785 bytes
val: 200 examples, 708256 bytes
test: 200 examples, 698513 bytes

Dataset size

Download size: 7170454 bytes
Total dataset size: 13574554 bytes
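The split and size figures above are internally consistent, which the short check below illustrates: the three split byte counts sum to the reported total dataset size, and the split proportions work out to 90/5/5. All numbers are copied from the listing; the variable names are introduced here for illustration.

```python
# Figures copied from the split and size listings.
SPLIT_STATS = {
    "train": {"num_examples": 3600, "num_bytes": 12167785},
    "val": {"num_examples": 200, "num_bytes": 708256},
    "test": {"num_examples": 200, "num_bytes": 698513},
}
DATASET_SIZE = 13574554   # total dataset size in bytes
DOWNLOAD_SIZE = 7170454   # download size in bytes

total_examples = sum(s["num_examples"] for s in SPLIT_STATS.values())
total_bytes = sum(s["num_bytes"] for s in SPLIT_STATS.values())

# Share of examples in each split.
fractions = {
    name: s["num_examples"] / total_examples for name, s in SPLIT_STATS.items()
}

# Ratio of download size to in-memory dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(total_examples, total_bytes == DATASET_SIZE, fractions)
```

The download is a little over half the size of the materialized dataset, which is what one would expect if the hosted files are stored in a compressed format.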

Configuration

config_name: default
data_files:
- train: data/train-*
- val: data/val-*
- test: data/test-*
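The `data_files` entries above are glob patterns that assign files to splits by prefix. The sketch below illustrates that mapping with Python's standard `fnmatch`. It is an approximation for illustration, not how the Hugging Face Hub resolves patterns internally, and the helper name `split_for` and any file names passed to it are hypothetical.

```python
from fnmatch import fnmatch

# Patterns copied from the default configuration.
DATA_FILES = {
    "train": "data/train-*",
    "val": "data/val-*",
    "test": "data/test-*",
}


def split_for(path: str):
    """Return the split whose pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None
```

For instance, a hypothetical file named `data/val-00000-of-00001.parquet` would be assigned to `val`, while a file outside `data/` would match no split.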
