StephanAkkerman/fintwit-images
Source: Hugging Face. Updated 2024-05-10; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/StephanAkkerman/fintwit-images
Resource description:
---
language:
- en
license: mit
task_categories:
- image-classification
- image-feature-extraction
pretty_name: FinTwit Images
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': charts
          '1': non-charts
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 320924227.116
    num_examples: 4177
  download_size: 444164916
  dataset_size: 320924227.116
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- fintwit
- twitter
- charts
- financial
- financial charts
- finance
- stocks
- crypto
- image
---
## FinTwit Images
This dataset is a sample of images from tweets that I scraped with my [Discord bot](https://github.com/StephanAkkerman/fintwit-bot), which keeps track of financial influencers on Twitter.
The images come from tweets that did not mention a ticker.
The dataset is suited to tasks such as image classification and image feature extraction.
### FinTwit Charts Collection
This dataset is part of a larger collection of datasets, scraped from Twitter and labeled by a human (me). Below is the list of related datasets.
- [Crypto Charts](https://huggingface.co/datasets/StephanAkkerman/crypto-charts): Images of financial charts of cryptocurrencies
- [Stock Charts](https://huggingface.co/datasets/StephanAkkerman/stock-charts): Images of financial charts of stocks
- [FinTwit Images](https://huggingface.co/datasets/StephanAkkerman/fintwit-images): Images that had no clear description; this dataset contains a lot of non-chart images
## Dataset Structure
Each image in the dataset is structured as follows:
- **Image**: The image from the tweet. Dimensions vary.
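- **Label**: A numerical label indicating the category of the image, with '0' for charts and '1' for non-charts, matching the `class_label` names in the dataset metadata.
- **ID**: A string identifier for the image.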
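
The record layout above can be illustrated with a minimal sketch. It assumes the label encoding declared under `class_label` in the dataset metadata ('0' for charts, '1' for non-charts). The helper names `label_name`, `summarize_example`, and `load_fintwit_images` are hypothetical, and the sample record is made up for illustration. Loading goes through the `datasets` library, which needs to be installed separately and requires network access, so the loader is only defined here and not called.

```python
# Label names as declared under `class_label` in the dataset metadata.
LABEL_NAMES = {0: "charts", 1: "non-charts"}


def label_name(label: int) -> str:
    """Map an integer class label to its name."""
    return LABEL_NAMES[label]


def summarize_example(example: dict) -> str:
    """Return a one-line description of a record with `id` and `label` keys."""
    return f"{example['id']}: {label_name(example['label'])}"


def load_fintwit_images():
    """Load the train split from the Hugging Face Hub.

    Requires `pip install datasets` and network access, so it is not
    called at import time.
    """
    from datasets import load_dataset

    return load_dataset("StephanAkkerman/fintwit-images", split="train")


# Hypothetical record shaped like the features in the metadata
# (the image field is omitted here).
example = {"id": "example-id", "label": 0}
print(summarize_example(example))  # prints "example-id: charts"
```

On a loaded dataset, `ds.features["label"].int2str(0)` should return the same name without a hand-written mapping.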
## Dataset Size
The dataset comprises 4,579 images in total, categorized into:
- 1,083 chart images
- 3,496 non-chart images
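
The class balance implied by these counts can be worked out with a short sketch. The figures are the ones stated in this section, not recomputed from the data, and `class_distribution` is a hypothetical helper. Note that the dataset metadata lists 4,177 examples in the `train` split, so running the same function on the loaded labels may give different totals.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, share)} for an iterable of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Counts as stated in this section, not recomputed from the data.
stated = ["charts"] * 1083 + ["non-charts"] * 3496
dist = class_distribution(stated)

print(len(stated))  # 4579
for label, (n, share) in dist.items():
    print(f"{label}: {n} ({share:.1%})")
```

Roughly one image in four is a chart, so a classifier trained on this data may benefit from class weighting or stratified sampling.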
## Usage
I used this dataset to train my [chart-recognizer model](https://huggingface.co/StephanAkkerman/chart-recognizer), which classifies whether an image is a chart.
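
The metadata defines only a `train` split, so anyone training a classifier on this data has to carve out a held-out set themselves. The following is a minimal sketch of a stratified split over label indices. It is not the procedure used for the chart-recognizer model; `stratified_split`, its parameters, and the toy labels are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test lists, preserving class proportions."""
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        # Shuffle within each class, then take the same fraction from each.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = [0] * 20 + [1] * 80  # toy labels, not the real data
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
print(len(train_idx), len(test_idx))  # 80 20
```

When working with the `datasets` library directly, `Dataset.train_test_split` with its `stratify_by_column` argument offers a similar split without custom code.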
## Acknowledgments
We extend our heartfelt gratitude to all the authors of the original tweets.
## License
This dataset is made available under the MIT license, adhering to the licensing terms of the original datasets.
Provider:
StephanAkkerman
Summary of original information
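FinTwit Images dataset overview
Basic information
- Language: English
- License: MIT
- Task categories: image classification, image feature extraction
- Pretty name: FinTwit Images
Dataset features
- Features:
  - image: image data
  - label: class label; the classes are charts and non-charts
  - id: string
Dataset splits
- train:
  - Bytes: 320924227.116
  - Examples: 4177
Dataset size
- Download size: 444164916
- Dataset size: 320924227.116
Configuration
- default:
  - Data files:
    - split: train
    - path: data/train-*
Tags
- tags:
  - fintwit
  - charts
  - financial
  - financial charts
  - finance
  - stocks
  - crypto
  - image
Dataset structure
- Image: the image from the tweet; dimensions vary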
- Label: numerical label, with 0 for charts and 1 for non-charts, matching the class_label names in the dataset metadata
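Dataset size
- Total images: 4579
- Chart images: 1083
- Non-chart images: 3496
Usage
- Used to train the chart-recognizer model, which classifies whether an image is a chart.
License
- The dataset is released under the MIT license.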



