CryptoVision: A Comprehensive Dataset for Crypto News and Trend Prediction
Source: NIAID Data Ecosystem. Indexed: 2026-05-10
Download link:
https://data.mendeley.com/datasets/wvjjxr8bxx
Description:
This dataset provides a large-scale collection of 188,431 cryptocurrency news articles published between 2017 and 2025. The articles were gathered from reputable sources, including BlockWorks, Coindesk, Cointelegraph, CryptoPanic, CryptoNews, and Decrypt, using Python-based web scraping pipelines. Each entry contains metadata including the article title, full-text content, publication source, URL, sentiment label, and the percentage change in the corresponding cryptocurrency’s price around the publication time.
To enhance research utility, sentiment labels were automatically generated using fine-tuned transformer models (BERT and DistilBERT), with additional manual checks for validation. Articles are categorized into three sentiment classes: positive, negative, and neutral.
This dataset is particularly valuable for machine learning and financial research, including:
1. Training and evaluating sentiment analysis models tailored to financial text.
2. Developing cryptocurrency price prediction models by linking sentiment with historical price and volume data.
3. Investigating the correlation between news sentiment and market trends across multiple cryptocurrencies.
4. Exploring the potential of sentiment-driven trading strategies in highly volatile markets.
The dataset’s scale, diversity, and multi-year coverage (2017 to 2025) make it a unique resource for researchers in computer science, finance, and data science, especially those working on natural language processing (NLP), predictive modeling, and market forecasting.
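As a first look at the sentiment and price relationship mentioned in the use cases above, the following is a minimal sketch that averages the price change per sentiment class. The field names `sentiment` and `price_change_pct` are assumptions, not the dataset's documented column headers, and the sample rows are synthetic values made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# The three sentiment classes named in the description.
SENTIMENTS = ("positive", "negative", "neutral")


def mean_change_by_sentiment(rows):
    """Return {sentiment: mean price change in percent} for dict-like rows.

    Assumes hypothetical keys "sentiment" and "price_change_pct";
    adjust them to the actual column names in the downloaded files.
    """
    buckets = defaultdict(list)
    for row in rows:
        label = (row.get("sentiment") or "").strip().lower()
        if label not in SENTIMENTS:
            continue  # skip rows with a missing or unexpected label
        try:
            buckets[label].append(float(row.get("price_change_pct")))
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric price change
    return {label: mean(values) for label, values in buckets.items()}


# Synthetic rows for illustration only; NOT taken from the dataset.
sample_rows = [
    {"sentiment": "positive", "price_change_pct": "2.0"},
    {"sentiment": "positive", "price_change_pct": "4.0"},
    {"sentiment": "negative", "price_change_pct": "-3.0"},
    {"sentiment": "neutral", "price_change_pct": "0.5"},
    {"sentiment": "neutral", "price_change_pct": ""},  # dropped: empty value
]

summary = mean_change_by_sentiment(sample_rows)
for label, value in summary.items():
    print(f"{label}: {value:+.2f}%")
```

If the data is distributed as CSV, the same function should accept the rows produced by `csv.DictReader` from the standard library, once the keys are changed to match the real headers. A per-class average is only a descriptive summary; it says nothing on its own about whether sentiment predicts later price movement.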
Created:
2025-09-29



