gsjcm/x_dataset_28
Hugging Face, updated 2025-10-12, indexed 2025-10-25
Download link:
https://hf-mirror.com/datasets/gsjcm/x_dataset_28
Description:
The Bittensor Subnet 13 X (Twitter) dataset is part of the Bittensor Subnet 13 decentralized network and contains preprocessed data from X (formerly Twitter). Network miners update the data continuously, providing a real-time stream of tweets for analytical and machine learning tasks such as sentiment analysis, trend detection, content analysis, and user behavior modeling. The data is primarily in English, but because it is collected in a decentralized way it may include other languages. Each record includes the fields text, label, tweet_hashtags, datetime, username_encoded, and url_encoded. Because the dataset is continuously updated, it has no fixed splits; users need to create their own splits based on their requirements and the data's timestamps. The data is collected from public tweets on X, in line with the platform's terms of service and API usage guidelines. All usernames and URLs are encoded to protect user privacy. The dataset is released under the MIT license and is subject to X's Terms of Use.
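Since the description leaves splitting to the user, the following is a minimal sketch of a timestamp-based split. It assumes the datetime field holds an ISO 8601 string, which the listing does not confirm; the helper names parse_timestamp and split_by_time and the sample rows are hypothetical and exist only for illustration.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string; naive values are treated as UTC (an assumption)."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def split_by_time(rows, cutoff: str):
    """Rows strictly before `cutoff` go to train; all others go to test."""
    cutoff_dt = parse_timestamp(cutoff)
    train, test = [], []
    for row in rows:
        target = train if parse_timestamp(row["datetime"]) < cutoff_dt else test
        target.append(row)
    return train, test


# Hypothetical sample rows that mimic two of the fields named in the description.
sample_rows = [
    {"text": "first", "datetime": "2025-01-05T10:00:00Z"},
    {"text": "second", "datetime": "2025-02-01T00:00:00Z"},
    {"text": "third", "datetime": "2025-03-15T08:30:00Z"},
]
train_rows, test_rows = split_by_time(sample_rows, "2025-02-01T00:00:00Z")
```

The same function should work on any iterable of dict-like records, for example rows loaded with the Hugging Face datasets library, provided the real timestamp format matches the assumption above. Splitting on a time cutoff rather than at random keeps later tweets out of the training set, which matters for a stream that keeps growing.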
Provider:
gsjcm



