withalim/bluesky-posts
Source: Hugging Face. Updated: 2024-12-01. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/withalim/bluesky-posts
Description:
This dataset contains 8 million public posts collected from the Bluesky Social platform between November 27 and December 1, 2024; a further 12 million posts are expected in the coming weeks. The data is stored in JSONL format, and each post entry includes a unique identifier, creation timestamp, content, author information, and additional metadata. Files are organized chronologically, are approximately 140MB each, and are named in the format posts_[DATE]_[TIME].jsonl. The dataset is suitable for social media content analysis, language processing research, trend analysis, content recommendation systems, and social network analysis. It is released under the MIT License.
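Because the files are JSONL (one JSON object per line), they can be streamed one line at a time rather than loaded whole. The following is a minimal sketch, not the dataset author's tooling: the listing does not give the actual field names, so `created_at` is a hypothetical key standing in for the creation timestamp, and the timestamp is assumed to be an ISO 8601 string. Check a real file before relying on either assumption.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_posts(path: str) -> Iterator[dict]:
    """Yield one post (a dict) per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def posts_per_day(posts: Iterable[dict], ts_field: str = "created_at") -> Counter:
    """Count posts per calendar day.

    ASSUMPTION: `ts_field` is a hypothetical field name, and its value is
    an ISO 8601 string whose first 10 characters are YYYY-MM-DD.
    Posts without such a field are ignored.
    """
    counts: Counter = Counter()
    for post in posts:
        ts = post.get(ts_field)
        if isinstance(ts, str) and len(ts) >= 10:
            counts[ts[:10]] += 1
    return counts
```

For a downloaded file, `posts_per_day(iter_posts(path))` processes posts lazily, so memory use stays small even for files of roughly 140MB.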
Provider:
withalim



