
Reddit/Subreddit Dataset

Source: Snowflake. Updated 2023-02-15. Indexed 2024-05-01.
Download link:
https://app.snowflake.com/marketplace/listing/GZT1Z125KD1
Resource description:
# Reddit/Subreddit Dataset

Reddit is the world's leading forum community for every interest. The dataset tracks attributes of 2.1+ million subreddits, along with their daily subscriber counts since January 2023. Additionally, we leverage AI to append subreddit attributes that allow you to categorize and find subreddits by subject. The dataset is ideal for interest-based segmentation, trend analysis, and audience insights.

## ✨ Features

- Tracking 2M current and historical subreddit communities
- 60M+ records
- AI-appended attributes (subreddit categories, related search terms, and related subjects) that help you find related subreddit communities
- Subscriber counts dating back to January 2023

## 🎟️ Free Trial

The free trial lasts 7 days and includes subreddits with more than 100,000 followers.

## 🗄️ Tables Included in the Dataset

- **Subreddit Dimension** - One record per subreddit, with many attributes that describe the subreddit
- **Fact Subreddit Day** - One record per subreddit per day, tracking the number of subscribers
- **Date Dimension** - Date dimension table (one record per day)

## 💎 High-Value Attributes in the Dataset

- **subreddit_url** - Full URL of the subreddit
- **description** - Subreddit description
- **parsed words** - Words parsed from the subreddit name
- **derived categories** 🤖 - AI-appended array of high-level categories related to the subreddit community
- **derived search terms** 🤖 - AI-appended array of search terms related to the subreddit community
- **derived related subjects** 🤖 - AI-appended array of subjects related to the subreddit community
- **subreddit type** - Type of subreddit: Public or Private
- **language** - ISO language code
- **subreddit created date** - Timestamp of when the subreddit was created
- **subscriber count** - Total number of subscribers, tracked per day
- **allows images** - Boolean. True = allows images
- **allows discovery** - Boolean. True = allows discovery
- **over18** - Boolean. True = over-18 subreddit
- **whitelist status** - One of some_ads, no_ads, all_ads
- **advertiser category** - One of Automotive, Business/Finance, College/University, Entertainment, Family & Youth, Games, Health, Lifestyles, Local, Retail, Sports, Technology, Travel
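
The dimension and fact tables described above suggest a star-schema layout, so a typical trend query would join the per-day subscriber facts back to the subreddit dimension. The following is a minimal sketch of that idea using Python's built-in `sqlite3` module rather than Snowflake. The table names (`subreddit_dim`, `fact_subreddit_day`), the key columns (`subreddit_id`, `date_key`), the URLs, and all row values are hypothetical and chosen only for illustration; only `subreddit_url`, `over18`, and the subscriber count attribute come from the listing. The date dimension is omitted for brevity.

```python
import sqlite3

# Hypothetical schema: object and key names are assumptions for illustration,
# not the dataset's actual Snowflake table or column names.
SCHEMA = """
CREATE TABLE subreddit_dim (
    subreddit_id  INTEGER PRIMARY KEY,
    subreddit_url TEXT,
    over18        INTEGER            -- 1 = over-18 subreddit, 0 = not
);
CREATE TABLE fact_subreddit_day (
    subreddit_id     INTEGER REFERENCES subreddit_dim(subreddit_id),
    date_key         TEXT,           -- ISO date, one row per subreddit per day
    subscriber_count INTEGER
);
"""


def build_demo():
    """Create an in-memory database filled with made-up toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO subreddit_dim VALUES (?, ?, ?)",
        [
            (1, "https://www.reddit.com/r/example_a/", 0),
            (2, "https://www.reddit.com/r/example_b/", 0),
            (3, "https://www.reddit.com/r/example_c/", 1),
        ],
    )
    conn.executemany(
        "INSERT INTO fact_subreddit_day VALUES (?, ?, ?)",
        [
            (1, "2023-01-01", 1000), (1, "2023-01-02", 1050),
            (2, "2023-01-01", 200), (2, "2023-01-02", 205),
            (3, "2023-01-01", 300), (3, "2023-01-02", 400),
        ],
    )
    return conn


def subscriber_growth(conn, start, end):
    """Return (subreddit_url, growth) between two dates, largest growth first.

    Over-18 subreddits are excluded via the dimension's over18 flag.
    """
    query = """
        SELECT d.subreddit_url,
               e.subscriber_count - s.subscriber_count AS growth
        FROM subreddit_dim AS d
        JOIN fact_subreddit_day AS s
          ON s.subreddit_id = d.subreddit_id AND s.date_key = ?
        JOIN fact_subreddit_day AS e
          ON e.subreddit_id = d.subreddit_id AND e.date_key = ?
        WHERE d.over18 = 0
        ORDER BY growth DESC
    """
    return conn.execute(query, (start, end)).fetchall()


if __name__ == "__main__":
    for url, growth in subscriber_growth(build_demo(), "2023-01-01", "2023-01-02"):
        print(url, growth)
```

Against the real listing, the same join shape would presumably be written in Snowflake SQL with the provider's actual object names, which are not given here.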
Provider:
Dataplex Consulting & Data Products
Created:
2023-01-31
Collected summary
Dataset introduction
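Background and challenges
Background overview
The dataset covers the attributes and daily subscriber data of more than 2.1 million Reddit subreddits from January 2023 onward, comprising more than 60 million records. AI-appended attributes such as categories, related subjects, and search terms support interest-based segmentation, trend analysis, and audience-insight research.
The content above was collected and summarized by 遇见数据集.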