
simonko912/chan-shitpost-2.5

Source: Hugging Face. Updated 2026-03-15; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/simonko912/chan-shitpost-2.5
Description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- chan,
- 4chan
- shitpost
pretty_name: Chan shitpost 2.5
size_categories:
- 1M<n<10M
---

This dataset is a scrape of `~1.7 million` messages. For smaller-scale use, see version 2 (`~470k`) or version 1.5 (`~170k`).

The dataset was built from 18 different sources.

List of scraped sites for this version of Chan Shitpost:

| Site | Rows |
|------|------|
| a.4cdn.org | 993,096 |
| lolcow.farm | 385,125 |
| leftypol.org | 153,127 |
| lainchan.org | 62,550 |
| crystal.cafe | 38,943 |
| news.ycombinator.com | 38,401 |
| wizchan.org | 26,787 |
| 39chan.moe | 25,133 |
| 8kun.top | 15,648 |
| 4plebs.org | 8,025 |
| beehaw.org | 7,182 |
| whitequark-irc | 4,860 |
| ponychan.co | 2,124 |
| autismchan.net | 607 |
| awsumchan.org | 542 |
| t.me | 354 |
| mechachan.net | 145 |
| 112chan.ro | 54 |
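
As a quick sanity check, the per-site row counts listed above can be totaled and compared with the stated `~1.7 million` figure. The sketch below only restates the numbers from the table; the names `ROWS_PER_SITE`, `total_rows`, and `largest_site` are illustrative and not part of the dataset or any library.

```python
# Per-site row counts, copied from the table in the dataset card.
ROWS_PER_SITE = {
    "a.4cdn.org": 993_096,
    "lolcow.farm": 385_125,
    "leftypol.org": 153_127,
    "lainchan.org": 62_550,
    "crystal.cafe": 38_943,
    "news.ycombinator.com": 38_401,
    "wizchan.org": 26_787,
    "39chan.moe": 25_133,
    "8kun.top": 15_648,
    "4plebs.org": 8_025,
    "beehaw.org": 7_182,
    "whitequark-irc": 4_860,
    "ponychan.co": 2_124,
    "autismchan.net": 607,
    "awsumchan.org": 542,
    "t.me": 354,
    "mechachan.net": 145,
    "112chan.ro": 54,
}

# Sum every source to get the overall message count.
total_rows = sum(ROWS_PER_SITE.values())

# Identify the single largest source and its share of the total.
largest_site = max(ROWS_PER_SITE, key=ROWS_PER_SITE.get)
largest_share = ROWS_PER_SITE[largest_site] / total_rows

print(f"sources: {len(ROWS_PER_SITE)}")
print(f"total rows: {total_rows:,}")  # 1,762,703
print(f"largest source: {largest_site} ({largest_share:.1%})")
```

The total comes to 1,762,703 rows across 18 sources, which is consistent with both the `~1.7 million` description and the `1M<n<10M` size category.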
Provider:
simonko912