
RSS3-Network/high_quality_open_web_content

Source: Hugging Face. Updated 2024-10-09; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/RSS3-Network/high_quality_open_web_content
Overview:
This dataset is curated by the RSS3 Network and contains a large collection of content from multiple decentralized platforms, such as Farcaster and Lens. RSS3 Nodes have structured and indexed the data so that it is easy to access and analyze. Projects in the RSS3 ecosystem have already used the dataset to build applications and services, including fine-tuning large language models and training recommendation systems.

Each record contains the following fields: the author's handle on the corresponding platform (handle), the text content of the post (body), a list of media objects associated with the post (media, each with a media address and a MIME type), the author's profile ID on the corresponding platform (profile_id), the post's publication ID on the corresponding platform (publication_id), and the post's timestamp (timestamp). The dataset is divided into multiple batches, each with its own number of examples and size in bytes.

The dataset is licensed under CC0 1.0, which allows users to distribute, remix, adapt, and build upon the material in any medium or format, even for commercial purposes.
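As a rough illustration of the record layout described above, the fields could be modelled as follows. This is a minimal sketch, not the dataset's official schema: the key names inside each media object (`address`, `mime_type`), the choice to keep IDs and the timestamp as strings, and the sample record are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Media:
    address: str    # assumed key name for the media address
    mime_type: str  # assumed key name for the MIME type


@dataclass
class Post:
    handle: str          # author's handle on the source platform
    body: str            # text content of the post
    profile_id: str      # author's profile ID on the source platform
    publication_id: str  # post's publication ID on the source platform
    timestamp: str       # kept as a raw string; the real type is not stated
    media: list[Media] = field(default_factory=list)


def parse_post(record: dict[str, Any]) -> Post:
    """Build a Post from one raw record, tolerating a missing or null media list."""
    media = [
        Media(address=m["address"], mime_type=m["mime_type"])
        for m in (record.get("media") or [])
    ]
    return Post(
        handle=record["handle"],
        body=record["body"],
        profile_id=str(record["profile_id"]),
        publication_id=str(record["publication_id"]),
        timestamp=str(record["timestamp"]),
        media=media,
    )


# Hypothetical record, invented purely for illustration.
sample = {
    "handle": "alice.example",
    "body": "Hello, open web!",
    "media": [{"address": "https://example.com/a.png", "mime_type": "image/png"}],
    "profile_id": "0x01",
    "publication_id": "0x01-0x02",
    "timestamp": "2024-01-01T00:00:00Z",
}

post = parse_post(sample)
# Collect the addresses of image attachments only.
image_addresses = [m.address for m in post.media if m.mime_type.startswith("image/")]
```

Typed records like this make it straightforward to filter posts by media type or platform identifier before using them for fine-tuning or recommendation experiments.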
Provider:
RSS3-Network