BaoLocTown/reddit-clustering-p2p-exploded-test-vn

Hugging Face | Updated 2024-10-30 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/BaoLocTown/reddit-clustering-p2p-exploded-test-vn
Description:
The dataset consists of 10 clusters (cluster_0 to cluster_9), each with three features: sentences, labels, and og_sentences (the original sentences). Only a training split (train) is provided, and each cluster's training split differs in byte size and number of examples:

cluster_0: 2048 examples
cluster_1: 79172 examples
cluster_2: 1942 examples
cluster_3: 13224 examples
cluster_4: 92303 examples
cluster_5: 28607 examples
cluster_6: 69146 examples
cluster_7: 67469 examples
cluster_8: 29683 examples
cluster_9: 62261 examples
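
The cluster layout described above can be sketched in Python. This is a minimal illustration, not an official loader: the example counts are copied from the description, while the helper names (`total_examples`, `largest_cluster`, `load_cluster`) are hypothetical, and the loader assumes each cluster is exposed as a configuration named `cluster_<n>` with a `train` split, which the listing does not confirm.

```python
"""Sketch: per-cluster example counts and an optional loader (assumptions noted)."""

REPO_ID = "BaoLocTown/reddit-clustering-p2p-exploded-test-vn"

# Example counts per cluster, copied from the dataset description.
EXAMPLE_COUNTS = {
    "cluster_0": 2048,
    "cluster_1": 79172,
    "cluster_2": 1942,
    "cluster_3": 13224,
    "cluster_4": 92303,
    "cluster_5": 28607,
    "cluster_6": 69146,
    "cluster_7": 67469,
    "cluster_8": 29683,
    "cluster_9": 62261,
}


def total_examples(counts=EXAMPLE_COUNTS):
    """Sum the example counts across all clusters."""
    return sum(counts.values())


def largest_cluster(counts=EXAMPLE_COUNTS):
    """Return the name of the cluster with the most examples."""
    return max(counts, key=counts.get)


def load_cluster(index):
    """Load the train split of one cluster.

    Assumption: each cluster is a dataset configuration named
    ``cluster_<index>``; adjust the name if the repository is laid out
    differently.
    """
    name = f"cluster_{index}"
    if name not in EXAMPLE_COUNTS:
        raise ValueError(f"unknown cluster: {name}")
    # Imported lazily so the helpers above work without the library
    # installed (requires `pip install datasets` and network access).
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split="train")


if __name__ == "__main__":
    print(total_examples())
    print(largest_cluster())
```

The listed counts sum to 445855 examples, with cluster_4 the largest and cluster_2 the smallest. If the configuration-name assumption holds, `load_cluster(2)` would be the cheapest way to inspect the sentences, labels, and og_sentences columns before downloading a larger cluster.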
Provider:
BaoLocTown