BaoLocTown/reddit-clustering-p2p-exploded-test-vn
Source: Hugging Face | Updated: 2024-10-30 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/BaoLocTown/reddit-clustering-p2p-exploded-test-vn
Description:
The dataset consists of 10 clusters (cluster_0 to cluster_9), each with three features: sentences, labels, and og_sentences (the original sentences). Only a training split (train) is provided, and each cluster's training split differs in byte size and number of examples:
cluster_0: 2048 examples
cluster_1: 79172 examples
cluster_2: 1942 examples
cluster_3: 13224 examples
cluster_4: 92303 examples
cluster_5: 28607 examples
cluster_6: 69146 examples
cluster_7: 67469 examples
cluster_8: 29683 examples
cluster_9: 62261 examples
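As a rough illustration of the structure described above, the following Python sketch records the stated per-cluster example counts and derives their total. The `load_cluster` helper is hypothetical: it assumes each cluster is exposed as a configuration named `cluster_<n>` that the Hugging Face `datasets` library can load with a `train` split, which the description suggests but does not confirm.

```python
# Per-cluster example counts, copied from the dataset description.
CLUSTER_SIZES = {
    "cluster_0": 2048,
    "cluster_1": 79172,
    "cluster_2": 1942,
    "cluster_3": 13224,
    "cluster_4": 92303,
    "cluster_5": 28607,
    "cluster_6": 69146,
    "cluster_7": 67469,
    "cluster_8": 29683,
    "cluster_9": 62261,
}

REPO_ID = "BaoLocTown/reddit-clustering-p2p-exploded-test-vn"


def total_examples(sizes=CLUSTER_SIZES):
    """Sum the example counts over all clusters."""
    return sum(sizes.values())


def load_cluster(index):
    """Load one cluster's train split (hypothetical helper).

    Assumption: each cluster is a dataset configuration named
    cluster_<index>. Requires the `datasets` package and network access.
    """
    name = f"cluster_{index}"
    if name not in CLUSTER_SIZES:
        raise ValueError(f"unknown cluster: {name}")
    from datasets import load_dataset  # imported lazily; optional dependency
    return load_dataset(REPO_ID, name, split="train")


if __name__ == "__main__":
    print("total examples:", total_examples())
    largest = max(CLUSTER_SIZES, key=CLUSTER_SIZES.get)
    smallest = min(CLUSTER_SIZES, key=CLUSTER_SIZES.get)
    print("largest:", largest, "smallest:", smallest)
```

If the configuration names match this assumption, `load_cluster(0)` should return the smallest power-of-two-sized cluster with the columns sentences, labels, and og_sentences. The cluster sizes are very uneven, so any per-cluster evaluation would probably need to report results per cluster rather than as a single pooled figure.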
Provider:
BaoLocTown



