heyyjudes/wildchat-en-low-sim-cluster
Source: Hugging Face. Updated: 2025-04-02. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/heyyjudes/wildchat-en-low-sim-cluster
Description:
The dataset has three fields: prompt (the prompt text), embedding (a vector embedding of the text), and cluster_name (a cluster label). The training split contains 219,134 examples, and the dataset totals approximately 1.17 GB. It appears to be intended for text clustering: prompt holds the text to be clustered, embedding holds its vector representation, and cluster_name likely labels the cluster each text belongs to. The README does not describe the specific application scenario or the dataset contents in detail.
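As a rough illustration of how the three fields could be used together, the sketch below groups rows by cluster_name and computes each cluster's size and mean embedding. It runs on hand-made toy rows rather than the real data. The helper name summarize_clusters, the toy values, and the assumption that embedding is a flat list of floats are all illustrative and not confirmed by the README. The commented-out lines show how the real rows might be obtained with the Hugging Face datasets library, assuming the split is named "train".

```python
from collections import defaultdict

# To try this on the real data (assumption: the split is named "train"):
#   from datasets import load_dataset
#   rows = load_dataset("heyyjudes/wildchat-en-low-sim-cluster", split="train")


def summarize_clusters(rows):
    """Return {cluster_name: (count, centroid)} for rows with the three fields.

    Hypothetical helper: assumes each row's "embedding" is a flat list of
    floats and that all embeddings share the same length.
    """
    sums = {}
    counts = defaultdict(int)
    for row in rows:
        name = row["cluster_name"]
        vec = row["embedding"]
        if name not in sums:
            sums[name] = [0.0] * len(vec)
        # Accumulate the element-wise sum of embeddings for this cluster.
        sums[name] = [s + v for s, v in zip(sums[name], vec)]
        counts[name] += 1
    # Divide each sum by the cluster size to get the mean embedding.
    return {
        name: (counts[name], [s / counts[name] for s in total])
        for name, total in sums.items()
    }


# Toy rows (made up for illustration; not taken from the dataset).
toy_rows = [
    {"prompt": "How do I bake bread?", "embedding": [1.0, 0.0], "cluster_name": "cooking"},
    {"prompt": "Best way to boil eggs", "embedding": [0.0, 1.0], "cluster_name": "cooking"},
    {"prompt": "Fix my Python loop", "embedding": [2.0, 4.0], "cluster_name": "coding"},
]

summary = summarize_clusters(toy_rows)
for name, (count, centroid) in sorted(summary.items()):
    print(name, count, centroid)
```

Because the full dataset is about 1.17 GB, passing streaming=True to load_dataset may be preferable when only a single pass over the rows is needed.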
Provider:
heyyjudes



