mtl-dev/semantic-sim-unlabeled-shuffled-batchs-1000-2000
Source: Hugging Face. Updated 2024-10-15. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/mtl-dev/semantic-sim-unlabeled-shuffled-batchs-1000-2000
Description:
The dataset has four features: text (string), similarity_score (float64), label (sequence of strings), and report_name (string). It is split into train, test, and validation sets containing 1,600,000, 200,000, and 200,000 samples respectively. The download size is 212,176,637 bytes and the total size is 495,109,940 bytes.
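The figures in the description can be turned into a quick sanity check. The sketch below is illustrative only: it uses nothing but the numbers stated above, does not download the dataset, and every name in it (`SPLIT_SIZES`, `FEATURES`, `split_fractions`, and so on) is a hypothetical helper, not part of any dataset API. It derives the split proportions, the average stored size per sample, and the ratio of download size to total size.

```python
# Figures copied from the description; nothing here touches the network.
SPLIT_SIZES = {"train": 1_600_000, "test": 200_000, "validation": 200_000}
DOWNLOAD_BYTES = 212_176_637
TOTAL_BYTES = 495_109_940

# Feature name -> declared data type, as listed in the description.
FEATURES = {
    "text": "string",
    "similarity_score": "float64",
    "label": "sequence of strings",
    "report_name": "string",
}


def split_fractions(sizes):
    """Return each split's share of the total sample count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_samples = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

# Average stored bytes per sample across all splits.
bytes_per_sample = TOTAL_BYTES / total_samples

# Fraction of the total size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / TOTAL_BYTES

if __name__ == "__main__":
    print(f"total samples: {total_samples:,}")
    for name, share in fractions.items():
        print(f"{name}: {share:.0%}")
    print(f"average bytes per sample: {bytes_per_sample:.1f}")
    print(f"download / total size: {download_ratio:.1%}")
```

Under these figures the splits follow an 80/10/10 partition of 2,000,000 samples, and the download is a bit under half of the total size.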
Provider:
mtl-dev



