mlfoundations-dev/oh_v1.2_sin_evol_instruct_diversity
Source: Hugging Face. Updated 2024-11-27; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/mlfoundations-dev/oh_v1.2_sin_evol_instruct_diversity
Description:
The dataset contains the following fields: conversations, shard_id, output, ngram_3_uniqueness (3-gram uniqueness), entropy, gini_index (Gini index), self_bleu (self-BLEU), embeddings, and kmeans_inertia_embeddings (k-means inertia of embeddings). It provides a train split of 831,686 examples, totaling 12,646,458,683 bytes (about 12.6 GB).
Provider:
mlfoundations-dev
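
The listing names several diversity fields (ngram_3_uniqueness, entropy, gini_index) without saying how they are computed. The sketch below shows one common way such statistics can be defined over a token sequence. It is an illustration under stated assumptions, not the dataset authors' method: whitespace tokenization, entropy measured in bits, and "Gini index" read as Gini impurity are all assumptions, and the function names are hypothetical. self_bleu and the embedding-based fields are not covered.

```python
import math
from collections import Counter


def ngram_uniqueness(tokens, n=3):
    """Fraction of n-grams that are distinct (1.0 means no n-gram repeats)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n: no n-grams to compare
    return len(set(ngrams)) / len(ngrams)


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical token distribution."""
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def gini_index(tokens):
    """Gini impurity, 1 - sum(p_i^2), of the empirical token distribution.

    Assumption: the field could instead be a Gini coefficient (inequality
    measure); the listing does not say which.
    """
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Toy input; a real pipeline would likely use a model tokenizer.
    tokens = "the cat sat on the mat".split()
    print(ngram_uniqueness(tokens), shannon_entropy(tokens), gini_index(tokens))
```

Under these definitions, higher values of all three statistics indicate more varied text; whether the dataset's stored values follow the same convention would need to be checked against the original dataset card.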



