mlfoundations-dev/oh_v1.2_sin_alpaca_diversity

Source: Hugging Face. Updated 2024-12-01. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/mlfoundations-dev/oh_v1.2_sin_alpaca_diversity
Description:
The dataset's features include conversations, shard ID, output, ngram_3 uniqueness, entropy, Gini index, self-BLEU, embeddings, kmeans inertia embeddings, normalized kmeans inertia embeddings, projected gradients embeddings, new conversations, projected gradients, projected gradients Vendi, projected gradients log determinant, projected embeddings log determinant, kmeans inertia gradients, and normalized kmeans inertia gradients, among others. It provides a training split of 779,144 examples, with a dataset size of 12,284,670,540 bytes and a download size of 7,423,595,874 bytes. The default configuration specifies the data file paths.
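To put the quoted sizes in perspective, the following is a minimal Python sketch that converts them to GiB and derives the average size per example and the download-to-dataset ratio. The constants come straight from the description; the helper names `summarize_sizes` and `stream_first_rows` are hypothetical. The optional streaming helper assumes the Hugging Face `datasets` package is installed, that network access is available, and that the repository ID matches the page title; it is defined but not called here.

```python
# Figures quoted in the dataset description.
NUM_EXAMPLES = 779_144
DATASET_BYTES = 12_284_670_540
DOWNLOAD_BYTES = 7_423_595_874

GIB = 1024 ** 3  # bytes per gibibyte


def summarize_sizes(num_examples=NUM_EXAMPLES,
                    dataset_bytes=DATASET_BYTES,
                    download_bytes=DOWNLOAD_BYTES):
    """Return derived size statistics as a plain dict (hypothetical helper)."""
    return {
        "dataset_gib": dataset_bytes / GIB,
        "download_gib": download_bytes / GIB,
        "bytes_per_example": dataset_bytes / num_examples,
        "download_ratio": download_bytes / dataset_bytes,
    }


def stream_first_rows(n=3):
    """Fetch the first n training rows without downloading the full dataset.

    Assumption: `datasets` is installed and the Hub is reachable.
    Streaming avoids pulling the multi-gigabyte download up front.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    ds = load_dataset(
        "mlfoundations-dev/oh_v1.2_sin_alpaca_diversity",
        split="train",
        streaming=True,
    )
    return list(ds.take(n))


if __name__ == "__main__":
    for key, value in summarize_sizes().items():
        print(f"{key}: {value:,.2f}")
```

Under these figures the dataset occupies roughly 11.4 GiB once prepared, the download is roughly 6.9 GiB, and an average example takes a little under 16 kB, which is consistent with rows that carry embedding vectors alongside the conversation text.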
Provided by:
mlfoundations-dev