Mixture-of-tokenizers/Tokenizers-Metrics
Source: Hugging Face. Updated: 2025-04-03. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/Mixture-of-tokenizers/Tokenizers-Metrics
Description:
The dataset has two configurations: entrophy and fertility. Each configuration covers several pre-trained language models, such as GPT-2, BERT, XLM-RoBERTa, and BLOOM, with values stored as float64. Each configuration has a single training split of 18 examples, sized at 2304 bytes, with a download size of 10275 bytes.
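The two configurations named in the description can be loaded with the Hugging Face `datasets` library. The following is a minimal sketch, not an official loader: the helper names `load_args` and `load_metrics` are hypothetical, the configuration names are copied exactly as the description spells them (including "entrophy"), and the split name "train" is an assumption based on the description's mention of a training split.

```python
REPO_ID = "Mixture-of-tokenizers/Tokenizers-Metrics"
# Configuration names exactly as spelled in the description (assumed to match the hub).
CONFIGS = ("entrophy", "fertility")


def load_args(config: str) -> dict:
    """Build the keyword arguments for datasets.load_dataset for one configuration."""
    if config not in CONFIGS:
        raise ValueError(f"unknown configuration: {config!r}")
    # "train" is assumed to be the name of the training split.
    return {"path": REPO_ID, "name": config, "split": "train"}


def load_metrics(config: str):
    """Download one configuration; needs network access and the `datasets` package."""
    # Imported lazily so that load_args can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**load_args(config))
```

Calling `load_metrics("fertility")` would return a `Dataset` object whose `num_rows` and `column_names` attributes can be compared against the example count and model list given in the description.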
Provider:
Mixture-of-tokenizers



