cosmos-imagenet

Name: cosmos-imagenet
Creator: maas
Published: 2025-12-05 16:41:09
License: No description available

ModelScope community. Updated: 2025-12-05. Indexed: 2025-07-12.

Download link:

https://modelscope.cn/datasets/fal/cosmos-imagenet

Resource description:

# Tiny Cosmos-Tokenized Imagenet

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6311151c64939fabc00c8436/2Wrz6bzvwIHVATbtYAujs.png" alt="small" width="800">
</p>

In a similar fashion to [Simo's Imagenet.int8](https://github.com/cloneofsimo/imagenet.int8), here we provide [Cosmos-tokenized](https://github.com/NVIDIA/Cosmos-Tokenizer) ImageNet for rapid prototyping. Notably, the discrete tokenizer compresses the entire ImageNet into a **shocking 2.45 GB of data!**

# How to use

This time, we dumped everything into the simple PyTorch safetensors format.

```python
import torch
from safetensors.torch import safe_open

# For the continuous tokenizer
with safe_open("tokenize_dataset/imagenet_ci8x8.safetensors", framework="pt") as f:
    # Stored as int8; rescale to recover the latent values
    data = f.get_tensor("latents") * 16.0 / 255.0
    labels = f.get_tensor("labels")

print(data.shape)  # 1281167, 16, 32, 32
print(labels.shape)  # 1281167
```

To decode, you need to install the Cosmos tokenizer.

```bash
git clone https://github.com/NVIDIA/Cosmos-Tokenizer.git
cd Cosmos-Tokenizer
apt-get install -y ffmpeg
pip install -e .
```

Then decode using either `"Cosmos-Tokenizer-CI8x8"` or `"Cosmos-Tokenizer-DI8x8"`.

**IMPORTANT**

* For continuous tokens, we've quantized and normalized the latents to int8 format, so you need to multiply by 16.0 / 255.0.
* For discrete tokens, the saved format is int16. To use them properly, convert to uint16.

Example below:

```python
import torch
from PIL import Image
from safetensors.torch import safe_open
from cosmos_tokenizer.image_lib import ImageTokenizer

is_continuous = True  # set to False for the discrete tokenizer
device = "cuda"  # assumption: the original snippet leaves `device` undefined

model_name = "Cosmos-Tokenizer-CI8x8" if is_continuous else "Cosmos-Tokenizer-DI8x8"
decoder = ImageTokenizer(
    checkpoint_dec=f"pretrained_ckpts/{model_name}/decoder.jit"
).to(device)

# The original snippet opens this path for both tokenizer types;
# point it at the file that matches the tokenizer you chose.
with safe_open("imagenet_ci8x8.safetensors", framework="pt") as f:
    if is_continuous:
        data = f.get_tensor("latents").to(torch.bfloat16) * 16.0 / 255.0
    else:
        data = f.get_tensor("indices").to(torch.uint16)
    labels = f.get_tensor("labels")

# Keep only the first sample
data = data[:1]

if is_continuous:
    data = data.reshape(1, 16, 32, 32).to(device)
else:
    # For the discrete tokenizer, reshape to [1, 32, 32]
    data = data.reshape(1, 32, 32).to(device).long()

# Decode the image
with torch.no_grad():
    reconstructed = decoder.decode(data)

# Map the decoder output from [-1, 1] to uint8 pixels in HWC layout
img = ((reconstructed[0].cpu().float() + 1) * 127.5).clamp(0, 255).to(torch.uint8)
img = img.permute(1, 2, 0).numpy()
img = Image.fromarray(img)
```
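The two storage conventions in the IMPORTANT notes, and the size claim at the top, can be checked with plain Python arithmetic. The block below is a minimal sketch, not part of the dataset's tooling: the helper names `dequantize_latent`, `as_uint16`, and `size_gib` are hypothetical, and it assumes that the stored latents are signed int8, that casting int16 to uint16 wraps modulo 2^16, and that each image has one 32 x 32 grid of 2-byte indices (taken from the `reshape(1, 32, 32)` call in the example).

```python
# Pure-Python sketch of the storage conventions; no torch required.
# Assumptions (not confirmed by the dataset authors): latents are signed
# int8, and int16 indices hold uint16 values that wrapped past 32767.

NUM_IMAGES = 1_281_167  # from the shapes printed in the loading example
LATENT_HW = 32          # 32 x 32 grid per image
LATENT_CH = 16          # channels of the continuous latent


def dequantize_latent(q: int) -> float:
    """Map one stored int8 latent value back to its float scale."""
    if not -128 <= q <= 127:
        raise ValueError("expected a signed int8 value")
    return q * 16.0 / 255.0


def as_uint16(stored: int) -> int:
    """Reinterpret one stored int16 token index as an unsigned 16-bit value."""
    if not -32768 <= stored <= 32767:
        raise ValueError("expected a signed int16 value")
    return stored & 0xFFFF


def size_gib(num_values: int, bytes_per_value: int) -> float:
    """Raw tensor size in GiB, ignoring file headers and the labels tensor."""
    return num_values * bytes_per_value / 2**30


# Discrete: one int16 index per grid cell. Continuous: 16 int8 values per cell.
discrete_gib = size_gib(NUM_IMAGES * LATENT_HW * LATENT_HW, 2)
continuous_gib = size_gib(NUM_IMAGES * LATENT_CH * LATENT_HW * LATENT_HW, 1)

print(dequantize_latent(127))  # largest representable latent value
print(as_uint16(-1))           # a wrapped index maps back to 65535
print(f"discrete: {discrete_gib:.2f} GiB, continuous: {continuous_gib:.2f} GiB")
```

Under these assumptions the discrete indices come to roughly 2.44 GiB, which is consistent with the 2.45 GB figure quoted above, while the int8 continuous latents are about eight times larger. Multiplying the full continuous tensor by a float, as the loading example does, promotes it to a floating-point dtype and multiplies its memory footprint further, so slicing before rescaling may be preferable on machines with limited RAM.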


Provider:

maas

Created:

2025-07-07
