BoyaWu10/Bunny-v1_0-data
Hugging Face | Updated 2024-06-11 | Indexed 2024-04-19
Download link:
https://hf-mirror.com/datasets/BoyaWu10/Bunny-v1_0-data
Resource overview:
The Bunny-v1.0 dataset is the training data for the Bunny-v1.0 series of models, including Bunny-v1.0-3B, and targets visual question answering and question answering tasks. The pretraining data comes from a high-quality coreset of LAION-2B that was deduplicated and enriched for informative samples; 2 million image-text pairs were randomly sampled from that coreset for training. The finetuning data, Bunny-695K, was built by modifying the SVIT-mix-665K dataset. The two parts are stored in the `pretrain` and `finetune` folders respectively. The images are packed into multiple archive parts, so users need to download the parts, merge them, and unpack the result. The project content is licensed under Apache 2.0.
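The merge-and-unpack step described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the dataset authors' procedure: it assumes the parts are plain byte-wise splits of a single tar archive whose names sort in the correct order, and the helper names `merge_parts` and `unpack` are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def merge_parts(part_dir, pattern, merged_path):
    """Concatenate split archive parts, in sorted name order, into one file.

    Assumption: the parts are byte-wise splits of one archive and their
    file names sort in the order they must be joined.
    """
    parts = sorted(Path(part_dir).glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no parts matching {pattern!r} in {part_dir}")
    with open(merged_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as chunk:
                # Stream each part so large files are not loaded into memory.
                shutil.copyfileobj(chunk, merged)
    return parts


def unpack(merged_path, target_dir):
    """Extract the merged tar archive into target_dir."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    # tarfile detects gzip/bzip2/xz compression automatically in read mode.
    with tarfile.open(merged_path) as archive:
        archive.extractall(target_dir)
```

A call might look like `merge_parts("pretrain", "images.tar.gz.part-*", "pretrain/images.tar.gz")` followed by `unpack("pretrain/images.tar.gz", "pretrain/images")`; the file and folder names here are illustrative and should be checked against the actual files in the repository.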
Provided by:
BoyaWu10
Original information summary
Dataset overview
Dataset name
- Bunny-v1.0 (training data for the Bunny-v1.0 model series, including Bunny-v1.0-3B)
Related resources
- Technical report
- Code
- Demo



