rahuldshetty/satellite-multitask-omni
收藏Hugging Face2026-04-24 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/rahuldshetty/satellite-multitask-omni
下载链接
链接失效反馈官方服务:
资源简介:
卫星多任务全能数据集是一个统一的、多任务的卫星/航空影像数据集,旨在训练能够处理图像和文本作为输入和输出模态的全能模型。所有数据都转换为一致的ChatML对话格式。数据集包含34,894个样本,分为31,404个训练样本、1,744个验证样本和1,746个测试样本。涵盖了9种不同的任务类型,包括场景分类、图像描述生成、视觉问答、目标检测、实例分割等。数据集来源于10个不同的来源数据集,并转换为统一的ChatML对话格式。每个样本包含任务标识符、来源数据集名称、图像、对话内容和元数据等信息。对话内容遵循标准的ChatML格式,元数据则包含任务特定的结构化数据。数据集的设计适用于训练如GeoChat、SkyEyeGPT等模型,并提供了推荐的训练方法和参数。
The Satellite Multi-Task Omni Dataset is a unified, multi-task satellite/aerial imaging dataset designed for training omni-models that work with image+text as both input and output modalities. All data is converted to a consistent ChatML conversational format. The dataset contains 34,894 samples, divided into 31,404 training samples, 1,744 validation samples, and 1,746 test samples. It covers 9 distinct task types, including scene classification, image captioning, visual question answering, object detection, instance segmentation, etc. The dataset is sourced from 10 different source datasets and converted into a unified ChatML conversational format. Each sample includes a task identifier, source dataset name, image, conversation content, and metadata. The conversation content follows the standard ChatML format, while the metadata contains task-specific structured data. The dataset is designed for training models like GeoChat and SkyEyeGPT, and provides recommended training methods and parameters.
提供机构:
rahuldshetty



