NuTonic/sat-vl-sft-postprocessed-merged-v1
Source: Hugging Face. Updated 2026-04-30; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/NuTonic/sat-vl-sft-postprocessed-merged-v1
Description:
`NuTonic/sat-bbox-metadata-sft-v1` is a **metadata-first, procedural VLM SFT dataset** built from an existing “sat-bbox” style dataset tree (Sentinel‑2 chips + per-tile JSON metadata sidecars, optionally paired Mapbox stills). The goal is to create **high-signal, production-shaped supervision** for multimodal chat models:
- **Captioning** for satellite chips
- **Grounding** (bounding boxes in normalized coordinates) for land-cover regions
- **Class-focused captions** and **absence checks** for specific land-cover classes
- **Cross-view** reasoning using optional Mapbox overhead context
- **Production-like analytical summaries** that include:
  - Sentinel‑2 imagery
  - An additional procedural “analysis image” (a TiM-like predicted-class raster)
  - A compact **TiM-shaped analytics JSON** block
  - A profile-specific assistant summary (land use change, wildfire, flood pulse, etc.)
This dataset is generated **without calling Mapbox APIs**; it only uses paths that already exist in the input dataset root.
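
The grounding task above expresses bounding boxes in normalized coordinates. As a minimal sketch of how such a box might be mapped back onto a chip, the snippet below assumes a `[x_min, y_min, x_max, y_max]` ordering with values in `[0, 1]`. The description does not state the actual field order or value range, so both are assumptions, and `denormalize_bbox` is a hypothetical helper, not part of the dataset tooling.

```python
from typing import Sequence, Tuple


def denormalize_bbox(
    bbox: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized box to pixel coordinates.

    Assumption: bbox is [x_min, y_min, x_max, y_max] with values in [0, 1].
    The dataset's real convention may differ; check the records first.
    """
    x_min, y_min, x_max, y_max = bbox
    # Reject boxes that fall outside the unit square or are inverted.
    if not (0.0 <= x_min <= x_max <= 1.0 and 0.0 <= y_min <= y_max <= 1.0):
        raise ValueError(f"bbox is not a valid normalized box: {bbox!r}")
    # Scale x by the image width and y by the image height.
    return (
        round(x_min * width),
        round(y_min * height),
        round(x_max * width),
        round(y_max * height),
    )


# Example: a box covering the lower-middle half of a 256x256 chip.
pixel_box = denormalize_bbox([0.25, 0.5, 0.75, 1.0], width=256, height=256)
```

Normalized coordinates keep the labels independent of chip resolution, so the same box remains valid if an image is resized before being passed to the model.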
Provider:
NuTonic



