Dataset | Number of Images | Image Resolution | VLMs |
---|---|---|---|
UCM-Captions | 2,100 | 256 × 256 | - |
Sydney-Captions | 613 | 500 × 500 | - |
RSICD | 10,921 | 224 × 224 | - |
RSITMD | 4,743 | 256 × 256 | - |
NWPU-Captions | 31,500 | 256 × 256 | - |
RS5M | 5 million+ | All resolutions | GeoRSCLIP |
SkyScript | 5.2 million+ | All resolutions | SkyCLIP |
Paper | Title | Venue | Institution | Code | Notes |
---|---|---|---|---|---|
CDMAN | Thread the Needle: Cues-Driven Multi-Association for Remote Sensing Cross-Modal Retrieval | TGRS 2024 | Wuhan University of Technology | - | |
MSA | Transcending Fusion: A Multiscale Alignment Method for Remote Sensing Image–Text Retrieval | TGRS 2024 | Xidian University | Github | |
KTIR | Knowledge-aware Text-Image Retrieval for Remote Sensing Images | TGRS 2024 | EPFL | - | |
CMPAGL | Cross-Modal Prealigned Method With Global and Local Information for Remote Sensing Image and Text Retrieval | TGRS 2024 | Shanghai Maritime University | Github | |
FGIS | Fine-Grained Information Supplementation and Value-Guided Learning for Remote Sensing Image-Text Retrieval | JSTARS 2024 | Chongqing University | - | |
EBAKER | Eliminate Before Align: A Remote Sensing Image-Text Retrieval Framework with Keyword Explicit Reasoning | ACMMM 2024 | Tianjin University | - | |
CUP | Cross-Modal Remote Sensing Image–Text Retrieval via Context and Uncertainty-Aware Prompt | TNNLS 2024 | Xidian University | Github | |
CCLS2T | Cross-Modal Contrastive Learning With Spatiotemporal Context for Correlation-Aware Multiscale Remote Sensing Image Retrieval | TGRS 2024 | Xidian University | - | |
MIIA | Global–Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image–Text Retrieval | TGRS 2024 | Northwestern Polytechnical University | - | |
SARCI | Scale-Aware Adaptive Refinement and Cross-Interaction for Remote Sensing Audio-Visual Cross-Modal Retrieval | TGRS 2024 | Wuhan University of Technology | Github | |
GLISA | Masking-Based Cross-Modal Remote Sensing Image–Text Retrieval via Dynamic Contrastive Learning | TGRS 2024 | China University of Mining and Technology | - | |
SCAT | Spatial–Channel Attention Transformer With Pseudo Regions for Remote Sensing Image-Text Retrieval | TGRS 2024 | Northwestern Polytechnical University | - | |
FSISR | Cross-Modal Hashing With Feature Semi-Interaction and Semantic Ranking for Remote Sensing Ship Image Retrieval | TGRS 2024 | Harbin Institute of Technology | - | |
SkyEyeGPT | Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model | Arxiv 2024 | Northwestern Polytechnical University | Github | |
MFF-SFE | Cross-modal retrieval method based on MFF-SFE for remote sensing image-text | Journal of University of Chinese Academy of Sciences 2024 | Aerospace Information Research Institute, Chinese Academy of Sciences | - | |
RemoteCLIP | RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | TGRS 2024 | Hohai University | Github | |
C2F-ITR | From Coarse To Fine: An Offline-Online Approach for Remote Sensing Cross-Modal Retrieval | IGARSS 2024 | Beijing Foreign Studies University | - | |
MGRM-EL | Exploring Uni-Modal Feature Learning on Entities and Relations for Remote Sensing Cross-Modal Text-Image Retrieval | TGRS 2024 | Northwestern Polytechnical University | - | |
SIRS | Multitask Joint Learning for Remote Sensing Foreground-Entity Image–Text Retrieval | TGRS 2024 | Soochow University | Github | |
PIR | A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval | ACMMM 2023 oral | Zhejiang University of Technology | Github | |
PE-RSITR | Parameter-Efficient Transfer Learning for Remote Sensing Image–Text Retrieval | TGRS 2023 | Northwestern Polytechnical University | Github | |
HVSA | Hypersphere-Based Remote Sensing Cross-Modal Text–Image Retrieval via Curriculum Learning | TGRS 2023 | Aerospace Information Research Institute, Chinese Academy of Sciences | Github | |
SWAN | Reducing Semantic Confusion Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval | ICMR 2023 oral | Zhejiang University of Technology | Github | |
KAMCL | Knowledge-Aided Momentum Contrastive Learning for Remote-Sensing Image Text Retrieval | TGRS 2023 | Tianjin University | Github | |
IEFT | Interacting-Enhancing Feature Transformer for Cross-Modal Remote-Sensing Image and Text Retrieval | TGRS 2023 | Xidian University | Github | |
Multilanguage Transformer | Multilanguage Transformer for Improved Text to Remote Sensing Image Retrieval | JSTARS 2022 | King Saud University | - | |
GaLR | Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information | TGRS 2022 | Aerospace Information Research Institute, Chinese Academy of Sciences | Github | |
AMFMN | Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval | TGRS 2021 | Aerospace Information Research Institute, Chinese Academy of Sciences | Github | |
LW-MCR | A Lightweight Multi-Scale Crossmodal Text-Image Retrieval Method in Remote Sensing | TGRS 2021 | Aerospace Information Research Institute, Chinese Academy of Sciences | Github | |
VSE++ | VSE++: Improving Visual-Semantic Embeddings with Hard Negatives | BMVC 2018 spotlight | University of Toronto | Github | |
Abbreviation | Title | Venue | Paper | Code & Weights |
---|---|---|---|---|
GeoKR | Geographical Knowledge-Driven Representation Learning for Remote Sensing Images | TGRS 2021 | GeoKR | link |
GASSL | Geography-Aware Self-Supervised Learning | ICCV 2021 | GASSL | link |
Abbreviation | Title | Venue | Paper | Code & Weights |
---|---|---|---|---|
RSGPT | RSGPT: A Remote Sensing Vision Language Model and Benchmark | Arxiv 2023 | RSGPT | link |
RemoteCLIP | RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Arxiv 2023 | RemoteCLIP | link |
GeoRSCLIP | RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model | Arxiv 2023 | GeoRSCLIP | link |
GRAFT | Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment | ICLR 2024 | GRAFT | - |
Abbreviation | Title | Venue | Paper | Code & Weights |
---|---|---|---|---|
CSP | CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations | ICML 2023 | CSP | link |
GeoCLIP | GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization | NeurIPS 2023 | GeoCLIP | link |
SatCLIP | SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery | Arxiv 2023 | SatCLIP | link |