Embedding-Based Representations for BRSET and mBRSET

Name: Embedding-Based Representations for BRSET and mBRSET
Creator: PhysioNet
Published: 2026-03-30 23:37:09
License: No description available

DataCite Commons (updated 2026-03-30, indexed 2026-05-04)

Download link:

https://physionet.org/content/embedding-brset-mbrset/

Description:

BRSET and mBRSET are publicly available Brazilian ophthalmological datasets composed of curated retinal fundus photographs with associated clinical and demographic information. While these resources enable diverse research applications, training deep learning models directly on high-resolution images is computationally intensive and often restricted by privacy regulations limiting the circulation of identifiable medical images. To address these challenges and facilitate equitable reuse, this project provides a comprehensive release of precomputed image embeddings for both datasets. These representations were generated using state-of-the-art vision backbones: DINOv3 ViT-S/16 (384-d) and ViT-B/16 (768-d) for transformer-based features, alongside ConvNeXt-Tiny (768-d) and ConvNeXt-Base (1024-d) for convolutional features. All models were applied in inference-only mode with a standardized preprocessing pipeline. Each fundus photograph was converted into a fixed-length numerical vector and exported as a CSV file, where each row corresponds to a single image and its respective embedding. These representations preserve critical semantic and structural information, enabling downstream tasks such as clustering, similarity search, multimodal modeling, disease classification, and fairness assessment without requiring raw pixel access. By providing scalable, privacy-preserving embeddings derived from Brazilian ophthalmic data, this resource reduces computational barriers, accelerates AI model development, and supports global research participation, particularly in low-resource environments, ensuring that advanced ophthalmic AI tools are accessible to a broader scientific community.
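As a rough illustration of the similarity-search use case mentioned above, the sketch below reads a CSV with one row per image and ranks the other images by cosine similarity to a query image. The column layout is an assumption, not something the listing specifies: it supposes a single identifier column, hypothetically named `image_id`, followed by numeric feature columns. The toy data uses 3 dimensions in place of the real 384 to 1024, so the header and column names should be checked against the actual files before reuse.

```python
import csv
import io
import math


def load_embeddings(fileobj, id_column="image_id"):
    """Read one embedding per row; `id_column` is an assumed column name."""
    reader = csv.DictReader(fileobj)
    ids, vectors = [], []
    for row in reader:
        ids.append(row.pop(id_column))
        # Every remaining column is treated as one embedding dimension,
        # in the order it appears in the file.
        vectors.append([float(value) for value in row.values()])
    return ids, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(ids, vectors, query_id, k=2):
    """Return the k images most similar to `query_id` as (id, score) pairs."""
    query = vectors[ids.index(query_id)]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in zip(ids, vectors)
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-in for a real embedding file; IDs and values are made up.
TOY_CSV = """image_id,f0,f1,f2
img_a,1.0,0.0,0.0
img_b,0.9,0.1,0.0
img_c,0.0,1.0,0.0
img_d,0.0,0.0,1.0
"""

ids, vectors = load_embeddings(io.StringIO(TOY_CSV))
neighbours = most_similar(ids, vectors, "img_a", k=2)
print(neighbours)
```

To run this against a downloaded file, the `io.StringIO(TOY_CSV)` argument could be replaced with an opened file handle, for example `open(path, newline="")`. For the full datasets, a vectorized library such as NumPy would likely be a better fit than pure-Python loops.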

Provider:

PhysioNet

Created:

2026-02-17
