travisdriver/astrovision-data

Name: travisdriver/astrovision-data
Creator: travisdriver
Published: 2024-02-28 22:24:31
License: not specified

Source: Hugging Face. Updated 2024-02-28. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/travisdriver/astrovision-data


Resource overview:

---
pretty_name: AstroVision
viewer: false
---

<a href="https://imgur.com/Vs3Rwin"><img src="https://i.imgur.com/Vs3Rwin.png" title="source: imgur.com" /></a>

---

<a href="https://imgur.com/RjSwdG2"><img src="https://i.imgur.com/RjSwdG2.png" title="source: imgur.com" /></a>

# About

AstroVision is a first-of-a-kind, large-scale dataset of real small body images from both legacy and ongoing deep space missions. It currently features 115,970 densely annotated, real images of sixteen small bodies from eight missions. AstroVision was developed to facilitate the study of computer vision and deep learning for autonomous navigation in the vicinity of a small body, with special emphasis on training and evaluation of deep learning-based keypoint detection and feature description methods.

If you find our datasets useful for your research, please cite the [AstroVision paper](https://www.sciencedirect.com/science/article/pii/S0094576523000103):

```bibtex
@article{driver2023astrovision,
  title={{AstroVision}: Towards Autonomous Feature Detection and Description for Missions to Small Bodies Using Deep Learning},
  author={Driver, Travis and Skinner, Katherine and Dor, Mehregan and Tsiotras, Panagiotis},
  journal={Acta Astronautica: Special Issue on AI for Space},
  year={2023},
  volume={210},
  pages={393--410}
}
```

Please make sure to like the repository to show support!

# Data format

Following the popular [COLMAP data format](https://colmap.github.io/format.html), each data segment contains the files `images.bin`, `cameras.bin`, and `points3D.bin`, which contain the camera extrinsics and keypoints, camera intrinsics, and 3D point cloud data, respectively.

- `cameras.bin` encodes a dictionary of `camera_id` and [`Camera`](third_party/colmap/scripts/python/read_write_model.py) pairs. `Camera` objects are structured as follows:
  - `Camera.id`: defines the unique (and possibly noncontiguous) identifier for the `Camera`.
  - `Camera.model`: the camera model. We utilize the "PINHOLE" camera model, as AstroVision contains undistorted images.
  - `Camera.width` & `Camera.height`: the width and height of the sensor in pixels.
  - `Camera.params`: `List` of camera parameters (intrinsics). For the "PINHOLE" camera model, `params = [fx, fy, cx, cy]`, where `fx` and `fy` are the focal lengths in $x$ and $y$, respectively, and (`cx`, `cy`) is the principal point of the camera.
- `images.bin` encodes a dictionary of `image_id` and [`Image`](third_party/colmap/scripts/python/read_write_model.py) pairs. `Image` objects are structured as follows:
  - `Image.id`: defines the unique (and possibly noncontiguous) identifier for the `Image`.
  - `Image.tvec`: $\mathbf{r}^{\mathcal{C}_i}_{\mathrm{BC}_i}$, i.e., the relative position of the origin of the camera frame $\mathcal{C}_i$ with respect to the origin of the body-fixed frame $\mathcal{B}$ expressed in the $\mathcal{C}_i$ frame.
  - `Image.qvec`: $\mathbf{q}_{\mathcal{C}_i\mathcal{B}}$, i.e., the relative orientation of the camera frame $\mathcal{C}_i$ with respect to the body-fixed frame $\mathcal{B}$. The user may call `Image.qvec2rotmat()` to get the corresponding rotation matrix $R_{\mathcal{C}_i\mathcal{B}}$.
  - `Image.camera_id`: the identifier for the camera that was used to capture the image.
  - `Image.name`: the name of the corresponding file, e.g., `00000000.png`.
  - `Image.xys`: contains all of the keypoints $\mathbf{p}^{(i)}_k$ in image $i$, stored as an ($N$, 2) array. In our case, the keypoints are the forward-projected model vertices.
  - `Image.point3D_ids`: stores the `point3D_id` for each keypoint in `Image.xys`, which can be used to fetch the corresponding `point3D` from the `points3D` dictionary.
- `points3D.bin` encodes a dictionary of `point3D_id` and [`Point3D`](third_party/colmap/scripts/python/read_write_model.py) pairs. `Point3D` objects are structured as follows:
  - `Point3D.id`: defines the unique (and possibly noncontiguous) identifier for the `Point3D`.
  - `Point3D.xyz`: the 3D coordinates of the landmark in the body-fixed frame, i.e., $\mathbf{\ell}_k^{\mathcal{B}}$.
  - `Point3D.image_ids`: the IDs of the images in which the landmark was observed.
  - `Point3D.point2D_idxs`: the index in `Image.xys` that corresponds to the landmark observation, i.e., `xy = images[Point3D.image_ids[k]].xys[Point3D.point2D_idxs[k]]` given some index `k`.

These three data containers, along with the ground truth shape model, completely describe the scene.

In addition to the scene geometry, each image is annotated with a landmark map, a depth map, and a visibility mask.

<a href="https://imgur.com/DGUC0ef"><img src="https://i.imgur.com/DGUC0ef.png" title="source: imgur.com" /></a>

- The _landmark map_ provides a consistent, discrete set of reference points for sparse correspondence computation and is derived by forward-projecting vertices from a medium-resolution (i.e., $\sim$ 800k facets) shape model onto the image plane. We classify visible landmarks by tracing rays (via the [Trimesh library](https://trimsh.org/)) from the landmarks toward the camera origin and recording landmarks whose line-of-sight ray does not intersect the 3D model.
- The _depth map_ provides a dense representation of the imaged surface and is computed by backward-projecting rays at each pixel in the image and recording the depth of the intersection between the ray and a high-resolution (i.e., $\sim$ 3.2 million facets) shape model.
- The _visibility mask_ provides an estimate of the non-occluded portions of the imaged surface.

**Note:** Instead of the traditional $z$-depth parameterization used for depth maps, we use the _absolute depth_, similar to the inverse depth parameterization.
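As an illustration of how the pose and intrinsics fields described under "Data format" fit together, the following sketch forward-projects a body-frame landmark into pixel coordinates. It assumes the standard COLMAP convention, in which `qvec` is ordered `[qw, qx, qy, qz]` and a body-frame point maps into the camera frame as `R @ xyz + tvec`; that sign convention should be checked against the dataset before relying on it. The helper `project_pinhole` and the numeric values are hypothetical and not part of the dataset tooling.

```python
import numpy as np

def qvec2rotmat(qvec):
    """Rotation matrix from a unit quaternion ordered [qw, qx, qy, qz]."""
    w, x, y, z = qvec
    return np.array([
        [1 - 2 * y * y - 2 * z * z, 2 * x * y - 2 * w * z, 2 * x * z + 2 * w * y],
        [2 * x * y + 2 * w * z, 1 - 2 * x * x - 2 * z * z, 2 * y * z - 2 * w * x],
        [2 * x * z - 2 * w * y, 2 * y * z + 2 * w * x, 1 - 2 * x * x - 2 * y * y],
    ])

def project_pinhole(xyz_body, qvec, tvec, params):
    """Project a body-frame point to pixels (hypothetical helper).

    Assumes p_cam = R @ xyz_body + tvec (standard COLMAP convention)
    and PINHOLE intrinsics params = [fx, fy, cx, cy].
    """
    fx, fy, cx, cy = params
    p_cam = qvec2rotmat(qvec) @ np.asarray(xyz_body, dtype=float)
    p_cam = p_cam + np.asarray(tvec, dtype=float)
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return np.array([u, v]), p_cam

# Made-up values: identity orientation, landmark 10 units in front of the camera.
uv, p_cam = project_pinhole(
    xyz_body=[1.0, 2.0, 0.0],
    qvec=[1.0, 0.0, 0.0, 0.0],
    tvec=[0.0, 0.0, 10.0],
    params=[100.0, 100.0, 50.0, 50.0],
)
```

With real data, the reprojection of `Point3D.xyz` could be compared against the stored entry in `Image.xys` to confirm the convention.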


Provided by:

travisdriver

Summary of original information

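About

AstroVision is a first-of-a-kind, large-scale dataset that currently contains 115,970 densely annotated real images of sixteen small bodies from eight missions. It is intended to support research on computer vision and deep learning for autonomous navigation in the vicinity of a small body, with particular emphasis on training and evaluating deep learning-based keypoint detection and feature description methods.

Data format

The dataset follows the popular COLMAP data format. Each data segment contains the following files: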

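cameras.bin: encodes a dictionary of camera_id and Camera object pairs. Camera objects are structured as follows:
- Camera.id: the unique identifier of the camera.
- Camera.model: the camera model; the "PINHOLE" model is used.
- Camera.width & Camera.height: the width and height of the sensor in pixels.
- Camera.params: the camera parameters (intrinsics). For the "PINHOLE" model, params = [fx, fy, cx, cy], where fx and fy are the focal lengths in $x$ and $y$, and (cx, cy) is the principal point of the camera.
images.bin: encodes a dictionary of image_id and Image object pairs. Image objects are structured as follows:
- Image.id: the unique identifier of the image.
- Image.tvec: the relative position of the origin of the camera frame $\mathcal{C}_i$ with respect to the origin of the body-fixed frame $\mathcal{B}$, expressed in the $\mathcal{C}_i$ frame.
- Image.qvec: the relative orientation of the camera frame $\mathcal{C}_i$ with respect to the body-fixed frame $\mathcal{B}$. The user may call Image.qvec2rotmat() to get the corresponding rotation matrix $R_{\mathcal{C}_i\mathcal{B}}$.
- Image.camera_id: the identifier of the camera that captured the image.
- Image.name: the name of the corresponding file, e.g., 00000000.png.
- Image.xys: all of the keypoints $\mathbf{p}^{(i)}_k$ in image $i$, stored as an ($N$, 2) array. Here, the keypoints are the forward-projected model vertices.
- Image.point3D_ids: stores the point3D_id for each keypoint in Image.xys, which can be used to fetch the corresponding point3D from the points3D dictionary.
points3D.bin: encodes a dictionary of point3D_id and Point3D object pairs. Point3D objects are structured as follows:
- Point3D.id: the unique identifier of the Point3D.
- Point3D.xyz: the 3D coordinates of the landmark in the body-fixed frame, i.e., $\mathbf{\ell}_k^{\mathcal{B}}$.
- Point3D.image_ids: the IDs of the images in which the landmark was observed.
- Point3D.point2D_idxs: the index in Image.xys that corresponds to the landmark observation, i.e., given some index k, xy = images[Point3D.image_ids[k]].xys[Point3D.point2D_idxs[k]].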
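The cross-references between Image and Point3D entries can be illustrated with a small sketch. The `namedtuple` stand-ins below only mirror the field names listed above, and the helper `observations` and all values are hypothetical; real objects would come from COLMAP's `read_write_model.py`.

```python
from collections import namedtuple

# Minimal stand-ins that mirror the documented fields (not the real classes).
Image = namedtuple("Image", ["id", "name", "xys", "point3D_ids"])
Point3D = namedtuple("Point3D", ["id", "xyz", "image_ids", "point2D_idxs"])

def observations(point, images):
    """Return (image name, (x, y)) for every image that observes `point`."""
    obs = []
    for image_id, idx in zip(point.image_ids, point.point2D_idxs):
        image = images[image_id]
        # Cross-check: the keypoint should point back to this landmark.
        if image.point3D_ids[idx] != point.id:
            raise ValueError("inconsistent track entry")
        obs.append((image.name, image.xys[idx]))
    return obs

# Made-up scene: landmark 9 is seen in images 3 and 8 (IDs need not be contiguous).
images = {
    3: Image(3, "00000003.png", [(10.0, 20.0), (30.5, 40.5)], [7, 9]),
    8: Image(8, "00000008.png", [(55.0, 66.0)], [9]),
}
points3D = {9: Point3D(9, (0.1, 0.2, 0.3), [3, 8], [1, 0])}

track = observations(points3D[9], images)
```

Because the identifiers may be noncontiguous, the containers are indexed as dictionaries rather than lists.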

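In addition to the scene geometry, each image is annotated with a landmark map, a depth map, and a visibility mask.

Landmark map: provides a consistent, discrete set of reference points for sparse correspondence computation, obtained by forward-projecting the vertices of a medium-resolution (about 800k facets) shape model onto the image plane. Visible landmarks are classified by tracing rays (via the Trimesh library) from the landmarks toward the camera origin and recording the landmarks whose line-of-sight ray does not intersect the 3D model.
Depth map: provides a dense representation of the imaged surface, computed by backward-projecting a ray at each pixel in the image and recording the depth of the intersection between the ray and a high-resolution (about 3.2 million facets) shape model.
Visibility mask: provides an estimate of the non-occluded portions of the imaged surface.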
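The landmark visibility test described above can be sketched without Trimesh. The example below swaps in a plain NumPy Möller–Trumbore ray-triangle test on a toy tetrahedron, so it is only an illustration of the idea and not the authors' pipeline; the function names, the small `offset` used to avoid self-intersection, and the mesh are all assumptions.

```python
import numpy as np

def ray_hits_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test: True if the ray hits `tri` at distance t > eps."""
    v0, v1, v2 = tri
    e1, e2 = v1 - v0, v2 - v0
    h = np.cross(direction, e2)
    a = e1 @ h
    if abs(a) < eps:  # ray is parallel to the triangle plane
        return False
    f = 1.0 / a
    s = origin - v0
    u = f * (s @ h)
    if u < 0.0 or u > 1.0:
        return False
    q = np.cross(s, e1)
    v = f * (direction @ q)
    if v < 0.0 or u + v > 1.0:
        return False
    return f * (e2 @ q) > eps

def visible_landmarks(vertices, faces, cam_origin, offset=1e-6):
    """Indices of vertices whose ray toward the camera misses every facet."""
    visible = []
    for k, p in enumerate(vertices):
        d = cam_origin - p
        d = d / np.linalg.norm(d)
        o = p + offset * d  # nudge off the surface to avoid self-hits
        if not any(ray_hits_triangle(o, d, vertices[f]) for f in faces):
            visible.append(k)
    return visible

# Toy shape model: a regular tetrahedron centered at the origin.
vertices = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, -1.0],
                     [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]])
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])

# Camera on the side opposite vertex 0, so vertex 0 is occluded by the body.
visible = visible_landmarks(vertices, faces, np.array([-10.0, -10.0, -10.0]))
```

A brute-force loop like this scales poorly; for shape models with hundreds of thousands of facets, an accelerated ray caster such as the one the source uses would be needed.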

Note: Instead of the traditional $z$-depth parameterization, we use the absolute depth, similar to the inverse depth parameterization.
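For tools that expect $z$-depth, a conversion can be sketched as follows. It assumes that "absolute depth" means the Euclidean distance from the camera origin to the surface point along each pixel's ray, and that integer pixel indices correspond to pixel centers; neither assumption is confirmed by the source. Under a PINHOLE model the ray through pixel $(u, v)$ has direction $((u - c_x)/f_x,\ (v - c_y)/f_y,\ 1)$, so $z$ is the absolute depth divided by the norm of that vector. The function name is hypothetical.

```python
import numpy as np

def absolute_to_z_depth(depth_abs, params):
    """Convert per-pixel range along the ray to z-depth (hypothetical helper).

    depth_abs: (H, W) array of distances from the camera origin.
    params:    PINHOLE intrinsics [fx, fy, cx, cy].
    """
    fx, fy, cx, cy = params
    h, w = depth_abs.shape
    # u indexes columns (x), v indexes rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx
    y = (v - cy) / fy
    # Length of the ray direction (x, y, 1); z = range / length.
    return depth_abs / np.sqrt(x ** 2 + y ** 2 + 1.0)

# Made-up 3x3 depth map with a constant range of 3 units.
z = absolute_to_z_depth(np.full((3, 3), 3.0), [1.0, 1.0, 0.0, 0.0])
```

At the principal point the two parameterizations coincide, and the difference grows toward the image corners.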

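Collected summary

Dataset introduction

Construction

AstroVision was built by aggregating small body images from space missions, bringing together 115,970 densely annotated real images of sixteen small bodies from eight missions. Its construction targets computer vision and deep learning algorithms, in particular the training and evaluation of deep learning-based keypoint detection and feature description methods, with attention to both image diversity and annotation detail.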

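Features

The dataset is the first to provide real small body images at large scale, together with detailed annotations that include keypoints, depth maps, and visibility masks. These annotations give deep learning algorithms a reliable basis, particularly for autonomous navigation and surface feature recognition, which matter for space missions.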

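Usage

Because AstroVision follows the COLMAP data format, researchers can obtain the information they need by parsing the binary files that contain the camera intrinsics and extrinsics, keypoints, and 3D point cloud data. The dataset also provides ground truth shape models against which users can check and validate algorithm performance.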
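As a rough illustration of what parsing these binary files involves, the sketch below reads a `cameras.bin`-style stream using only the standard library. It assumes COLMAP's little-endian layout (a `uint64` camera count, then per camera an `int32` id, `int32` model id, `uint64` width, `uint64` height, and the model's parameters as doubles) and handles only the PINHOLE model (model id 1, four parameters). The function name and the synthetic bytes are hypothetical; for real use, COLMAP's own `read_write_model.py` is the safer choice.

```python
import io
import struct

PINHOLE_MODEL_ID = 1  # COLMAP's model id for "PINHOLE" (fx, fy, cx, cy)

def read_pinhole_cameras(stream):
    """Parse a cameras.bin-style byte stream into {camera_id: dict}."""
    cameras = {}
    (num_cameras,) = struct.unpack("<Q", stream.read(8))
    for _ in range(num_cameras):
        camera_id, model_id, width, height = struct.unpack(
            "<iiQQ", stream.read(24)
        )
        if model_id != PINHOLE_MODEL_ID:
            raise ValueError(f"unsupported camera model id: {model_id}")
        params = struct.unpack("<4d", stream.read(32))
        cameras[camera_id] = {
            "model": "PINHOLE",
            "width": width,
            "height": height,
            "params": params,  # (fx, fy, cx, cy)
        }
    return cameras

# Synthetic single-camera file contents with made-up values.
blob = (
    struct.pack("<Q", 1)
    + struct.pack("<iiQQ", 5, PINHOLE_MODEL_ID, 1024, 1024)
    + struct.pack("<4d", 9000.0, 9000.0, 512.0, 512.0)
)
cameras = read_pinhole_cameras(io.BytesIO(blob))
```

On a real segment, the same function could be passed a file opened with `open(path, "rb")`, where `path` points to that segment's `cameras.bin`.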

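Background and challenges

Background overview

AstroVision is a first-of-a-kind, large-scale dataset of real small body images, comprising 115,970 densely annotated images of sixteen small bodies from eight missions. It was developed in 2023 by Travis Driver and colleagues to advance research on computer vision and deep learning for autonomous navigation near small bodies, especially the training and evaluation of deep learning-based keypoint detection and feature description methods. The dataset provides data support for deep space exploration missions and for research on autonomous navigation in space missions.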

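Current challenges

Challenges in building AstroVision include: first, processing and integrating image data from different missions and different points in time so that the data remain consistent and usable; second, annotating keypoints precisely enough to meet the training needs of deep learning models; and third, the storage and processing burden that comes with the dataset's scale and complexity. On the problem side, the challenge is achieving high-accuracy feature detection and description in dynamic, complex environments, which is essential for autonomous navigation around small bodies.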

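Common scenarios

Classic use cases

In deep space exploration, AstroVision supports the application of computer vision and deep learning to autonomous navigation. By providing a large number of real small body images, it gives researchers a training and evaluation resource for autonomous navigation near small bodies, and is particularly valuable for research on deep learning-based keypoint detection and feature description methods.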

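Derived related work

A range of related work has built on AstroVision, such as 3D reconstruction of small body surface morphology and the application of deep learning to space object recognition. These studies advance the use of computer vision in aerospace and offer new research perspectives and methods for deep space exploration.

Recent research on the dataset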