POLAR-Sim: Augmenting NASA's POLAR dataset for data-driven lunar perception and rover simulation
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.ksn02v7hf
Description:
NASA's POLAR (Polar Optical Lunar Analog Reconstruction) dataset contains approximately 2,600 pairs of high dynamic range stereo photos captured across 12 varied terrain scenes, including areas with sparse or dense rock distributions, craters, and rocks of different sizes. The photos are intended to spur research and development in robotics, AI-based perception, and autonomous navigation. Because few photos exist of the terrain around the lunar poles, NASA Ames produced, on Earth under controlled conditions, photos that resemble rover operating conditions in those regions of the Moon.
This dataset, named POLAR-Sim, provides bounding boxes and semantic segmentation information for all the photos in NASA's POLAR dataset. This effort results in 23,000 labels and semantic segmentation information pertaining to rocks and shadows of rocks. Furthermore, for each scene, we produced individual meshes associated with the ground and the rocks in each scene. This allows anyone with a camera model to generate synthetic images associated with any of the 12 scenarios of the POLAR dataset. Effectively, one can generate as many semantically labeled synthetic images as desired -- from different viewpoints in the scene, with different exposure values, for different positions of the Sun, with or without the presence of active illumination, etc.
The benefit of this work is twofold. First, the photo annotations can be used to train and/or test perception algorithms that operate on Moon photos. Second, the scene meshes make it possible to produce as much data as desired to train and test AI algorithms that are anticipated to be used in lunar conditions. All the outcomes of this work are available in a public repository for unfettered use and distribution.
Methods
Photo Bounding Box Annotation
To support the training of data-driven perception algorithms, we manually labeled bounding boxes for all of the rocks and rocks' shadows in the POLAR dataset. This effort was motivated by the observation that object detection for rocks and shadows plays an important role in autonomous navigation -- large rocks can block the rover's path, while medium and small rocks can damage the wheels or the chassis. Shadows also help estimate the Sun's position, which is vital for navigation planning, solar energy harvesting, and sensor orientation.
Approximately 23,000 rocks and rocks' shadows were labeled. Each photo's configuration includes the terrain ID, stereo camera position (A: 1.5 m from terrain center at 0 deg, B: 4 m from terrain center at 0 deg, or C: 1.5 m from terrain center at 280 deg), rover light status (ON or OFF), Sun azimuth angle (none, 30, 180, 270, or 350 degrees, where "none" means that no simulated Sun was used), stereo camera index (Left or Right), and exposure time (32 to 2048 ms). Each POLAR photo was taken under one combination of these configuration parameters. Note that many POLAR photos share the same rock and shadow labels. Specifically, varying the exposure time does not alter the positions of the rocks and their shadows, so photos with exposure times of 32, 64, and 128 ms share the same labels. Other similarities include: photos with different rover light statuses have the same rock and shadow labels; photos from the same camera position with different Sun azimuths have the same rock labels but different shadow labels; and the Left and Right camera views have similar rock and shadow label positions. Leveraging these observations reduced the labeling burden. Finally, because the exposure time was controlled in the POLAR dataset, the human annotator subjectively deleted shadow labels at very low exposure times, where the photo was dark enough for the shadow to be judged "invisible."
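The label-sharing rules above can be sketched as a pair of grouping keys. This is a minimal illustration, not the authors' tooling: the class name PhotoConfig, its field names, and the two key functions are all hypothetical. It treats Left and Right views as separate groups because their labels are only similar, and it does not model the manual removal of shadow labels at very low exposures.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PhotoConfig:
    """One POLAR photo configuration (hypothetical field names)."""
    terrain_id: int
    camera_position: str             # "A", "B", or "C"
    rover_light_on: bool
    sun_azimuth_deg: Optional[int]   # None when no simulated Sun was used
    camera_index: str                # "L" or "R"
    exposure_ms: int                 # 32 to 2048


def rock_label_key(cfg: PhotoConfig) -> Tuple[int, str, str]:
    """Rock labels are shared across exposure, rover light, and Sun azimuth."""
    return (cfg.terrain_id, cfg.camera_position, cfg.camera_index)


def shadow_label_key(cfg: PhotoConfig) -> Tuple[int, str, str, Optional[int]]:
    """Shadow labels additionally change with the Sun azimuth."""
    return rock_label_key(cfg) + (cfg.sun_azimuth_deg,)
```

Two photos with equal keys are candidates for sharing a label file; in practice the replicated labels were still fine-tuned by hand, so the keys only identify where copying is a reasonable starting point.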
Annotation was performed using labelImg. For each terrain, the first Left stereo photo with rover light OFF was manually labeled. Labels were reused for photos with different exposure times, with shadow labels omitted for very low exposures due to their invisibility. The labels were then replicated and adjusted for the Right camera view with similar rock and shadow positions, and were replicated for photos with different Sun azimuths which only required shadow adjustments. Thereafter, labels were replicated and fine-tuned for photos with rover light ON. This process was repeated for all stereo camera positions (A, B, and C) and for all 13 POLAR terrain scenarios. The annotations were saved in YOLO format as TXT files.
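For readers unfamiliar with the YOLO bounding box format, each line of a TXT file holds a class index followed by the box center and size, all normalized to the image dimensions. The sketch below converts such lines to pixel corner coordinates. The function name parse_yolo_boxes and the Box type are illustrative, and which class index denotes rocks versus shadows is not stated here, so it should be checked against the class list shipped with the dataset.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_boxes(text: str, img_w: int, img_h: int) -> List[Box]:
    """Convert YOLO lines 'class xc yc w h' (normalized) to pixel corners."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        boxes.append(Box(class_id,
                         (xc - w / 2) * img_w, (yc - h / 2) * img_h,
                         (xc + w / 2) * img_w, (yc + h / 2) * img_h))
    return boxes
```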
Photo Semantic Segmentation Map Annotation
In POLAR-Sim, we also provide semantic segmentation labels: the background, ground, rocks, and rock shadows in the photos were manually annotated using Roboflow. A procedure similar to the bounding box annotation was applied, reusing segmentations across photos of similar configurations. For each terrain, the clearest photo with suitable brightness was selected, an initial segmentation was obtained with the Segment Anything Model (SAM), and the result was then refined for accuracy. The resulting segmentation files were copied and fine-tuned for other photos of similar configurations. This method ensured consistent and high-quality segmentation across the dataset. Segmentation annotations were also saved in YOLO format as TXT files, and a converter program is provided to output the conventional gray-scale map format.
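To illustrate what a conversion from YOLO segmentation TXT to a gray-scale map involves, here is a minimal pure-Python sketch. It is not the converter distributed with the dataset. It assumes the common YOLO segmentation layout (a class index followed by normalized polygon vertices x1 y1 x2 y2 ...), and the gray_levels mapping from class index to gray value is a hypothetical parameter. The per-pixel test is slow and meant only to show the logic.

```python
from typing import Dict, List, Tuple

Polygon = Tuple[int, List[Tuple[float, float]]]


def parse_yolo_polygons(text: str) -> List[Polygon]:
    """Parse lines 'class x1 y1 x2 y2 ...' with normalized coordinates."""
    polygons = []
    for line in text.splitlines():
        parts = line.split()
        # Need a class index plus at least three (x, y) pairs.
        if len(parts) < 7 or len(parts) % 2 == 0:
            continue
        coords = [float(v) for v in parts[1:]]
        polygons.append((int(parts[0]), list(zip(coords[0::2], coords[1::2]))))
    return polygons


def _inside(px: float, py: float, pts: List[Tuple[float, float]]) -> bool:
    """Even-odd ray casting point-in-polygon test."""
    inside = False
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def rasterize(polygons: List[Polygon], width: int, height: int,
              gray_levels: Dict[int, int]) -> List[List[int]]:
    """Fill each polygon with its class gray value; later polygons overwrite."""
    mask = [[0] * width for _ in range(height)]
    for class_id, pts in polygons:
        scaled = [(x * width, y * height) for x, y in pts]
        for row in range(height):
            for col in range(width):
                # Sample at the pixel center.
                if _inside(col + 0.5, row + 0.5, scaled):
                    mask[row][col] = gray_levels[class_id]
    return mask
```

For full-resolution photos, a library rasterizer would be the practical choice; the structure of the conversion stays the same.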
Mesh Construction of the Ground and Rocks
In addition, we produced mesh files of the ground and rocks for each terrain scenario of the POLAR dataset. The rocks were manually located and the surface meshes generated in MATLAB, using the point cloud data provided in the POLAR dataset. First, for each terrain scenario, the two point clouds scanned from camera positions A and C (the positions are defined in the second paragraph of the Photo Bounding Box Annotation section) were inversely transformed back to the sandbox coordinates (where +X corresponds to Sun azimuth 0 deg, +Y to 90 deg, and +Z is upward). To improve mesh completeness, e.g., by mitigating the voids and occlusions that arise from a single viewpoint, and to better capture the 3D shape of each rock, the two point clouds were manually coarse-aligned and positioned at the origin. Each rock's point clouds were then locally fine-aligned to minimize occlusions and recover as much of the rock's shape as possible. Finally, the annotators manually identified the X, Y, and Z coordinate ranges of each rock to form a bounding cuboid.
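The sandbox axis convention (azimuth 0 deg along +X, 90 deg along +Y) fixes how a Sun azimuth maps to a horizontal direction. A small sketch follows; the function name is illustrative, and it assumes the vector points from the sandbox origin toward the Sun's horizontal position. The Sun's elevation is not given in this description, so only the X and Y components are computed.

```python
import math
from typing import Tuple


def sun_azimuth_to_xy(azimuth_deg: float) -> Tuple[float, float]:
    """Horizontal unit vector for a Sun azimuth in sandbox coordinates.

    Azimuth 0 deg maps to +X and 90 deg to +Y, i.e. angles increase
    counterclockwise when viewed from +Z.
    """
    a = math.radians(azimuth_deg)
    return (math.cos(a), math.sin(a))
```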
Once all the rocks in the terrain were located, the point clouds of the rocks and ground were separated. These separated point clouds were then converted into meshes using the Poisson Surface Reconstruction method in MATLAB, and the results were stored as OBJ files.
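The separation step can be sketched as a membership test against each rock's bounding cuboid. The original work was done in MATLAB; the Python below only illustrates the logic, and the names Cuboid, in_cuboid, and split_rocks_and_ground are hypothetical. It assumes each cuboid is given as three (min, max) ranges in sandbox coordinates and assigns a point to the first cuboid that contains it. The Poisson reconstruction itself is not shown.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float, float]
# ((x_min, x_max), (y_min, y_max), (z_min, z_max))
Cuboid = Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]


def in_cuboid(p: Point, c: Cuboid) -> bool:
    """True if the point lies within all three coordinate ranges."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(p, c))


def split_rocks_and_ground(
    points: Iterable[Point], cuboids: Sequence[Cuboid]
) -> Tuple[List[List[Point]], List[Point]]:
    """Return one point list per rock cuboid, plus the remaining ground points."""
    rocks: List[List[Point]] = [[] for _ in cuboids]
    ground: List[Point] = []
    for p in points:
        for i, c in enumerate(cuboids):
            if in_cuboid(p, c):
                rocks[i].append(p)
                break
        else:
            # No cuboid claimed this point, so it belongs to the ground.
            ground.append(p)
    return rocks, ground
```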
Simulation Videos
A collection of simulation videos is available. All videos were made at 0.5x speed. Videos are named in the following format: [camera viewpoint]_[Sun ID]_[BRDF]_[exposure time].mp4. The [camera viewpoint] is one of the following: two third-person-view cameras on the left and right sides of the VIPER (CamBirdViewLeft and CamBirdViewRight), four wheel cameras (WheelCam_RightFront, WheelCam_RightBack, WheelCam_LeftFront, and WheelCam_LeftBack), or the front-end camera (front_end_cam). The [Sun ID] denotes the Sun's direction, where 1, 2, 3, and 4 represent East, Southeast, Southwest, and West, respectively. The [BRDF] is either hapke or default, specifying the Hapke or the Principled BRDF in Chrono::Sensor, respectively. The [exposure time] is 0256, 0512, or 1024 milliseconds.
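Because some camera viewpoint names contain underscores themselves (for example WheelCam_RightFront and front_end_cam), a file name is easiest to split from the right. The sketch below assumes the naming pattern exactly as described, with the Sun ID written as a single digit; the function name parse_video_name and the VideoName type are illustrative.

```python
from typing import NamedTuple

SUN_DIRECTIONS = {1: "East", 2: "Southeast", 3: "Southwest", 4: "West"}


class VideoName(NamedTuple):
    camera: str
    sun_direction: str
    brdf: str
    exposure_ms: int


def parse_video_name(filename: str) -> VideoName:
    """Split '[camera]_[Sun ID]_[BRDF]_[exposure].mp4' into its fields."""
    stem = filename[:-4] if filename.lower().endswith(".mp4") else filename
    # Split from the right so underscores in the camera name are preserved.
    camera, sun_id, brdf, exposure = stem.rsplit("_", 3)
    if brdf not in ("hapke", "default"):
        raise ValueError(f"unexpected BRDF token: {brdf!r}")
    return VideoName(camera, SUN_DIRECTIONS[int(sun_id)], brdf, int(exposure))
```

For instance, a name built from the pattern such as front_end_cam_4_default_1024.mp4 would yield the camera front_end_cam, the direction West, the default (Principled) BRDF, and an exposure of 1024 ms.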
Created:
2025-07-16



