University of Texas Scalable, Adaptive, and Resilient Autonomy Collaborative Research Alliance-Grace Quarters (UT-SARA-GQ) Dataset
Source: DataCite Commons. Updated 2026-04-02; indexed 2026-05-05.
Download link:
https://dataverse.tdl.org/citation?persistentId=doi:10.18738/T8/LGV3UO
Resource description:
<h1>Introduction</h1>
<p><b>UT-SARA-GQ</b> combines the abbreviations <b>UT</b> (University of Texas), <b>SARA</b> (Scalable, Adaptive, and Resilient Autonomy Collaborative Research Alliance), and <b>GQ</b> (Grace Quarters). It is an off-road aerial-ground localization dataset collected with a Clearpath Warthog robot in the Grace Quarters environment of the UT SARA test site, an outdoor field site used for autonomy research.</p>
<p>This dataset was created to support research on <b>GPS-denied localization</b> for off-road ground robots. The main goal is to study whether a robot can estimate its location and heading by matching what it sees on the ground to a geo-referenced aerial map, even in environments with tree canopy, shadows, vegetation, and visually repetitive terrain where GPS can be unreliable.</p>
<p>The dataset contains the materials needed for that task: <b>raw ROS 2 bag recordings</b>, <b>processed per-run sequences</b>, <b>camera calibration files</b>, and <b>geo-referenced aerial orthophoto maps</b>. The processed portion of the dataset was used to train and evaluate localization models such as <b>BEV-Patch-PF</b>. In that workflow, the model uses rectified ground observations from each run together with an aerial GeoTIFF map to estimate the robot’s <b>3-DoF pose</b> (x, y, heading) in a <b>UTM</b> coordinate frame. The raw ROS 2 bagfiles preserve the original acquisition record; the processed runs provide the benchmark-ready data used for model training and evaluation.</p>
<p>Each processed run includes rectified stereo images, depth data, GPS, odometry, and UTM-frame pose information. This makes the dataset reusable not only for aerial-ground localization, but also for related work on cross-view matching, visual localization, map alignment, odometry comparison, and off-road robot navigation.</p>
<p>The released processed benchmark contains <b>15 trajectories</b> totaling <b>8.3 km</b> and approximately <b>60,000 frames</b>. The standard split uses <b>9 training</b>, <b>2 validation</b>, and <b>4 test</b> trajectories. Aerial map files are provided as north-up <b>GeoTIFF orthophotos</b> in <b>UTM zone 18N</b>.</p>
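<p>As a small aid for working with the UTM frame mentioned above, the following sketch shows how a UTM zone number relates to longitude and to the corresponding WGS 84 / UTM EPSG code. The function names are illustrative, the longitude in the usage note is a hypothetical value inside zone 18 and not a surveyed site coordinate, and the special-case zones around Norway and Svalbard are ignored. The coordinate reference system actually stored in the GeoTIFF files should always take precedence over this calculation.</p>

```python
import math


def utm_zone_number(lon_deg: float) -> int:
    """Return the standard 6-degree UTM zone number (1-60) for a longitude."""
    # Wrap into [-180, 180) so that 180 E falls into zone 1.
    lon = (lon_deg + 180.0) % 360.0 - 180.0
    return int(math.floor((lon + 180.0) / 6.0)) + 1


def utm_epsg(zone: int, northern: bool = True) -> int:
    """EPSG code for WGS 84 / UTM: 326xx (north) or 327xx (south)."""
    if not 1 <= zone <= 60:
        raise ValueError("UTM zone must be in 1..60")
    return (32600 if northern else 32700) + zone


def zone_central_meridian(zone: int) -> int:
    """Central meridian of a UTM zone, in degrees east."""
    return -183 + 6 * zone
```

<p>For example, <code>utm_zone_number(-76.0)</code> yields zone 18, and <code>utm_epsg(18)</code> yields 32618, the EPSG code for WGS 84 / UTM zone 18N.</p>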
<img src="https://dataverse.tdl.org/api/access/datafile/994598" alt="Localization workflow using aerial map and ground observations">
<p><b>Figure:</b> Example localization workflow supported by this dataset. Ground observations from the robot are aligned with aerial map imagery to estimate pose over time.</p>
<h2>Dataset Access</h2>
<p>Because the full dataset is large, the files are hosted on <b>Corral</b>, a storage resource at the Texas Advanced Computing Center (TACC).</p>
<ul>
<li><b>Dataset download:</b> <a href="https://web.corral.tacc.utexas.edu/texasrobotics/web_UT-SARA-GQ/">UT-SARA-GQ dataset download (TACC)</a></li>
<li><b>Recommended subset for most users:</b> <code>processed/</code>, <code>calibration/</code>, and <code>maps/</code></li>
<li><b>Raw-data subset:</b> <code>bagfiles/</code>, for users who want the original recordings or want to reproduce the preprocessing workflow from scratch</li>
</ul>
<p><b>Importance of the processed files:</b> the <code>processed/</code> folder is the most immediately reusable part of the dataset. It contains synchronized, corrected, and map-aligned sequences in the format used for model training and evaluation. Most users do not need to start from the raw bagfiles unless they want to test a different preprocessing pipeline.</p>
<h2>Content</h2>
<p>The dataset is organized into the following top-level folders:</p>
<ul>
<li><b>bagfiles/</b> — ROS 2 bag recordings for each run</li>
<li><b>processed/</b> — ready-to-use processed runs derived from the raw recordings</li>
<li><b>maps/</b> — geo-referenced aerial orthophoto maps in GeoTIFF format</li>
<li><b>calibration/</b> — post-calibrated camera intrinsics and extrinsics used to generate the released processed data</li>
</ul>
<p><b>ROS 2 bagfiles.</b> The <code>bagfiles/</code> directory contains the original robot recordings. In ROS 2, a bag recording is typically stored as a folder containing a <code>metadata.yaml</code> file and one or more <code>.db3</code> files. These files preserve the original time-stamped sensor and robot-state messages. They can be inspected with standard ROS 2 tools such as <code>ros2 bag info</code> and <code>ros2 bag play</code>, or extracted with the preprocessing software linked in the Software section.</p>
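<p>For users without a ROS 2 installation, a <code>.db3</code> file can also be inspected as an ordinary SQLite database. The sketch below lists topics and message counts. It assumes the layout written by the default rosbag2 SQLite storage plugin (a <code>topics</code> table and a <code>messages</code> table); that layout is an assumption about these recordings and may differ between ROS 2 distributions. The function name <code>list_bag_topics</code> is illustrative. Message payloads are not decoded here, since that requires the ROS 2 message definitions.</p>

```python
import sqlite3
from pathlib import Path


def list_bag_topics(db3_path):
    """Return [(topic_name, message_type, message_count), ...] for one .db3 file.

    Assumes the rosbag2 SQLite layout: topics(id, name, type, ...) and
    messages(id, topic_id, timestamp, data).
    """
    # Open read-only so the original recording is never modified.
    uri = Path(db3_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        rows = con.execute(
            "SELECT t.name, t.type, COUNT(m.id) "
            "FROM topics AS t "
            "LEFT JOIN messages AS m ON m.topic_id = t.id "
            "GROUP BY t.id ORDER BY t.name"
        ).fetchall()
    finally:
        con.close()
    return rows
```

<p>The output is comparable to the topic summary printed by <code>ros2 bag info</code>, which remains the reference tool when ROS 2 is available.</p>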
<p><b>Processed runs.</b> Each timestamped folder under <code>processed/</code> corresponds to one recorded run (trajectory). These processed runs are the main files intended for reuse, because they convert the raw ROS 2 recordings into a simpler per-run format suitable for training and evaluating localization models.</p>
<p>A typical processed run has the following structure:</p>
<pre><code>processed/
└── 2024-08-15-12-40-46/
    ├── 2d_rect/
    │   ├── cam_left/
    │   ├── cam_right/
    │   └── timestamps.txt
    ├── depth/
    ├── cuvslam.csv
    ├── fast_lio.csv
    ├── fast_lio_aligned.csv
    ├── gps.csv
    ├── odom.csv
    └── utm_pose.csv
</code></pre>
<p>Representative files in each processed run include:</p>
<ul>
<li><b>2d_rect/cam_left/</b> and <b>2d_rect/cam_right/</b> — offline-rectified left and right image sequences</li>
<li><b>2d_rect/timestamps.txt</b> — timestamps for the rectified image frames</li>
<li><b>depth/</b> — per-frame depth data associated with the run</li>
<li><b>gps.csv</b> — recorded GPS measurements</li>
<li><b>odom.csv</b> — odometry measurements recorded during the run</li>
<li><b>cuvslam.csv</b> — visual odometry / SLAM output</li>
<li><b>fast_lio.csv</b> — FAST-LIO trajectory output</li>
<li><b>fast_lio_aligned.csv</b> — aligned FAST-LIO trajectory</li>
<li><b>utm_pose.csv</b> — geo-referenced pose trajectory in the UTM map frame</li>
</ul>
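<p>After downloading, it can be useful to confirm that a run folder is complete before training. The sketch below checks one run against the typical layout listed above. The names <code>EXPECTED_DIRS</code>, <code>EXPECTED_FILES</code>, and <code>missing_entries</code> are illustrative, and because the listing is described as typical rather than guaranteed, a reported gap is a prompt to look closer, not proof of a corrupted download.</p>

```python
from pathlib import Path

# Entries taken from the typical processed-run layout described above.
EXPECTED_DIRS = ("2d_rect/cam_left", "2d_rect/cam_right", "depth")
EXPECTED_FILES = (
    "2d_rect/timestamps.txt",
    "cuvslam.csv",
    "fast_lio.csv",
    "fast_lio_aligned.csv",
    "gps.csv",
    "odom.csv",
    "utm_pose.csv",
)


def missing_entries(run_dir):
    """Return the relative paths that are absent from one processed run."""
    run = Path(run_dir)
    missing = [d for d in EXPECTED_DIRS if not (run / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (run / f).is_file()]
    return missing
```

<p>Calling <code>missing_entries("processed/2024-08-15-12-40-46")</code> returns an empty list when every listed folder and file is present.</p>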
<h2>Collection and Processing</h2>
<p>Data were collected with a Clearpath Warthog ground robot operating on off-road trails with vegetation, canopy cover, and strong shadows. During each run, the robot recorded multi-modal sensor streams into ROS 2 bag files.</p>
<p>The released processed data were generated from the original recordings in four main stages:</p>
<ol>
<li><b>Bag extraction:</b> Sensor messages were extracted from the ROS 2 bag recordings.</li>
<li><b>Offline rectification and synchronization:</b> Image streams were corrected using the released post-calibrated camera parameters and then organized into synchronized per-run outputs.</li>
<li><b>Depth map generation:</b> Depth images were generated using <a href="https://nvlabs.github.io/FoundationStereo/">FoundationStereo</a>.</li>
<li><b>Map-frame alignment:</b> Each run was aligned with the geo-referenced aerial map to generate UTM-frame pose data for localization experiments.</li>
</ol>
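<p>The map-frame alignment stage produces poses in UTM metres, while the aerial map is a raster indexed in pixels. For a north-up orthophoto with square pixels, the two are related by a simple offset and scale, sketched below. The function names are illustrative, and the origin and resolution used in the usage note are made-up numbers: the real values must be read from each GeoTIFF's georeferencing metadata.</p>

```python
def utm_to_pixel(easting, northing, origin_e, origin_n, res_m):
    """Map UTM metres to fractional (col, row) in a north-up raster.

    origin_e, origin_n: UTM coordinates of the outer corner of the top-left pixel.
    res_m: ground size of one (square) pixel in metres.
    """
    col = (easting - origin_e) / res_m
    # Rows grow southwards while northing grows northwards, hence the sign flip.
    row = (origin_n - northing) / res_m
    return col, row


def pixel_to_utm(col, row, origin_e, origin_n, res_m):
    """Inverse of utm_to_pixel."""
    return origin_e + col * res_m, origin_n - row * res_m
```

<p>With a hypothetical origin of (400000.0, 4360000.0) and 0.5 m pixels, the point (400010.0, 4359995.0) lands at column 20.0, row 10.0. Rotated or non-square rasters need the full affine transform instead of this shortcut.</p>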
<h2>Known Sensor Issue</h2>
<p>The raw ROS 2 bagfiles preserve the sensor streams exactly as they were recorded. However, the stereo calibration in effect at acquisition time was incorrect, and the stereo image topics in the original recordings reflect it. For that reason, the stereo image topics in the raw bagfiles should <b>not</b> be treated as the final benchmark-ready image product.</p>
<p>Instead, this dataset releases <b>post-calibrated camera intrinsics and extrinsics</b> in <code>calibration/</code>, and the rectified images in <code>processed/&lt;run_id&gt;/2d_rect/</code> were generated using those corrected calibration parameters.</p>
<h2>Dataset Organization</h2>
<p>Each processed sequence corresponds to a raw bag recording with the same timestamp-based identifier.</p>
<pre><code>UT-SARA-GQ/
├── bagfiles/
│   └── &lt;run_id&gt;/
│       ├── metadata.yaml
│       └── *.db3
├── processed/
│   └── &lt;run_id&gt;/
│       ├── 2d_rect/
│       │   ├── cam_left/
│       │   ├── cam_right/
│       │   └── timestamps.txt
│       ├── depth/
│       ├── cuvslam.csv
│       ├── fast_lio.csv
│       ├── fast_lio_aligned.csv
│       ├── gps.csv
│       ├── odom.csv
│       └── utm_pose.csv
├── maps/
└── calibration/
</code></pre>
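<p>Because raw and processed runs share the same identifier, they can be paired by folder name. The sketch below does this for a local copy of the dataset. It assumes that run identifiers follow the pattern suggested by the example folder <code>2024-08-15-12-40-46</code>, read as year, month, day, hour, minute, second; that reading is an inference from the example, and the helper names are illustrative.</p>

```python
from datetime import datetime
from pathlib import Path

# Assumed pattern, inferred from the example run folder name.
RUN_ID_FORMAT = "%Y-%m-%d-%H-%M-%S"


def parse_run_id(run_id):
    """Interpret a run identifier as a datetime (raises ValueError if it does not fit)."""
    return datetime.strptime(run_id, RUN_ID_FORMAT)


def pair_runs(root):
    """Return (in_both, raw_only, processed_only) as sorted lists of run IDs."""
    root = Path(root)

    def run_ids(sub):
        folder = root / sub
        if not folder.is_dir():
            return set()
        return {p.name for p in folder.iterdir() if p.is_dir()}

    raw, processed = run_ids("bagfiles"), run_ids("processed")
    return sorted(raw & processed), sorted(raw - processed), sorted(processed - raw)
```

<p>Runs that appear in only one of the two lists usually indicate a partial download of either <code>bagfiles/</code> or <code>processed/</code>.</p>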
<h2>Intended Use</h2>
<ul>
<li>Training and evaluating aerial-ground localization models in off-road, GPS-challenged environments</li>
<li>Benchmarking cross-view matching and sequential pose estimation methods</li>
<li>Studying the effects of canopy cover, shadows, and vegetation on localization performance</li>
<li>Comparing visual odometry, LiDAR odometry, GPS, and wheel/robot odometry signals within the same run</li>
<li>Reproducing or extending the preprocessing workflow from the raw ROS 2 recordings</li>
</ul>
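<p>Several of these uses come down to comparing two 3-DoF trajectories from the same run. The sketch below associates poses by nearest timestamp and reports a position RMSE and a mean absolute heading error. It is a generic illustration, not the evaluation protocol of any particular model: the tuple layout <code>(t, x, y, heading)</code>, headings in radians, and the <code>max_dt</code> tolerance are all assumptions, since the column layout of the CSV files is not described here.</p>

```python
import bisect
import math


def wrap_angle(a):
    """Wrap an angle in radians to [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def nearest_index(sorted_times, t):
    """Index of the entry in sorted_times closest to t."""
    i = bisect.bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    return i if sorted_times[i] - t < t - sorted_times[i - 1] else i - 1


def pose_errors(est, ref, max_dt=0.05):
    """Compare two trajectories given as lists of (t, x, y, heading_rad).

    ref must be sorted by time. Estimated poses with no reference pose within
    max_dt seconds are skipped. Returns (position_rmse, mean_abs_heading_error).
    """
    ref_t = [r[0] for r in ref]
    sq_sum, ang_sum, n = 0.0, 0.0, 0
    for t, x, y, h in est:
        j = nearest_index(ref_t, t)
        if abs(ref_t[j] - t) > max_dt:
            continue
        _, rx, ry, rh = ref[j]
        sq_sum += (x - rx) ** 2 + (y - ry) ** 2
        ang_sum += abs(wrap_angle(h - rh))
        n += 1
    if n == 0:
        raise ValueError("no poses could be associated within max_dt")
    return math.sqrt(sq_sum / n), ang_sum / n
```

<p>Wrapping the heading difference matters near the ±π boundary: headings of 3.1 and -3.1 radians differ by about 0.083 radians, not 6.2.</p>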
<h2>Dataset Quality Statement</h2>
<p>Data quality is supported by a documented preprocessing pipeline and consistency checks during extraction, synchronization, and map alignment. Validation includes inspection of synchronized frames, verification of per-sequence trajectory files, and checks for missing or invalid data products.</p>
<p>Users should also be aware of the normal limitations of outdoor robotic data collection, including GPS noise, odometry drift, partial canopy occlusion, strong appearance changes, and mismatch between ground views and aerial imagery. In addition, the original bagfiles preserve an acquisition-time stereo calibration issue; the recommended image product is the rectified imagery in the <code>processed/</code> folder, generated with the provided post-calibrated camera parameters.</p>
Provider:
Texas Data Repository
Created:
2026-01-02



