CODa Re-identification (Re-ID) Dataset

Name: CODa Re-identification (Re-ID) Dataset
Creator: Texas Data Repository
Published: 2026-03-30 02:35:13
License: No description available

Source: DataCite Commons. Updated: 2026-03-30. Indexed: 2026-05-05.

Download link:

https://dataverse.tdl.org/citation?persistentId=doi:10.18738/T8/E9WFTW

Description:

<h1>Introduction</h1>
CODa Re-ID (Campus Object Dataset - Re-Identification) is a large-scale, real-world object re-identification dataset created to improve long-term robot perception and navigation in outdoor environments. It contains ≈<b>1.04M</b> per-frame 2D observations of <b>557</b> globally unique static objects across <b>8</b> categories (trees, poles, bollards, informational signs, traffic signs, fire hydrants, emergency phones, and trash cans), captured across multiple robot traversals of a university campus under varying lighting, weather, and viewpoint conditions. <br> <br>
Highlights:
<ul>
<li>Globally consistent instance IDs: each physical object is assigned a persistent global identifier tracked across all sequences.</li>
<li>Multi-modal annotations: per-frame 2D bounding boxes, segmentation contours, global 3D bounding boxes, and per-frame camera poses/timestamps.</li>
<li>Diverse conditions: captures span multiple times of day and weather (sunny, overcast, rainy, low-light/night) and a wide range of viewing distances and angles.</li>
<li>Reproducible pipeline: global trajectory alignment → 3D instance annotation → projection to images → automatic mask generation (SAM) → manual verification.</li>
<li>Evaluation-ready: standardized train/val/test splits and baseline evaluation recipes (code and checkpoints included in the repo).</li>
<li>Intended uses: object re-identification, object-centric SLAM, long-term localization, and persistent scene understanding research.</li>
</ul>
<img src="https://dataverse.tdl.org/api/access/datafile/778872" alt="dataset_overall.png">
<h2>Content</h2>
The dataset contains (summary):
<ul>
<li>≈<b>1.04M</b> per-frame 2D observations linked to <b>557</b> globally unique 3D object instances.</li>
<li>Modalities: RGB images, per-frame 2D bboxes, per-frame segmentation masks (palette PNGs), projected global 3D bboxes, camera poses, timestamps, and split lists (train/val/test).</li>
<li>Split policy: sequences are split by time/trajectory so that evaluation runs on temporally and spatially distinct held-out sets (split files included).</li>
</ul>
<h2>Collection Method</h2>
<p>CODa Re-ID is derived by post-processing the original CODa dataset. Of the 53 object categories annotated in CODa, we retain only eight for re-identification: trees, poles, bollards, informational signs, traffic signs, fire hydrants, emergency phones, and trash cans. </p>
<p>We first align robot trajectories using LiDAR–inertial odometry with manual loop closures to obtain a single global frame. We then cluster the CODa-provided 3D bounding-box annotations along the aligned trajectories to identify persistent 3D objects. For each object we record a global 3D bounding box and observation metadata, and project these into every image frame using the CODa camera intrinsics and poses. The capture sessions include repeated campus traversals, so the same object is observed under varied weather, illumination, and viewpoints. </p>
<h2>Annotations</h2>
Annotations were created with a semi-automated pipeline:
<ol>
<li>Global 3D instance annotation and clustering across aligned trajectories to produce a unique <code>instance_id</code> per physical object.</li>
<li>Projection of global 3D boxes into all image frames to generate per-frame localization priors.</li>
<li>Automatic mask generation using Segment Anything (SAM) with projected boxes/points as prompts, followed by box re-fitting to mask contours.</li>
<li>Targeted manual verification and correction of low-confidence or flagged frames (annotators spot-check sampled frames and correct flagged items).</li>
</ol>
Each frame’s annotation is exported as a JSON record (image path, timestamp, camera pose, weather condition, instances). 
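<p>Steps 2 and 3 of the pipeline above (projecting a 3D box to obtain a 2D prior, then re-fitting the box to a mask contour) can be sketched roughly as follows. This is a minimal illustration and not the dataset's actual code. It assumes an axis-aligned box already expressed in the camera frame, an ideal pinhole camera with no lens distortion, and a contour given as a list of (x, y) pixel points. The function names and all numeric values are hypothetical.</p>

```python
from itertools import product


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned 3D box.

    Assumption: the box is already expressed in the camera frame
    (x right, y down, z forward). The world-to-camera transform
    is not shown here.
    """
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    return [
        (cx + sx * hx, cy + sy * hy, cz + sz * hz)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


def project_point(point, fx, fy, cx, cy):
    """Project one camera-frame point with an ideal pinhole model.

    Returns None for points at or behind the image plane.
    """
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def bounding_box(points):
    """Tight axis-aligned 2D box (x_min, y_min, x_max, y_max) around points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Step 2 (sketch): project the 3D box corners and take their 2D extent
# as a coarse localization prior. Intrinsics here are made-up values.
corners = box_corners(center=(0.0, 0.0, 10.0), size=(2.0, 2.0, 2.0))
projected = [project_point(c, 500.0, 500.0, 320.0, 240.0) for c in corners]
prior_box = bounding_box([p for p in projected if p is not None])

# Step 3 (sketch): once a mask contour is available, re-fit the box to it.
contour = [(270, 190), (300, 185), (360, 200), (350, 290), (280, 280)]
refit_box = bounding_box(contour)  # (270, 185, 360, 290)
```

<p>The projected prior is usually looser than the object's visible extent, because it bounds all eight corners of the 3D box. Re-fitting to the mask contour tightens it.</p>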
<br> Example of segmentation and projected 3D annotation:
<img src="https://dataverse.tdl.org/api/access/datafile/778871" alt="annotation_example.png">
<h2>Evaluation</h2>
We provide standardized train/val/test splits and baseline results (e.g., CLOVER-based Re-ID baselines), plus the official training and evaluation scripts. Configuration files, example commands, and pretrained checkpoints are included in the repository so that the reported baselines can be reproduced.
<h2>Dataset Organization</h2>
<ul>
<li>annotations/: per-frame annotation files (2D bounding box, segmentation contour), organized by camera and sequence: {cam}/{sequence}/{frame}.json</li>
<li>3d_bbox/global/: one JSON file per object class in the global frame: {object}.json</li>
<li>LICENSE: usage terms</li>
</ul>
<img src="https://dataverse.tdl.org/api/access/datafile/778870" alt="codare-id_data_structure.png">
<h2>Dataset Quality Statement</h2>
Data quality was maintained via a reproducible pipeline and a QA process that includes automated validators (missing files, empty masks, box/mask IoU checks) and manual spot checks on sampled observations. Known limitations (documented in the full report) include occasional projection errors under extreme foreshortening and noisy weather labels. Validation scripts are provided to re-run the quality checks.
<h2>Further Information</h2>
See the full Dataset Report included with this release for the detailed schema, collection logs, calibration files, validation scripts, and reproducibility instructions.
<h2>Download Dataset</h2>
<ol>
<li>Download the 3D annotation files (*.json) from the <code>data/3d_bbox/global</code> directory.</li>
<li>Download the 2D annotation archive (annotations.tar.gz) from the <code>data</code> directory and extract it: <pre><code>tar -xvzf annotations.tar.gz</code></pre></li>
<li>Follow the <a href="https://dataverse.tdl.org/dataset.xhtml?persistentId=doi:10.18738/T8/BBOQMV">CODa instructions</a> to download the CODa image files.</li>
</ol>
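<p>After extraction, the per-frame files can be indexed by following the {cam}/{sequence}/{frame}.json layout described under Dataset Organization. The sketch below is one possible way to do this with the Python standard library. The helper names are hypothetical, and it makes no assumption about the fields inside each JSON record (see the Dataset Report for the schema).</p>

```python
import json
from collections import defaultdict
from pathlib import Path


def index_annotations(root):
    """Group per-frame JSON files by (cam, sequence).

    Assumes the layout <root>/{cam}/{sequence}/{frame}.json.
    Paths in each group are sorted lexicographically, which is not
    numeric frame order unless frame names are zero-padded.
    """
    index = defaultdict(list)
    for path in Path(root).glob("*/*/*.json"):
        cam = path.parent.parent.name
        sequence = path.parent.name
        index[(cam, sequence)].append(path)
    for paths in index.values():
        paths.sort()
    return dict(index)


def load_record(path):
    """Load one per-frame annotation record as a plain Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

<p>For example, <code>index_annotations("annotations")</code> returns a dictionary keyed by (cam, sequence) pairs, and <code>load_record</code> can then be called on any of the listed paths.</p>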

Provider:

Texas Data Repository

Created:

2024-07-23
