
edouard-rolland/multi-perspective-dataset-plain-zebras

Hugging Face · Updated 2026-03-23 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/edouard-rolland/multi-perspective-dataset-plain-zebras
Resource description:
---
license: cc-by-4.0
language:
- en
pretty_name: Multi-Perspective Dataset of Plains Zebras — FAIR² Multi-Drone Wildlife Monitoring Dataset
task_categories:
- object-detection
- video-classification
tags:
- biology
- ecology
- wildlife-monitoring
- drone
- uav
- aerial-imagery
- zebra
- kenya
- savanna
- multi-drone
- synchronisation
- telemetry
size_categories:
- 10K<n<100K
description: Frame-level multi-drone telemetry dataset combining synchronised aerial video metadata, drone GPS tracks, camera parameters, and full flight-controller logs from wildlife monitoring of plains zebras in Kenya. Designed for multi-view geometry, synchronisation, and drone-based wildlife monitoring research.

# FAIR² COMPLIANCE METADATA
fair2_compliance:
  findable:
    doi: "" # To be assigned
    metadata_registry: ["DataCite", "GBIF"]
  accessible:
    open_access: true
    authentication_required: false
  interoperable:
    standards: ["Darwin Core", "TDWG", "FAIR2"]
  reusable:
    license_clear: true
    provenance_documented: true
  ai_ready:
    machine_readable: true
    structured_annotations: true

# DARWIN CORE COMPLIANCE
darwin_core:
  event_coverage:
    start_date: "2026-02-24"
    end_date: "2026-02-24"
    decimal_latitude: -0.006
    decimal_longitude: 36.872
    coordinate_uncertainty_meters: 5
    locality: "Ol Pejeta Conservancy, Laikipia County, Kenya"
    habitat: "African savanna and open grassland"
  occurrence_info:
    kingdom: "Animalia"
    taxa_included: ["Equus quagga"]
    sampling_protocol: "Coordinated multi-drone aerial video survey at 30–75 m altitude with continuous recording and per-frame GPS telemetry"

# PLATFORM SPECIFICATIONS
platform:
  type: "UAV"
  manufacturer: "DJI"
  model: "Mini 4 Pro"
  autonomy_mode: "autonomous"

# SENSOR SPECIFICATIONS
sensors:
- type: "RGB"
  manufacturer: "DJI"
  model: "Integrated 1/1.3\" CMOS camera"
  resolution: [3840, 2160]

# MISSION PARAMETERS
mission:
  altitude_m: [30, 45, 60, 75]
  speed_ms: "0–10"
  telemetry_available: true
---

# Dataset Card for Multi-Perspective Dataset of Plains Zebras

**Synchronised per-frame telemetry from four simultaneously operating drones, enabling research on multi-view wildlife monitoring, individual re-identification, 3D reconstruction, and multi-drone swarm survey protocols.**

## Dataset Details

### Dataset Description

- **Curated by:** Edouard Rolland
- **Authors:** Saadia Afridi, Steve Bullock, Alejandro Jarabo Penas, Lucie Laporte Devylder
- **Language(s):** English (metadata and documentation)
- **Repository:** [multi-perspective-dataset-plain-zebras](https://huggingface.co/datasets/edouard-rolland/multi-perspective-dataset-plain-zebras)
- **Paper:** [Drone Swarms for Multi-perspective Monitoring of Large Mammals in their Natural Habitats: Deployment and Field Trials](https://link.springer.com/chapter/10.1007/978-3-032-07638-0_22)

This dataset provides per-frame telemetry from a coordinated swarm of four simultaneously operating DJI Mini 4 Pro drones monitoring plains zebras (*Equus quagga*) at Ol Pejeta Conservancy, Laikipia County, Kenya. Collected on 24 February 2026, the dataset contains 11 synchronised video clips (86,985 frames) with complete GPS tracks, camera parameters, and full decoded DJI flight-controller logs.

The dataset was developed to demonstrate the FAIR² Drones standard for coordinated multi-platform wildlife surveys. Each drone was assigned a fixed altitude (30, 45, 60, or 75 m AGL) to maximise complementary coverage: vertical monitoring for census and movement analysis, horizontal monitoring for individual identification via flank markings. No annotation labels are included — this is a raw multi-perspective telemetry dataset intended for synchronised multi-view wildlife monitoring research.

Key features:

- **4-drone synchronised session** covering 723 seconds of simultaneous recording
- **11 per-drone video clips** with per-frame GPS, altitude, and camera EXIF
- **62 flight-controller columns** per frame (attitude, velocity, battery, gimbal, RC signal)
- **Cross-drone frame alignment table** with 21,747 rows at 100 ms tolerance
- **Darwin Core event tables**: 11 video-level events + 1 session-level event
- **Full processing pipeline** released as open-source Python/Node.js scripts
- **No annotation labels** — raw multi-perspective telemetry for geometry and monitoring research

### Supported Tasks and Applications

This dataset supports computer vision, ecological analysis, and autonomous systems research:

**🤖 Computer Vision Tasks:**

- Individual Re-identification (exploiting multi-perspective flank markings across the four drones)
- Multi-Object Tracking (temporal consistency within and across drone views)
- Object Detection (bounding box baselines at varying altitudes 30–75 m)
- 3D Pose Estimation and Reconstruction (multi-view geometry from field-recorded extrinsics)

**🌿 Ecological Applications:**

- Group size and movement estimation from simultaneous multi-altitude aerial views
- Altitude-dependent detection performance characterisation
- Animal response to drone presence
- Synchronised multi-perspective behavioural context reconstruction

**🚁 Drone Systems Research:**

- Multi-drone swarm synchronisation methods and evaluation
- Cross-platform temporal alignment validation

## Dataset Structure

### Directory Organisation

```
multi-perspective-dataset-plain-zebras/
├── data/
│   ├── raw/
│   │   └── mission_1/  # original 4K clips with DJI sidecar files
│   │       ├── drone_1/
│   │       │   ├── DJI_20260224133555_0001_D.MP4  # 3.6 GB
│   │       │   ├── DJI_20260224133555_0001_D.SRT
│   │       │   ├── DJI_20260224133555_0001_D.LRF
│   │       │   ├── DJI_20260224134121_0002_D.MP4  # 3.6 GB
│   │       │   ├── DJI_20260224134121_0002_D.SRT
│   │       │   ├── DJI_20260224134121_0002_D.LRF
│   │       │   ├── DJI_20260224134647_0003_D.MP4  # 815 MB
│   │       │   ├── DJI_20260224134647_0003_D.SRT
│   │       │   └── DJI_20260224134647_0003_D.LRF
│   │       ├── drone_2/ … (3 clips × {MP4, SRT, LRF})
│   │       ├── drone_3/ … (2 clips × {MP4, SRT, LRF})
│   │       └── drone_4/ … (3 clips × {MP4, SRT, LRF})
│   ├── occurrences/
│   │   └── mission_1/  # per-frame telemetry CSVs (11 files, 86,985 rows)
│   │       ├── drone_1-DJI_20260224133555_0001_D.csv  # 9,761 rows
│   │       ├── drone_1-DJI_20260224134121_0002_D.csv  # 9,770 rows
│   │       ├── drone_1-DJI_20260224134647_0003_D.csv  # 2,216 rows
│   │       ├── drone_2-DJI_20260224133554_0002_D.csv  # 9,757 rows
│   │       ├── drone_2-DJI_20260224134119_0003_D.csv  # 9,762 rows
│   │       ├── drone_2-DJI_20260224134645_0004_D.csv  # 2,231 rows
│   │       ├── drone_3-DJI_20260224133556_0001_D.csv  # 12,465 rows
│   │       ├── drone_3-DJI_20260224134252_0002_D.csv  # 9,274 rows
│   │       ├── drone_4-DJI_20260224133554_0001_D.csv  # 9,759 rows
│   │       ├── drone_4-DJI_20260224134120_0002_D.csv  # 9,762 rows
│   │       └── drone_4-DJI_20260224134646_0003_D.csv  # 2,228 rows
│   ├── sync/
│   │   └── mission_1/
│   │       └── synchronized_frames.csv  # full cross-drone sync table (21,747 rows × 93 cols)
│   ├── trimmed/
│   │   └── mission_1/
│   │       ├── drone_1.mp4  # trimmed 4K video — common window
│   │       ├── drone_2.mp4
│   │       ├── drone_3.mp4
│   │       ├── drone_4.mp4
│   │       └── synchronized_trim.csv  # trim-frame index table (18,292 rows)
│   ├── video_events.csv  # Darwin Core — one row per clip (11 rows)
│   └── session_events.csv  # Darwin Core — one row per mission (1 row)
├── flight_logs/
│   └── mission_1/
│       └── raw.csv  # decoded DJI v14 flight logs (34,954 rows × 62 cols)
└── scripts/  # full processing pipeline
```

### Data Instances

**Occurrence Files** (`data/occurrences/mission_1/<drone_id>-<video_id>.csv`):

Each CSV (84 columns) contains frame-by-frame records for one continuous video clip from one drone. The 11 files together cover 86,985 rows.
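
As a minimal sketch of working with these files, the snippet below concatenates every per-clip occurrence CSV in a mission folder and converts `sync_utc_ms` into a timezone-aware timestamp. The helper names (`load_occurrences`, `add_utc_timestamp`, `rows_per_clip`), the `utc_time` column, and the glob pattern are illustrative assumptions and are not part of the released pipeline.

```python
from pathlib import Path

import pandas as pd


def load_occurrences(occurrence_dir):
    """Concatenate all per-clip occurrence CSVs found in one mission folder."""
    # Assumed pattern, based on the <drone_id>-<video_id>.csv file names.
    paths = sorted(Path(occurrence_dir).glob("drone_*-*.csv"))
    if not paths:
        raise FileNotFoundError(f"no occurrence CSVs under {occurrence_dir}")
    return pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)


def add_utc_timestamp(occurrences):
    """Add a timezone-aware `utc_time` column derived from `sync_utc_ms`."""
    out = occurrences.copy()
    out["utc_time"] = pd.to_datetime(out["sync_utc_ms"], unit="ms", utc=True)
    return out


def rows_per_clip(occurrences):
    """Count frames per (drone_id, video_id) as a quick integrity check."""
    return occurrences.groupby(["drone_id", "video_id"]).size().rename("rows")
```

With the dataset downloaded, `load_occurrences("data/occurrences/mission_1")` should return 86,985 rows if all 11 files are present, and `rows_per_clip` can be compared against the per-file counts in the directory listing.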

**SRT-derived columns** (per frame, from DJI subtitle telemetry):

| Field | Example Value | Description |
|-------|---------------|-------------|
| `occurrenceID` | `mission_1_drone_1_DJI_20260224133555_0001_D_1` | Unique occurrence identifier (`<mission>_<drone>_<video>_<frame>`) |
| `eventID` | `multi-perspective-dataset-plain-zebras_mission_1_drone_1_…` | Darwin Core event identifier |
| `mission` | `mission_1` | Mission label within the session |
| `drone_id` | `drone_1` | Drone identifier (`drone_1` – `drone_4`) |
| `video_id` | `DJI_20260224133555_0001_D` | DJI filename stem |
| `frame` | `1` | 1-based frame index within the clip |
| `srt_timecode` | `00:00:00,000` | SRT timecode (`HH:MM:SS,mmm`) |
| `date_time` | `2026-02-24 13:35:56.003` | Frame timestamp as written by the drone clock, which was set to UTC+1 for this session (`YYYY-MM-DD HH:MM:SS.mmm`); use `sync_utc_ms` for true UTC |
| `sync_utc_ms` | `1771936556003` | Unix epoch milliseconds (UTC) — primary sync key |
| `latitude` | `-0.007247` | Aircraft decimal latitude (WGS84, from SRT) |
| `longitude` | `36.873998` | Aircraft decimal longitude (WGS84, from SRT) |
| `rel_alt` | `29.7` | Altitude above takeoff point (m) |
| `abs_alt` | `1941.856` | Altitude above sea level (m) |
| `iso` | `110` | Camera ISO |
| `shutter` | `1/2500.0` | Shutter speed |
| `fnum` | `1.7` | Aperture f-number |
| `ev` | `0` | Exposure value |
| `color_md` | `default` | Colour mode |
| `focal_len` | `24.0` | Focal length (mm, 35 mm equivalent) |
| `ct` | `5517` | Colour temperature (K) |
| `video_file` | `missions/mission_1/drones_videos/drone_1/DJI_…MP4` | Relative path to source MP4 |
| `scientificName` | `Equus quagga` | Darwin Core scientific name |
| `kingdom` | `Animalia` | Darwin Core kingdom |
| `taxonRank` | `species` | Darwin Core taxon rank |

**Flight-log-derived columns** (nearest-neighbour joined from raw flight log, ≤500 ms tolerance — 60 columns):

| Field | Example Value | Description |
|-------|---------------|-------------|
| `fl_latitude` / `fl_longitude` | `-0.007244` / `36.873994` | Aircraft GPS from flight log (cross-check) |
| `fl_altitude` | `1913.256` | GPS altitude ASL from flight log (m) |
| `fl_height` | `1.1` | Barometric height AGL (m) |
| `heightMax` | `31.5` | Maximum height reached in flight (m) |
| `vpsHeight` | `0.0` | Vision Positioning System height (m, valid < 10 m) |
| `xSpeed` / `ySpeed` / `zSpeed` | `0.0` / `0.0` / `0.0` | Body-frame velocity (m/s) |
| `xSpeedMax` / `ySpeedMax` / `zSpeedMax` | `2.1` / `1.8` / `0.5` | Maximum recorded velocities (m/s) |
| `pitch` / `roll` / `yaw` | `-0.9` / `1.7` / `-3.8` | Aircraft attitude (degrees) |
| `flyTime` | `10.0` | Seconds since takeoff |
| `flycState` | `GPSAtti` | Flight controller state |
| `flycCommand` | — | Flight controller command |
| `flightAction` | — | Current flight action |
| `goHomeStatus` | — | Return-to-home status |
| `isGpsUsed` | `True` | Whether GPS fix is active |
| `gpsNum` | `32` | Number of GPS satellites tracked |
| `gpsLevel` | `5` | GPS signal level (0–5) |
| `droneType` | — | Aircraft model code |
| `batteryPercent` | `92` | Battery charge level (%) |
| `voltage` | `16.4` | Battery pack voltage (V) |
| `batteryCurrent` | `2.1` | Current draw (A) |
| `batteryCellVoltages` | `4.11;4.11;4.11;4.11` | Per-cell voltages (semicolon-separated, V) |
| `batteryTemp` | `28.0` | Battery temperature (°C) |
| `gimbalMode` | `YawFollow` | Gimbal mode |
| `gimbalPitch` / `gimbalRoll` / `gimbalYaw` | `0.0` / `0.0` / `0.0` | Gimbal attitude (degrees) |
| `rcUplinkSignal` / `rcDownlinkSignal` | `90.0` / `92.0` | RC link signal strength (%) |
| `rcAileron` / `rcElevator` / `rcThrottle` / `rcRudder` | `1024` | RC stick positions (PWM, 1024 = centre) |
| `isPhoto` / `isVideo` | `False` / `True` | Camera photo/recording state |
| `homeLat` / `homeLon` / `homeAlt` | `-0.007257` / `36.873914` / `1941.8` | Home-point GPS coordinates |
| `homeHeightLimit` | `120.0` | Maximum altitude limit (m) |
| `homeGoHomeHeight` | `30.0` | Return-to-home altitude (m) |

> **Note:** Flight-log columns are `NaN` for frames that fall outside the flight-log recording window (e.g. before arm or after video stop). All frames within the active flying window are matched at 100% in this dataset.

**Naming Convention:**

```
{drone_id}-{video_id}.csv
Example: drone_1-DJI_20260224133555_0001_D.csv
         └drone─┘ └──────────video_id──────────┘
```

**Temporal Information:**

- Date: 2026-02-24 (single-day survey)
- Session start: 13:35:54 UTC+1 / 12:35:54 UTC
- Session end: 13:48:01 UTC+1 / 12:48:01 UTC (727.2 s total)
- Common trimmed recording window: 609.7 s (~10 min, 18,292 frames per drone)
- Dry season, Laikipia County, Kenya

**Synchronisation Table** (`data/sync/mission_1/synchronized_frames.csv`, 21,747 rows × 93 cols):

Full cross-drone frame alignment using all four drones as co-anchored sources. Each row represents a ~33 ms tick. Columns follow the pattern `{drone_id}_{field}` for all SRT fields from each drone, plus `sync_utc_ms` as the join key.

**Trimmed Synchronisation Table** (`data/trimmed/mission_1/synchronized_trim.csv`, 18,292 rows × 14 cols):

Compact trim-frame index table for the common 4-drone recording window. Columns: `trim_frame`, `sync_utc_ms`, and `{drone_id}_video_id`, `{drone_id}_frame`, `{drone_id}_srt_timecode` for each of the four drones. The `trim_frame` index directly addresses frames in the corresponding `drone_N.mp4` trimmed videos.

**Darwin Core Event Tables:**

`data/video_events.csv` (11 rows × 32 cols) — one row per video clip. Key columns: `eventID`, `parentEventID`, `eventDate`, `eventTime`, `endTime`, `eventDurationSeconds`, `decimalLatitude`, `decimalLongitude`, `footprintWKT`, `samplingProtocol`, `samplingEffort`, `dynamicProperties` (JSON with `droneId`, `aircraftModel`, mean AGL height, battery state, GPS quality, sync method).

`data/session_events.csv` (1 row × 51 cols) — one row for the full mission. Extends the video-event columns with Humboldt Eco fields: `eco:inventoryTypes`, `eco:protocolNames`, `eco:protocolDescriptions`, `eco:targetTaxonomicScope`, `eco:samplingPerformedBy`, `eco:siteCount`, etc.

**Raw Videos** (`data/raw/mission_1/<drone_id>/<video_id>.{MP4,SRT,LRF}`):

Original 4K footage as recorded on-board, with DJI sidecar files:

- **MP4** — H.264/H.265-encoded 4K (3840 × 2160) video at ~30 fps
- **SRT** — DJI subtitle telemetry file; one entry per frame with GPS, altitude, camera settings, and UTC timestamp (source for the occurrence CSVs)
- **LRF** — DJI low-resolution proxy file (~480p); useful for fast preview and frame-level browsing without decoding the full 4K stream

File naming follows the DJI convention: `DJI_YYYYMMDDHHMMSS_NNNN_D` where `NNNN` is the clip index on the SD card.

**Raw Flight Log** (`flight_logs/mission_1/raw.csv`, 34,954 rows × 62 cols):

All four drone flight logs concatenated. One row per ~100 ms flight-controller tick. `logFile` column identifies the source drone. Contains all 60 flight-log columns described above, plus `logFile` and `dateTime` (ISO 8601 UTC string).

### Data Fields

Key field groups:

**🌿 Darwin Core Event Fields** (`data/video_events.csv`, `data/session_events.csv`):

- `eventID`, `parentEventID`, `eventDate`, `eventTime`, `endTime`, `eventDurationSeconds`
- `decimalLatitude`, `decimalLongitude`, `footprintWKT` (bounding polygon in WKT)
- `samplingProtocol`, `samplingEffort`, `locationID`, `countryCode`, `habitat`
- `dynamicProperties` JSON with per-drone metadata (drone ID, aircraft model, mean AGL height, battery start/end %, mean GPS satellite count, sync method)
- Session events additionally include Humboldt Eco extensions (`eco:inventoryTypes`, `eco:protocolNames`, `eco:targetTaxonomicScope`, `eco:samplingPerformedBy`, etc.)
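
Because `dynamicProperties` is stored as a JSON string, it usually needs to be expanded before filtering or grouping on it. The sketch below is one possible way to do that with `pandas`. Only the `droneId` and `aircraftModel` keys are named in this card, so any other key is whatever the file actually contains. The function name and the `dp_` column prefix are illustrative assumptions.

```python
import json

import pandas as pd


def expand_dynamic_properties(events):
    """Expand the JSON in `dynamicProperties` into one column per key."""
    # Treat missing or non-string cells as an empty object (assumption).
    parsed = events["dynamicProperties"].map(
        lambda s: json.loads(s) if isinstance(s, str) else {}
    )
    props = pd.json_normalize(parsed.tolist())
    props.index = events.index          # keep row alignment with the events table
    props = props.add_prefix("dp_")     # hypothetical prefix to avoid name clashes
    return events.drop(columns=["dynamicProperties"]).join(props)
```

For example, `expand_dynamic_properties(pd.read_csv("data/video_events.csv"))` would expose the drone identifier as `dp_droneId`.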

**📍 Geolocation** (occurrence files):

- `latitude` / `longitude` (WGS84, from SRT) — primary per-frame GPS
- `fl_latitude` / `fl_longitude` / `fl_altitude` (from flight log, cross-check)
- `rel_alt` — relative altitude above takeoff point (m)
- `abs_alt` — absolute altitude above sea level (m)
- `fl_height` — barometric height AGL (m)
- `homeLat` / `homeLon` / `homeAlt` — home-point GPS

**📷 Camera Metadata** (occurrence files):

- `iso`, `shutter`, `fnum`, `ev`, `color_md`, `focal_len`, `ct`
- All derived from the DJI SRT frame metadata block

**✈️ Flight Dynamics** (occurrence files, flight-log-derived):

- `xSpeed`, `ySpeed`, `zSpeed` and their maximums — body-frame velocity (m/s)
- `pitch`, `roll`, `yaw` — aircraft attitude (degrees)
- `gimbalPitch`, `gimbalRoll`, `gimbalYaw` and limit flags
- `gpsNum`, `gpsLevel` — GPS quality indicators
- `flycState`, `flycCommand`, `flightAction` — flight controller state machine
- `flyTime` — seconds since takeoff

**🔋 Battery and Systems** (occurrence files, flight-log-derived):

- `batteryPercent`, `voltage`, `batteryCurrent`, `batteryCurrentCapacity`, `batteryFullCapacity`
- `batteryCellNum`, `batteryCellVoltages`, `batteryCellVoltageDev`, `batteryTemp`, `batteryTempMin`, `batteryTempMax`
- `rcUplinkSignal`, `rcDownlinkSignal`, `rcAileron`, `rcElevator`, `rcThrottle`, `rcRudder`
- `isPhoto`, `isVideo`, `sdCardInserted`, `sdCardState`
- `homeHeightLimit`, `homeGoHomeHeight`

**🔀 Synchronisation** (`data/sync/`, `data/trimmed/`):

- `sync_utc_ms` — universal UTC epoch milliseconds join key, present in all files
- `synchronized_frames.csv` — full cross-drone correspondence (21,747 rows, 93 cols, all SRT fields per drone)
- `synchronized_trim.csv` — compact trim-frame index (18,292 rows, 14 cols) directly indexing the trimmed MP4s

### Data Splits

This dataset has no pre-defined train/val/test splits.
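
Users who need splits could derive them from the trim-frame index. The following is a rough sketch, not a recommended protocol. It cuts the common window into contiguous time blocks so that all four views of a given instant land in the same split and neighbouring, near-identical frames are not spread across splits. The 70/15/15 fractions and the `temporal_split` name are hypothetical.

```python
import numpy as np
import pandas as pd


def temporal_split(sync_trim, fractions=(0.7, 0.15, 0.15)):
    """Label rows of the trim table as train/val/test in contiguous time blocks."""
    ordered = sync_trim.sort_values("sync_utc_ms").reset_index(drop=True)
    n = len(ordered)
    # Cumulative cut points, e.g. [70, 85, 100] for n = 100.
    bounds = np.rint(np.cumsum(fractions) * n).astype(int)
    labels = np.full(n, "test", dtype=object)
    labels[: bounds[0]] = "train"
    labels[bounds[0] : bounds[1]] = "val"
    ordered["split"] = labels
    return ordered
```

Since each row of `synchronized_trim.csv` references one frame per drone, the resulting `split` label applies to all four views at once.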

## Platform and Mission Specifications

### 🚁 Platform Details

**Type:** UAV (Unmanned Aerial Vehicle)

**Hardware:**

- **Platform:** DJI Mini 4 Pro
- Max flight time: ~34 minutes
- Wind resistance: Beaufort 5 (up to ~10 m/s)
- Number of platforms: 4 (simultaneous coordinated operation)

**Autonomy:**

- Mode: Autonomous flight with GPS stabilisation
- Navigation: Multi-drone coordination based on ground-station control
- Collision avoidance: Obstacle detection enabled
- Return-to-home: Automatic on signal loss

### 📷 Sensor Specifications

**Primary Sensor: DJI Integrated 1/1.3″ CMOS Camera**

- Type: RGB
- Resolution: 3840 × 2160 pixels (4K)
- Frame rate: ~30 fps (nominal)
- Bit depth: 8-bit
- Format: MP4 video (H.264/H.265)

**Telemetry Included:**

- GPS coordinates per frame (DJI SRT sidecar files, ~33 ms intervals)
- Camera settings (ISO, shutter, aperture, focal length, exposure value, colour temperature)
- Full flight-controller log (62 columns, ~100 ms ticks)
- UTC millisecond timestamp for cross-drone synchronisation (`sync_utc_ms`)

### 🗺️ Mission Parameters

**Flight Specifications:**

- Altitudes: 30, 45, 60, and 75 m AGL (one fixed altitude per drone)
- Speed: 0–10 m/s
- Flight pattern: Autonomous flight following the algorithm described in the associated paper (Rolland et al., 2025)
- Common recording window: 723 s (~12 min)
- Total clips: 11 (across 4 drones; some drones have more clips due to SD card splits)

**Environmental Conditions:**

- Season: Dry season (February)
- Weather: Cloudy
- Location: Ol Pejeta Conservancy, Laikipia County, Kenya
- Elevation: ~1,941 m ASL
- Habitat: African savanna / open grassland
- Time of day: Daytime

### 🔍 Sampling Protocol

**Survey Design:**

- Coordinated four-drone swarm; each drone assigned a different altitude for complementary multi-perspective coverage
- Focal group follow: two plains zebra herds tracked continuously during the session
- Continuous video recording at 4K/~30 fps

**Flight Operations:**

- Licensed drone operators with Kenya Civil Aviation Authority approval
- Four licensed drone operators supervising the autonomous flight of the 4 drones
- Animals monitored for disturbance response

**Data Collection:**

- GPS telemetry embedded in DJI SRT sidecar files (one file per video clip)
- DJI binary flight logs (v14) decoded via DJI Open Platform API

**Quality Control:**

- Field notes recorded for the session
- Cross-drone frame match rate: 100% within 100 ms tolerance
- Per-clip visual inspection of video quality

## Dataset Creation

### Curation Rationale

This dataset was created to address two key research questions:

1. **How can coordinated drone swarms provide complementary multi-perspective coverage of wildlife?** By assigning each drone a fixed altitude (30–75 m AGL), the dataset simultaneously captures a census bird's-eye viewpoint and closer identification viewpoints, enabling analysis of the trade-offs between altitude, resolution, and field of view.
2. **What are the technical requirements for temporal and spatial synchronisation across independent drone platforms?** The pipeline documents and resolves the challenges of cross-drone clock alignment, SRT timezone ambiguity, and DJI v14 flight-log encryption, providing a fully reproducible open-source workflow.

The dataset fills a critical gap: most drone wildlife datasets contain single-perspective video; multi-drone synchronised datasets with complete flight-controller telemetry and open processing pipelines are rare. This dataset and its pipeline serve as the primary case study for the FAIR² Drones standard.

### Source Data

#### Data Collection and Processing

**Field Collection:**

1. **Planning:**
   - Site selected based on known plains zebra population in the open savanna of Ol Pejeta Conservancy
   - Four DJI Mini 4 Pro drones assigned altitudes of 30, 45, 60, and 75 m AGL
   - Operators briefed
   - Flights conducted during daylight hours
2. **Collection:**
   - Operators located a focal zebra group
   - Drones launched and ascended to assigned altitudes
   - Simultaneous recording triggered by the ground station
   - Continuous 4K video and SRT telemetry recorded during the session
   - DJI binary flight logs recorded automatically on-board and transferred post-flight
3. **Post-Processing:**
   - SRT files parsed to per-frame occurrence CSVs with UTC millisecond timestamps
   - DJI v14 binary flight logs decrypted via the DJI Open Platform API and decoded with `dji-log-parser-js`
   - Cross-drone frame alignment on `sync_utc_ms` using nearest-neighbour matching
   - Occurrence CSVs enriched with 62 flight-log columns
   - Video clips trimmed and concatenated to the common recording window using `ffmpeg`
   - Darwin Core event tables generated
   - HuggingFace-ready dataset assembled

**Software and Tools Used:**

- Flight control: DJI RC-N1 controller + DJI Fly app 5.17.0 (Android)
- Video capture: DJI Mini 4 Pro onboard recording
- DJI binary log decoder: `dji-log-parser-js` + custom Node.js script
- DJI Open Platform API: keychain decryption for v14 logs
- Telemetry parsing and enrichment: Python (`pandas`, `numpy`)
- Darwin Core event builder: custom Python scripts (this repository)
- Video trimming: `ffmpeg`

### Annotations

This is a **raw telemetry dataset** with no animal detection boxes, track identities, or behaviour labels. All telemetry fields (GPS, camera settings, attitude, battery, gimbal) are automatically derived from on-board sensors and require no manual annotation. The synchronisation table (`data/sync/`) links frames across drone views but does not include any manual labels.

Researchers wishing to add annotations (detection boxes, identities, behaviours) can use tools such as CVAT and align annotations to frames via `sync_utc_ms`.
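
One way this alignment might look in practice is sketched below. It assumes a hypothetical annotation export with a 0-based `frame` column for a single clip, and it assumes that `{drone_id}_frame` in `synchronized_trim.csv` uses the same 1-based clip frame index as the occurrence files. The function name and the `frame_offset` parameter are illustrative and not part of the released scripts.

```python
import pandas as pd


def annotations_to_other_views(annotations, sync_trim, drone_id, video_id,
                               frame_offset=1):
    """Attach sync_utc_ms and the matching frames of all drones to annotations."""
    ann = annotations.copy()
    # Assumption: the annotation tool numbers frames from 0, whereas the
    # dataset tables use 1-based `frame` values.
    ann["frame"] = ann["frame"] + frame_offset

    vid_col = f"{drone_id}_video_id"
    frame_col = f"{drone_id}_frame"
    # Restrict the trim table to rows that reference the annotated clip.
    clip_rows = sync_trim[sync_trim[vid_col] == video_id]

    # Inner join: annotations outside the common window are dropped.
    return ann.merge(clip_rows, left_on="frame", right_on=frame_col, how="inner")
```

Each returned row carries the original label together with `sync_utc_ms`, `trim_frame`, and the `{drone_id}_video_id` / `{drone_id}_frame` pairs of the other drones, which is enough to locate the same instant in every view.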

### Personal and Sensitive Information

**Privacy and Security Considerations:**

**Human Subjects:**

- Flights conducted in a managed conservancy away from public areas

**Wildlife and Location:**

- Target species *Equus quagga* (plains zebra) is not endangered (IUCN: Least Concern)
- Location corresponds to a well-managed, access-controlled conservancy (Ol Pejeta)
- Full GPS coordinates included to support scientific replication

**Security:**

- No security concerns
- Data collected in coordination with Ol Pejeta Conservancy management

## Considerations for Using the Data

### Dataset Statistics

**Survey Summary:**

| Property | Value |
|---|---|
| Session date | 2026-02-24 |
| Location | Ol Pejeta Conservancy, Laikipia, Kenya |
| GPS bounding box | −0.0071 to −0.0058 N, 36.8705 to 36.8737 E |
| Elevation (ASL) | ~1,942 m (min 1,941.86 m, max 1,980.55 m) |
| Target species | Plains zebra (*Equus quagga*) |
| Aircraft | DJI Mini 4 Pro × 4 (simultaneous) |
| Drone altitudes (AGL) | 30, 45, 60, 75 m (one per drone) |
| Full session duration | 727.2 s (~12 min) |
| Common trimmed window | 609.7 s (~10 min) |
| Video clips | 11 |
| Raw video files | 11 MP4 (4K, ~38 GB total) + 11 SRT + 11 LRF |
| Total telemetry frames | 86,985 |
| Cross-drone sync rows (full) | 21,747 |
| Trimmed sync rows | 18,292 |
| Raw flight-log rows | 34,954 |
| Frame rate | ~30 fps |
| Synchronisation method | Per-frame UTC from SRT, cross-validated against `isVideo` edge (offset < 1.3 s) |

### Bias, Risks, and Limitations

**⚠️ Known Biases:**

1. **Geographic Bias:**
   - Data from a single site (Ol Pejeta Conservancy, Laikipia)
   - May not generalise to other savanna ecosystems or terrain types
2. **Temporal Bias:**
   - Single-day survey (2026-02-24, dry season)
   - No seasonal variation or multi-day coverage
   - Daytime flights only; nocturnal behaviour not captured
3. **Species Bias:**
   - Single species (*Equus quagga*, plains zebra)
   - Only the two herds are consistently in frame; background animals may be partially visible
4. **Environmental Bias:**
   - Dry season conditions; vegetation cover may differ from wet season
   - Open grassland terrain; performance in denser vegetation is untested

**Technical Limitations:**

- **No annotations:** No detection boxes, track identities, or behaviour labels are included
- **GPS accuracy:** ±5 m typical
- **SRT timezone assumption:** Processing assumes the device clock was set to UTC+1 (CET); different field timezones require reconfiguration in `config.yaml`
- **Flight-log encryption:** DJI v14+ logs require API decryption; API availability is subject to DJI policy changes (mitigated by cached `.keychain.json` sidecar files)
- **Clip boundaries:** Each drone records multiple sequential clips; raw clip indices are not globally continuous across clips (handled by the pipeline)

### Recommendations

**Best Practices for Using This Dataset:**

1. **For Multi-View Geometry / 3D Reconstruction:**
   - Use `sync_utc_ms` as the universal join key; do not rely on raw per-clip frame indices across drones
   - Gimbal angles (`gimbalPitch`, `gimbalRoll`, `gimbalYaw`) provide orientation; combine with GPS for camera extrinsics
2. **For Individual Re-identification:**
   - Lower-altitude drones (30, 45 m) provide higher-resolution flank markings
   - Higher-altitude drones (60, 75 m) offer wider field of view for group-level context
3. **For Ecological Analysis:**
   - Dataset represents a single dry-season session; do not extrapolate to year-round or cross-site statistics without additional data
4. **For Reproducing the Pipeline:**
   - Set the processing machine to UTC+1 or configure the timezone offset in `config.yaml`
   - Cache `.keychain.json` sidecar files under version control to avoid API dependency

**What This Dataset Should NOT Be Used For:**

- Estimating absolute population sizes (non-systematic, single-session sampling)
- Generalising behaviour or detection performance to other sites, seasons, or species without additional validation

## Licensing Information

**Dataset License:** [CC BY 4.0 (Creative Commons Attribution 4.0 International)](https://creativecommons.org/licenses/by/4.0/)

**Citation Requirement:** Please cite the dataset and the associated paper if you use this data (see [Citation section](#citation)).

**Code License:** MIT License for scripts in this repository

## Citation

**If you use this dataset, please cite:**

**Associated Paper:**

```bibtex
@InProceedings{10.1007/978-3-032-07638-0_22,
  author    = {Rolland, Edouard G. A. and Meier, Kilian and Gr{\o}ntved, Kasper A. R. and Laporte-Devylder, Lucie and Maalouf, Guy and Lundquist, Ulrik P. S. and Christensen, Anders L.},
  editor    = {Mathieu, Philippe and De la Prieta, Fernando},
  title     = {Drone Swarms for Multi-perspective Monitoring of Large Mammals in their Natural Habitats: Deployment and Field Trials},
  booktitle = {Advances in Practical Applications of Agents, Multi-Agent Systems, and Computational Social Science: The PAAMS Collection},
  year      = {2026},
  publisher = {Springer Nature Switzerland},
  address   = {Cham},
  pages     = {266--277},
  isbn      = {978-3-032-07638-0}
}
```

**Dataset:**

```bibtex
@dataset{multi_perspective_zebras_2026,
  author  = {Rolland, Edouard and Afridi, Saadia and Bullock, Steve and {Jarabo Penas}, Alejandro and {Laporte Devylder}, Lucie},
  title   = {Multi-Perspective Dataset of Plains Zebras},
  year    = {2026},
  url     = {https://huggingface.co/datasets/edouard-rolland/multi-perspective-dataset-plain-zebras},
  license = {CC-BY-4.0},
  note    = {Ol Pejeta Conservancy, Kenya. 4 × DJI Mini 4 Pro, synchronised multi-drone aerial survey.}
}
```

**FAIR² Drone Data Standard:**

```bibtex
@article{kline2025fair2,
  title  = {Toward a FAIR² Standard for Drone-Based Wildlife Monitoring Datasets},
  author = {Kline, Jenna and others},
  year   = {2025},
  note   = {In preparation}
}
```

## Acknowledgements

This work is supported by the **WildDrone MSCA Doctoral Network** funded by EU Horizon Europe under grant agreement no. 101071224, and by the **Innovation Fund Denmark** for the project DIREC (9142-00001B).

We thank:

- **Ol Pejeta Conservancy** for site access and logistical support
- **Data Collection Team:**
  - Saadia Afridi
  - Steve Bullock
  - Alejandro Jarabo Penas
  - Lucie Laporte Devylder
  - Elzbieta Pastucha

## Validation and Quality Metrics

**🤖 AI-Readiness Validation:**

- [x] Machine-readable metadata (YAML front matter complete)
- [x] Structured telemetry in Darwin Core format
- [ ] Train/val/test splits pre-defined (users should create)
- [x] Data loading code provided (Python pipeline scripts)
- [ ] Example notebooks (planned)

**🌿 Darwin Core Validation:**

- [x] Event records complete and valid (11 video events + 1 session event)
- [x] Occurrence records complete and valid (86,985 frames across 11 clips)
- [x] Scientific names validated against GBIF backbone
- [x] Coordinates in WGS84
- [x] Sampling protocol documented
- [ ] GBIF dataset registration (planned)

**⚠️ FAIR² Compliance Checklist:**

- [ ] **Findable:** DOI to be assigned
- [x] **Accessible:** Open access via HuggingFace (CC-BY-4.0)
- [x] **Interoperable:** Darwin Core, WGS84, ISO 8601, CSV/JSON formats
- [x] **Reusable:** CC-BY-4.0 license, full provenance and pipeline documented
- [x] **AI-Ready:** Machine-readable, structured, versioned

## Code and Tools

**Data Loading (Python):**

```python
import pandas as pd

# Load session-level events
sessions = pd.read_csv('data/session_events.csv')

# Load video-level events
videos = pd.read_csv('data/video_events.csv')

# Load occurrence records for a specific drone clip
occurrences = pd.read_csv(
    'data/occurrences/mission_1/drone_1-DJI_20260224133555_0001_D.csv'
)

# Load the full cross-drone synchronisation table
sync = pd.read_csv('data/sync/mission_1/synchronized_frames.csv')

# Load the compact trim-frame index (references trimmed MP4s directly)
sync_trim = pd.read_csv('data/trimmed/mission_1/synchronized_trim.csv')

# Load the raw flight logs
flight_logs = pd.read_csv('flight_logs/mission_1/raw.csv')

# Get all four drone frames aligned to a given moment
sync_key = occurrences.loc[occurrences['frame'] == 100, 'sync_utc_ms'].values[0]
aligned = sync[sync['sync_utc_ms'].between(sync_key - 50, sync_key + 50)]
```

**Processing Scripts:**

See `pipeline/` for:

- `parse_srt.py` — Parse DJI SRT files into per-frame occurrence CSVs with UTC timestamps
- `sync.py` — Cross-drone frame alignment on `sync_utc_ms` using nearest-neighbour matching
- `trim.py` — Trim and concatenate video clips to the common recording window via `ffmpeg`
- `flight_logs.py` — Decrypt and decode DJI v14 binary flight logs
- `enrich.py` — Join 62 flight-log columns to occurrence CSVs
- `events.py` — Generate Darwin Core video and session event tables
- `assemble.py` — Assemble the HuggingFace-ready dataset directory

Run everything with:

```bash
python run_pipeline.py
```

See the [Quickstart](#quickstart) section for full usage instructions.
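
The nearest-neighbour matching on `sync_utc_ms` can be approximated with `pandas.merge_asof`. The sketch below is not the released `sync.py`; it only illustrates the technique for two drones, using the 100 ms tolerance quoted in this card. The function name and the `other_frame`, `other_sync_utc_ms`, and `offset_ms` columns are illustrative assumptions.

```python
import pandas as pd


def align_two_drones(ref, other, tolerance_ms=100):
    """Match each reference frame to the nearest-in-time frame of another drone."""
    left = ref[["sync_utc_ms", "frame"]].sort_values("sync_utc_ms")
    right = (
        other[["sync_utc_ms", "frame"]]
        .sort_values("sync_utc_ms")
        .rename(columns={"frame": "other_frame"})
    )
    # Keep the matched timestamp so the residual offset can be inspected.
    right["other_sync_utc_ms"] = right["sync_utc_ms"]

    # Frames with no partner within the tolerance get NaN in the `other_*` columns.
    matched = pd.merge_asof(
        left, right, on="sync_utc_ms", direction="nearest", tolerance=tolerance_ms
    )
    matched["offset_ms"] = matched["other_sync_utc_ms"] - matched["sync_utc_ms"]
    return matched
```

Applied to two occurrence tables that overlap in time, the share of non-null `other_frame` values gives a match rate, and `offset_ms` shows how far apart the paired frames are.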

---

## Quickstart

### Requirements

- Python 3.10+
- Node.js 12+ (for DJI flight log decoding)
- A DJI Open Platform API key — free at <https://developer.dji.com/flight_logs/> (only needed for v13+ encrypted logs; skip if `.keychain.json` sidecars are already present)

### Installation

```bash
git clone <repo-url>
cd fair_drone_data_standard
python -m venv .venv && source .venv/bin/activate
pip install pandas numpy requests pyyaml
npm install  # installs dji-log-parser-js and node-fetch
```

### Add your data

Drop your mission folders into `missions/` following this layout:

```
missions/
  mission_1/
    drones_videos/
      drone_1/   ← DJIFlightRecord .SRT / .MP4 / .LRF files
      drone_2/
      ...
    flight_logs/
      drone_1/   ← DJIFlightRecord_*.txt (and optional *.keychain.json sidecars)
      drone_2/
      ...
  mission_2/
    ...
```

### Configure

Open `config.yaml` and fill in your dataset identity, species, location, and platform fields. Every option is documented inline.

```yaml
dataset:
  name: my-dataset-name
  institution: "My Institution"
taxon:
  scientificName: "Equus quagga"
location:
  country: Kenya
  locality: "Ol Pejeta Conservancy"
platform:
  aircraftModel: "DJI Mini 4 Pro"
```

### Run

```bash
export DJI_API_KEY="your_key_here"  # or skip if keychains already cached
python run_pipeline.py              # processes all missions
```

**Selective options:**

```bash
python run_pipeline.py --missions mission_1          # single mission
python run_pipeline.py --skip-flight-logs            # reuse existing raw.csv
python run_pipeline.py --skip-assemble               # skip dataset copy step
python run_pipeline.py --config path/to/config.yaml  # custom config
```

### Outputs

```
output/
  mission_1/
    occurrences/            ← per-frame SRT telemetry CSVs (one per clip)
    sync/                   ← cross-drone synchronised frame table
    flight_logs/raw.csv     ← decoded DJI binary logs
    occurrences_enriched/   ← per-frame CSVs + flight-log columns joined
    events/                 ← Darwin Core video_events.csv + session_events.csv
dataset_output/             ← assembled HuggingFace-ready directory
```

---

## Glossary

- **AGL:** Above Ground Level — altitude measured from terrain surface
- **ASL:** Above Sea Level — absolute altitude
- **Darwin Core:** Biodiversity data standard maintained by TDWG
- **FAIR²:** FAIR principles extended for AI-ready and drone-specific datasets
- **SRT:** SubRip subtitle format; used by DJI for sidecar files that carry per-frame telemetry
- **sync_utc_ms:** Unix epoch milliseconds in UTC — universal synchronisation key across all drones and data streams
- **TDWG:** Biodiversity Information Standards (Taxonomic Databases Working Group)
- **UAV:** Unmanned Aerial Vehicle (drone)
- **WGS84:** World Geodetic System 1984 — standard GPS coordinate reference system
- **WKT:** Well-Known Text format for geographic geometries

## Dataset Card Authors

Edouard Rolland

## Dataset Card Contact

For questions about this dataset:

- **Primary Contact:** Edouard Rolland
- **GitHub:** [@edouardrolland](https://github.com/edouardrolland)
- **Issues:** [GitHub repository issues](https://github.com/edouard-rolland/multi-perspective-dataset-plain-zebras/issues)
- **HuggingFace:** <https://huggingface.co/datasets/edouard-rolland/multi-perspective-dataset-plain-zebras>

---

**Version History:**

- v1.0.0 (2026-03-23): Initial release

---

*This dataset card follows the FAIR² Drone Data Standard and is modelled on the [KABR Behavior Telemetry dataset card](https://huggingface.co/datasets/imageomics/kabr-behavior-telemetry).*
Provided by:
edouard-rolland