Localization & Perception

scene-db
Find the failures hiding in your driving logs

Ingest KITTI or nuScenes data, chunk by time, extract features — then search for the exact edge cases that break your localization or perception stack.

$ pip install scene-db

The problem

You have terabytes of rosbag / dataset logs. Somewhere in there are the 30-second clips where your system failed — but finding them means scrubbing through hours of data.

Localization drift after GPS outage

Your EKF diverged in a tunnel, but the log is 2 hours long. Which 10 seconds matter?

Missed detection at dusk

Perception worked in daylight but silently dropped detections as lighting changed. Where exactly?

Sudden stop = which scene?

The vehicle braked hard. Was it a false positive from the planner, or a real obstacle the detector caught late?

"It works on my data"

You need to share the exact failure scene with a teammate — not a 50GB bag file.

Edge cases you can surface

scene-db chunks your logs, computes motion features, and generates searchable captions. Here are real queries your team can run:

Localization

High yaw rate — IMU heading drift

Sharp turns stress the EKF heading estimate. IMU gyro bias and wheel slip both accumulate here. Found 30+ deg/s peaks in KITTI and PPC data.

$ scene-db search --min-yaw 20 --sort yaw
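The yaw-rate feature behind this query can be approximated from heading samples alone. A minimal sketch (illustrative function name, not scene-db's actual implementation; the key detail is handling the 360° wrap):

```python
def yaw_rates(headings_deg, dt=0.1):
    """Yaw rate (deg/s) between successive heading samples, wrap-safe."""
    rates = []
    for a, b in zip(headings_deg, headings_deg[1:]):
        d = (b - a + 180.0) % 360.0 - 180.0   # wrap difference to [-180, 180)
        rates.append(abs(d) / dt)
    return rates

# A steady 90-degree turn over 3 s at 10 Hz peaks around 30 deg/s,
# enough to match --min-yaw 20
headings = [i * 3.0 for i in range(31)]
print(max(yaw_rates(headings)))
```

Without the wrap correction, a heading crossing from 179° to -179° would read as a 358°/sample spike instead of 2°.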
Localization

High speed — GPS latency amplified

At 77 km/h, a 50 ms GPS delay means roughly 1.1 m of position error. LiDAR scan distortion from ego-motion is also maximal.

$ scene-db search --min-speed 60 --sort speed
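The number quoted above is plain kinematics: at constant speed, the position error from a delayed fix is speed times delay. A quick check:

```python
def latency_error_m(speed_kmh: float, delay_s: float) -> float:
    """Position error (m) caused by a delayed GPS fix at constant speed."""
    return speed_kmh / 3.6 * delay_s   # km/h -> m/s, then multiply by delay

print(round(latency_error_m(77, 0.050), 2))  # 1.07, i.e. the ~1.1 m above
```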
Localization

Low-speed maneuvering — dead-reckoning drift

Parking, U-turns, crawling in traffic. Wheel odometry resolution limits dominate. GPS multipath in urban canyons.

$ scene-db search --max-speed 5 --sort speed
Localization

Start from stop — GNSS reacquisition

After standstill, initial velocity is noisy. GNSS often reacquires with a position jump. IMU integration restarts.

$ scene-db search "stationary"
Localization

Loop closure — place revisit

Does the trajectory return to its starting point? scene-db detects loops and counts revisits for SLAM evaluation.

$ scene-db sequences
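Revisit detection of this kind can be done purely geometrically: flag any pose that comes back within a closure radius of a pose much earlier in the trajectory. A simplified sketch (scene-db's actual heuristic may differ; the radius and gap values here are illustrative):

```python
def count_revisits(positions, radius=2.0, min_gap=50):
    """Count poses within `radius` metres of a pose at least
    `min_gap` samples earlier in the trajectory."""
    revisits = 0
    for i, (x, y) in enumerate(positions):
        for px, py in positions[:max(0, i - min_gap)]:
            if (x - px) ** 2 + (y - py) ** 2 <= radius ** 2:
                revisits += 1
                break   # count each pose at most once
    return revisits

# Out-and-back run: drive 100 m east, come back 0.5 m to the side
out = [(float(x), 0.0) for x in range(101)]
back = [(float(x), 0.5) for x in range(100, -1, -1)]
print(count_revisits(out + back))   # many revisits
print(count_revisits(out))          # 0: one-way, no loop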
Perception

Hard braking — pitch shift

Braking tilts the vehicle forward. LiDAR FOV shifts, camera horizon drops. Detected 2.5+ m/s² events in KITTI.

$ scene-db search --min-decel 2.0 --sort decel
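Deceleration events like these fall out of a speed time series with a finite difference. A rough sketch (function name illustrative; the 2.0 m/s² default mirrors the `--min-decel 2.0` flag above):

```python
def braking_events(speeds_mps, dt=0.1, threshold=2.0):
    """Yield (frame_index, decel) where deceleration exceeds `threshold` m/s2."""
    for i in range(1, len(speeds_mps)):
        decel = (speeds_mps[i - 1] - speeds_mps[i]) / dt
        if decel >= threshold:
            yield i, decel

# 15 m/s to standstill in 5 s is a steady 3 m/s2 brake: every frame qualifies
speeds = [15.0 - 0.3 * i for i in range(51)]
events = list(braking_events(speeds))
print(len(events), events[0])
```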
Perception

Approach to stop — tracking handoff

Transitioning from motion-based to static perception. Object trackers often drop detections in this zone.

$ scene-db search "slowly" --min-decel 1.0
Perception

Scenes with poor visibility

Low contrast, overexposure, rain on lens. VLM captions catch what rule-based features miss.

$ scene-db search -s "dark road with glare"
Perception

Crowded intersections

Multiple pedestrians, cyclists, turning vehicles — the combinatorial explosion that detectors struggle with.

$ scene-db search -s "busy intersection pedestrians"
Loc + Per

Yaw + brake combo — multi-axis stress

Simultaneous turning and braking. All sensor modalities stressed: IMU coupling, wheel slip, pitch + yaw change.

$ scene-db search --min-yaw 10 --min-decel 1.0
Loc + Per

Tunnels & underpasses

GPS denied + lighting transition. Localization falls back to LiDAR/IMU while cameras adapt to darkness.

$ scene-db search -s "tunnel entrance dark"
Loc + Per

Automatic detection

Don't know what to search for? Let scene-db find edge cases automatically with rule-based heuristics.

$ scene-db edge-cases --severity critical

Workflow: ingest → search → export → fix

Ingest once, query forever. Export just the frames you need for debugging or retraining.

# 1. Ingest a KITTI sequence (108 frames → 3 chunks)
$ scene-db ingest /data/kitti/2011_09_26_drive_0001_sync
Ingesting ... Done. Created 3 scene chunks.

# 2. Search for low-speed scenes (localization edge case)
$ scene-db search "moving slowly"
Found 1 scene(s):

  [kitti_..._sync_002]
    vehicle moving slowly, 19 km/h, traveled 4.9 m
    frames 98-107

# 3. Semantic search (VLM + embedding)
$ scene-db search -s "vehicle decelerating near parked cars"
Found 2 scene(s):

  [kitti_..._sync_002] (score: 0.847)
    residential street, parked cars on both sides, vehicle slowing down

# 4. Export the scene for debugging
$ scene-db export --id kitti_..._sync_002 -o ./debug_scene
Exported 50 files to ./debug_scene

How it works

Raw Logs (KITTI / nuScenes) → Chunk (5-sec time windows) → Extract (speed, distance, yaw) → Caption (rule-based or VLM) → Index (SQLite + embeddings) → Search (text or semantic)
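Conceptually, each stage narrows raw frames down to a small queryable record. A sketch of the chunking stage, 5-second windows over timestamped frames (the dataclass fields and function names are illustrative, not scene-db's actual schema):

```python
from dataclasses import dataclass

@dataclass
class SceneChunk:            # illustrative fields, not the real schema
    chunk_id: str
    frame_range: tuple
    avg_speed_kmh: float
    caption: str

def chunk_frames(timestamps, window_s=5.0):
    """Group frame indices into consecutive windows of `window_s` seconds."""
    chunks, start = [], 0
    for i, t in enumerate(timestamps):
        if t - timestamps[start] >= window_s:
            chunks.append(list(range(start, i)))
            start = i
    chunks.append(list(range(start, len(timestamps))))
    return chunks

# 108 frames at 10 Hz (10.8 s) -> 3 chunks, matching the ingest example above
ts = [i * 0.1 for i in range(108)]
print([len(c) for c in chunk_frames(ts)])

chunk = SceneChunk("kitti_..._sync_002", (98, 107), 19.0,
                   "vehicle moving slowly")
```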

Built for robotics engineers

No infra required

SQLite on disk. No Docker, no Postgres, no Elasticsearch. Install with pip, run from your terminal.

VLM captioning

Optional GPT-4o integration generates rich scene descriptions from camera images. Falls back to rule-based if no API key.

Semantic search

Embedding-based similarity search with sentence-transformers. Find scenes by meaning, not just keywords.
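Under the hood, "search by meaning" reduces to cosine similarity between a query embedding and stored caption embeddings. A toy sketch with 3-d lists standing in for real sentence-transformers vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank(query_vec, scenes):
    """scenes: list of (scene_id, embedding); most similar first."""
    return sorted(scenes, key=lambda s: cosine(query_vec, s[1]), reverse=True)

# Toy embeddings; real ones come from a sentence-transformers model
scenes = [("tunnel_01", [0.9, 0.1, 0.0]), ("sunny_02", [0.0, 0.2, 0.9])]
print(rank([1.0, 0.0, 0.1], scenes)[0][0])  # tunnel_01
```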

Export for replay

Extract the exact images, point clouds, and IMU data for a scene. Feed it straight into your replay pipeline.

Multi-dataset

KITTI and nuScenes today. Same SceneChunk model regardless of source. Easy to add your own format.

Four commands

ingest, index, search, export — no YAML configs, no pipeline DSLs, no ceremony.

9 datasets, 2912 scenes, 168K frames

🚗
KITTI
25 seq, cameras+LiDAR+GPS/IMU
🛰
nuScenes
6 cameras+LiDAR+RADAR
📡
PPC Dataset
GNSS+IMU, Nagoya/Tokyo
🤖
GLIM
Ouster OS1-128+IMU
🏗
AIST Park
Ouster+IMU, decel 11.2 m/s²
🧱
Flatwall
Livox+IMU, LiDAR degeneration
🎒
Cartographer 3D
2x VLP-16+IMU, 20min
🐕
AlienGo
Quadruped, Livox+Cam+IMU
Your format
rosbag or CSV

Which data for your LiDAR SLAM?

Basic sanity check

GLIM os1_128 (491 MB, 115s)
Ouster OS1-128 + IMU. Small, fast to download. Confirm your pipeline runs.

Aggressive dynamics

AIST Park (2.1 GB, 144s)
Max decel 11.2 m/s² across all datasets. Repeated hard braking events (8.4, 7.2, 6.9 m/s²). Tests if your ESKF/EKF tracks through violent acceleration changes. Ouster + IMU.

LiDAR degeneration

Flatwall (306 MB, 33s)
Wall-only environment where LiDAR scan matching degenerates. Without IMU, localization fails. The critical test for IMU-LiDAR coupling.
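That failure mode is measurable from scan geometry alone: normals from a single wall all point one way, so the matrix of normal outer products loses rank along the wall. A 2-D sketch of this standard degeneracy check (not necessarily the test scene-db or any particular SLAM performs):

```python
import numpy as np

def constraint_ratio(normals):
    """lambda_min / lambda_max of the sum of normal outer products.
    Near 0 means translation is unconstrained along some axis."""
    N = np.asarray(normals, dtype=float)
    M = (N.T @ N) / len(N)           # average outer product of unit normals
    eig = np.linalg.eigvalsh(M)      # ascending eigenvalues
    return eig[0] / eig[-1]

wall = [(1.0, 0.0)] * 100                        # single flat wall
corner = [(1.0, 0.0)] * 50 + [(0.0, 1.0)] * 50   # two perpendicular walls
print(constraint_ratio(wall), constraint_ratio(corner))  # 0.0 and 1.0
```

A corner constrains both axes (ratio 1); a lone wall constrains only one (ratio 0), which is where the IMU has to carry the estimate.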

Long-term drift

Cartographer 3D (9.3 GB, 20min)
20 minutes of continuous operation. IMU-only speed drifts to 6000+ km/h. Your SLAM must prevent this.
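A runaway speed like that is consistent with plain double integration of an uncorrected accelerometer bias: velocity error grows linearly, position error quadratically. Illustrative numbers only (a constant 1.4 m/s² residual bias is assumed; real IMU error is messier than a fixed bias):

```python
def bias_drift(bias_mps2, t_s):
    """Velocity (km/h) and position (m) error from integrating a constant
    accelerometer bias for t_s seconds."""
    v = bias_mps2 * t_s               # velocity error, m/s (linear in t)
    p = 0.5 * bias_mps2 * t_s ** 2    # position error, m (quadratic in t)
    return v * 3.6, p

v_kmh, p_m = bias_drift(1.4, 20 * 60)   # 20 minutes
print(round(v_kmh), round(p_m / 1000))  # 6048 km/h, 1008 km
```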

Loop closure

PPC Tokyo run1/run2
9.9 km with loop detected (2m closure). 1386 revisits. RTK-GNSS ground truth.

High yaw rate

KITTI drive_0014 / PPC Tokyo run3
Up to 30.2 deg/s. Intersection turns that stress heading estimation.

Quadruped walking — extreme IMU

AlienGo (774 MB, 344s)
Four-legged robot with Livox + T265 camera + IMU. The walking gait produces IMU spikes of 29,693 m/s² decel and 45,118 deg/s yaw — 1000x beyond any vehicle. The ultimate stress test for IMU preintegration and LiDAR-Visual-Inertial fusion.