CUDA robotics algorithms and research extensions

CudaRobotics Results Snapshot

CUDA C++ implementations of classic and research-style robotics algorithms. Includes differentiable MPPI with matched-time evaluation, neural SDF navigation, GPU neuroevolution, parallel CartPole simulation, point-cloud processing, and swarm optimization. All benchmarks run on a single consumer GPU.

Highlight	Value	Detail
dynamic_slalom @ 1.0 ms	1.91	Diff-MPPI dist (6 baselines fail)
7-DOF K=512	0.84 ms	Diff-MPPI success=1.00
Point cloud (10K pts)	3,171x	Normal estimation speedup
Hardware	Single GPU	Consumer desktop, no multi-GPU

At a glance

Current takeaways

Updated from the repository state on 2026-04-05.

Diff-MPPI on dynamic navigation

At a matched 1.0 ms budget on dynamic_slalom, diff_mppi_3 is the only controller that succeeds (final distance 1.91); all 6 non-hybrid baselines fail. The strongest non-hybrid baseline, feedback_mppi_fused, reaches 10.30, while vanilla MPPI stays at 14.16.

Main evidence: matched-time dynamic obstacle suite with 6 baselines
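
Matched-time evaluation gives every controller the same per-step wall-clock budget rather than the same sample count. The following is a minimal host-side sketch of one way to pick a sample count K under such a budget. The names `pick_largest_k` and `measure_ms` and the candidate list are hypothetical and are not taken from the repository's benchmark harness.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: returns the largest candidate K whose measured
// per-step time fits within budget_ms, or 0 if no candidate fits.
// measure_ms(K) is an assumed callback that runs one control step with
// K samples and returns the elapsed time in milliseconds.
int pick_largest_k(const std::vector<int>& candidates, double budget_ms,
                   const std::function<double(int)>& measure_ms) {
    assert(budget_ms > 0.0);
    int best = 0;
    for (int k : candidates) {
        assert(k > 0);
        if (k > best && measure_ms(k) <= budget_ms) best = k;
    }
    return best;
}
```

In a real harness the callback would wrap the controller's step with a timer and a device synchronization, and the comparison would then use each controller at its own budget-matched K.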

7-DOF manipulator

On 7dof_dynamic_avoid at K=512, diff_mppi_3 reaches success=1.00 at 0.84 ms, while feedback_mppi_ref reaches 0.75 at 4.01 ms. The hybrid controller is about 4.8x faster (4.01 ms vs 0.84 ms) and succeeds more often on this 14-dimensional manipulation task.

High-DOF evaluation: Panda-like 7-DOF arm with 3D obstacles

Faithful baseline analysis

A two-rate feedback_mppi_faithful variant, which combines released current-action gains with stride-2 replanning, fails even at K=8192 (2.1 ms). This indicates that the autodiff refinement adds value that pure feedback does not replicate in this setting.

Baseline architecture study: current-action-only vs hybrid

Neural SDF navigation

The repository now learns 2D signed distance fields with a GPU MLP and uses them for both potential-field planning and MPPI rollouts on non-circular obstacle layouts.

Side-by-side GIFs against circle-based approximations

GPU learning and optimization

GPU REINFORCE improves MiniIsaac CartPole survival from about 82.6 to 180.4 steps, while neuroevolution and swarm solvers run thousands of candidates in parallel.

MiniIsaacGym, neuroevolution, PSO, DE, CMA-ES, ACO

Main result set

Diff-MPPI

The strongest current results come from the hybrid controller: MPPI followed by a short autodiff refinement.

Benchmark	Comparison	Current result
`dynamic_slalom`, matched `1.0 ms`	`mppi` vs 5 feedback baselines vs `diff_mppi_3`	Only `diff_mppi_3` succeeds (`1.91`). Best non-hybrid: `feedback_mppi_fused` at `10.30`.
`7dof_dynamic_avoid`, `K=512`	`diff_mppi_3` vs `feedback_mppi_ref`	`diff_mppi_3`: success=1.00 @ 0.84 ms. `feedback_mppi_ref`: 0.75 @ 4.01 ms.
`dynamic_crossing`, matched `1.0 ms`	`mppi` vs `feedback_mppi_ref` vs `diff_mppi_3`	`3.02` vs `1.93` vs `1.85` final distance
`arm_static_shelf`, `K=256`	`mppi` vs `feedback_mppi_ref`	`0.00 / 0.23` vs `1.00 / 0.15` success / final distance
`dynbike_slalom` (dynamic bicycle), `K=32`, low budget	`mppi` vs `diff_mppi_1`	`0.75 / 12.60` vs `1.00 / 2.24` success / final distance

MPPI and Diff-MPPI comparison — Side-by-side rollout comparison between vanilla MPPI and Diff-MPPI.

Differentiable MPPI trajectory rollouts — Trajectory rollouts from the differentiable refinement controller.
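
The hybrid idea (a sampling-based MPPI update followed by a few gradient refinement steps) can be sketched on a toy problem. The block below is a CPU illustration under stated assumptions, not the repository's controller. The 1D integrator, the cost, the struct `Toy`, and all function names are hypothetical, and a hand-written analytic gradient stands in for autodiff.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Toy problem (assumption): x[t+1] = x[t] + dt * u[t], with a tracking
// cost toward `goal` plus a small control penalty weighted by r.
struct Toy { double x0 = 0.0, goal = 1.0, dt = 0.1, r = 0.01; };

double rollout_cost(const Toy& p, const std::vector<double>& u) {
    double x = p.x0, cost = 0.0;
    for (double ut : u) {
        x += p.dt * ut;
        cost += (x - p.goal) * (x - p.goal) + p.r * ut * ut;
    }
    return cost;
}

// Analytic gradient of rollout_cost with respect to u (stand-in for autodiff).
std::vector<double> cost_grad(const Toy& p, const std::vector<double>& u) {
    const int T = static_cast<int>(u.size());
    std::vector<double> x(T), g(T);
    double xt = p.x0;
    for (int t = 0; t < T; ++t) { xt += p.dt * u[t]; x[t] = xt; }
    double tail = 0.0;  // sum of 2 * (x - goal) over this and all later steps
    for (int t = T - 1; t >= 0; --t) {
        tail += 2.0 * (x[t] - p.goal);
        g[t] = p.dt * tail + 2.0 * p.r * u[t];
    }
    return g;
}

// One MPPI update: exponentially weighted average of K sampled perturbations.
void mppi_step(const Toy& p, std::vector<double>& u, int K, double sigma,
               double temperature, std::mt19937& rng) {
    assert(K > 0 && sigma > 0.0 && temperature > 0.0);
    const int T = static_cast<int>(u.size());
    std::normal_distribution<double> noise(0.0, sigma);
    std::vector<std::vector<double>> eps(K, std::vector<double>(T));
    std::vector<double> cost(K), w(K);
    for (int k = 0; k < K; ++k) {  // on a GPU, rollouts would run in parallel
        std::vector<double> cand = u;
        for (int t = 0; t < T; ++t) { eps[k][t] = noise(rng); cand[t] += eps[k][t]; }
        cost[k] = rollout_cost(p, cand);
    }
    const double cmin = *std::min_element(cost.begin(), cost.end());
    double wsum = 0.0;
    for (int k = 0; k < K; ++k) {
        w[k] = std::exp(-(cost[k] - cmin) / temperature);
        wsum += w[k];
    }
    for (int t = 0; t < T; ++t) {
        double du = 0.0;
        for (int k = 0; k < K; ++k) du += w[k] * eps[k][t];
        u[t] += du / wsum;
    }
}

// Plain gradient descent on the nominal control sequence.
void grad_refine(const Toy& p, std::vector<double>& u, int n_grad, double step) {
    for (int i = 0; i < n_grad; ++i) {
        const std::vector<double> g = cost_grad(p, u);
        for (std::size_t t = 0; t < u.size(); ++t) u[t] -= step * g[t];
    }
}

// Hybrid step: sampling update first, then a short gradient refinement.
void hybrid_step(const Toy& p, std::vector<double>& u, int K, double sigma,
                 double temperature, int n_grad, double step, std::mt19937& rng) {
    mppi_step(p, u, K, sigma, temperature, rng);
    grad_refine(p, u, n_grad, step);
}
```

The design point this illustrates is that the refinement works on the single nominal sequence, so its cost does not grow with K. Sampling handles exploration, and the gradient steps sharpen the result within the same time budget.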

Other research additions

Neural SDF, MiniIsaac, Point Clouds, Swarm

These are useful repository-level results even though Diff-MPPI is the current main research thread.

Neural SDF Navigation

GPU MLP SDF learning, potential-field planning, and MPPI on non-circular obstacles.

SDF Quality

Learned signed distance field against the analytic reference field.
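
To make the comparison against an analytic reference concrete, here is a small CPU sketch of a reference field for one non-circular obstacle, a grid error metric, and the finite-difference gradient a potential-field planner could follow. The axis-aligned box, the mean-absolute-error metric, and the names `box_sdf`, `grid_mae`, and `sdf_gradient` are assumptions for illustration. The repository's obstacle layouts and quality metric may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

// Analytic signed distance to an axis-aligned box centered at (cx, cy)
// with half-extents (hx, hy): negative inside, positive outside.
double box_sdf(double x, double y, double cx, double cy, double hx, double hy) {
    const double dx = std::fabs(x - cx) - hx;
    const double dy = std::fabs(y - cy) - hy;
    const double ox = std::max(dx, 0.0), oy = std::max(dy, 0.0);
    return std::hypot(ox, oy) + std::min(std::max(dx, dy), 0.0);
}

using Field = std::function<double(double, double)>;

// Mean absolute error between two fields on an n x n grid over [lo, hi]^2.
double grid_mae(const Field& learned, const Field& reference,
                double lo, double hi, int n) {
    assert(n > 1 && hi > lo);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            const double x = lo + (hi - lo) * i / (n - 1);
            const double y = lo + (hi - lo) * j / (n - 1);
            sum += std::fabs(learned(x, y) - reference(x, y));
        }
    }
    return sum / (static_cast<double>(n) * n);
}

// Central-difference gradient of a field. Outside an obstacle it points
// away from the surface, which is the direction a repulsive potential uses.
void sdf_gradient(const Field& f, double x, double y,
                  double& gx, double& gy, double h = 1e-4) {
    gx = (f(x + h, y) - f(x - h, y)) / (2.0 * h);
    gy = (f(x, y + h) - f(x, y - h)) / (2.0 * h);
}
```

A learned MLP field would be passed as the `learned` argument. Because `Field` is just a callable, the same gradient helper works for the analytic and the learned field.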

Parallel CartPole RL

Custom GPU-parallel CartPole environment with REINFORCE. Survival improves from 82.6 to 180.4 steps in 160 generations.
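
For readers unfamiliar with REINFORCE, the core of the update is a sum of return-weighted score-function terms. The sketch below shows that computation for a single episode, assuming a linear-logistic policy over the four CartPole state variables. The policy form and the names `returns_to_go` and `reinforce_gradient` are illustrative assumptions, not the repository's implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using State = std::array<double, 4>;  // x, x_dot, theta, theta_dot

// Discounted returns-to-go: G[t] = r[t] + gamma * G[t+1].
std::vector<double> returns_to_go(const std::vector<double>& rewards, double gamma) {
    assert(gamma >= 0.0 && gamma <= 1.0);
    std::vector<double> G(rewards.size());
    double running = 0.0;
    for (std::size_t i = rewards.size(); i-- > 0;) {
        running = rewards[i] + gamma * running;
        G[i] = running;
    }
    return G;
}

// REINFORCE gradient estimate for an assumed Bernoulli policy
// p(a = 1 | s) = sigmoid(w . s). For this policy,
// grad_w log p(a | s) = (a - p) * s, so the estimate is
// sum_t G[t] * (a[t] - p_t) * s[t].
State reinforce_gradient(const State& w, const std::vector<State>& states,
                         const std::vector<int>& actions,
                         const std::vector<double>& G) {
    assert(states.size() == actions.size() && states.size() == G.size());
    State grad{};  // zero-initialized
    for (std::size_t t = 0; t < states.size(); ++t) {
        double z = 0.0;
        for (int i = 0; i < 4; ++i) z += w[i] * states[t][i];
        const double p = 1.0 / (1.0 + std::exp(-z));
        for (int i = 0; i < 4; ++i) grad[i] += G[t] * (actions[t] - p) * states[t][i];
    }
    return grad;
}
```

Gradient ascent on `w` with this estimate, averaged over many parallel episodes, is the basic loop. Subtracting a baseline from `G[t]` is a common variance-reduction step that is omitted here.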

MiniIsaac Parallel Sim

Thousands of nonlinear CartPole environments simulated in parallel on GPU.
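
A batched simulator of this kind usually stores state as a struct of arrays and advances every environment with the same update. The CPU sketch below uses the classic nonlinear CartPole equations with explicit Euler integration. The physical constants and termination thresholds are the commonly used textbook values and are assumptions here, as are the names `CartPoleBatch` and `step_all`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Struct-of-arrays state for n independent CartPole environments.
struct CartPoleBatch {
    std::vector<double> x, x_dot, theta, theta_dot;
    std::vector<char> done;
    explicit CartPoleBatch(std::size_t n)
        : x(n, 0.0), x_dot(n, 0.0), theta(n, 0.0), theta_dot(n, 0.0), done(n, 0) {}
    std::size_t size() const { return x.size(); }
};

// Advances every live environment by one explicit Euler step.
// force[i] is the horizontal force applied to cart i.
void step_all(CartPoleBatch& b, const std::vector<double>& force) {
    assert(force.size() == b.size());
    // Assumed constants (classic CartPole values), not the repository's.
    constexpr double g = 9.8, m_cart = 1.0, m_pole = 0.1, half_len = 0.5, dt = 0.02;
    constexpr double total = m_cart + m_pole, pml = m_pole * half_len;
    for (std::size_t i = 0; i < b.size(); ++i) {  // a GPU kernel could map one thread to each i
        if (b.done[i]) continue;
        const double s = std::sin(b.theta[i]), c = std::cos(b.theta[i]);
        const double tmp = (force[i] + pml * b.theta_dot[i] * b.theta_dot[i] * s) / total;
        const double th_acc =
            (g * s - c * tmp) / (half_len * (4.0 / 3.0 - m_pole * c * c / total));
        const double x_acc = tmp - pml * th_acc * c / total;
        b.x[i] += dt * b.x_dot[i];
        b.x_dot[i] += dt * x_acc;
        b.theta[i] += dt * b.theta_dot[i];
        b.theta_dot[i] += dt * th_acc;
        // Assumed termination: cart beyond 2.4 m or pole beyond about 12 degrees.
        b.done[i] = (std::fabs(b.x[i]) > 2.4 || std::fabs(b.theta[i]) > 0.2095) ? 1 : 0;
    }
}
```

Because each environment reads and writes only its own index, the loop body has no cross-environment dependencies, which is what makes the workload straightforward to parallelize.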

GPU Neuroevolution

Parallel evolution of 4096 neural policies with replay and learning-curve comparisons.

Swarm Optimization

GPU PSO, DE, CMA-ES, and ACO with animated convergence comparisons.
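
As a reference point for the swarm solvers, a global-best PSO fits in a few dozen lines. The sketch below is a sequential CPU version with commonly used constriction coefficients. The function name `pso_minimize`, its parameters, and the coefficient values are assumptions for illustration and do not describe the repository's kernels.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Objective = std::function<double(const Vec&)>;

// Global-best particle swarm optimization; returns the best position found.
Vec pso_minimize(const Objective& f, int dim, int n_particles, int iters,
                 double lo, double hi, unsigned seed) {
    assert(dim > 0 && n_particles > 0 && iters >= 0 && hi > lo);
    const double w = 0.7298, c1 = 1.49618, c2 = 1.49618;  // assumed coefficients
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> init(lo, hi), unit(0.0, 1.0);
    std::vector<Vec> pos(n_particles, Vec(dim)), vel(n_particles, Vec(dim, 0.0));
    std::vector<Vec> pbest(n_particles);
    std::vector<double> pbest_val(n_particles);
    Vec gbest;
    double gbest_val = std::numeric_limits<double>::infinity();
    for (int i = 0; i < n_particles; ++i) {
        for (double& x : pos[i]) x = init(rng);
        pbest[i] = pos[i];
        pbest_val[i] = f(pos[i]);
        if (pbest_val[i] < gbest_val) { gbest_val = pbest_val[i]; gbest = pos[i]; }
    }
    for (int it = 0; it < iters; ++it) {
        for (int i = 0; i < n_particles; ++i) {
            for (int d = 0; d < dim; ++d) {
                const double r1 = unit(rng), r2 = unit(rng);
                vel[i][d] = w * vel[i][d] + c1 * r1 * (pbest[i][d] - pos[i][d]) +
                            c2 * r2 * (gbest[d] - pos[i][d]);
                pos[i][d] += vel[i][d];
            }
            const double val = f(pos[i]);
            if (val < pbest_val[i]) { pbest_val[i] = val; pbest[i] = pos[i]; }
            if (val < gbest_val) { gbest_val = val; gbest = pos[i]; }
        }
    }
    return gbest;
}
```

This version updates the global best inside the particle loop. A parallel version would more likely evaluate all particles first and then reduce to the global best once per iteration.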

Point-cloud snapshot

CudaPointCloud

Multi-scale benchmark (1K–100K points). Both CPU and GPU use the same brute-force algorithms (no KD-trees). GPU times include device sync. Supports --ply, --kitti, --xyz file input.

Points	Operation	CPU	GPU	Speedup
1,000	Voxel Grid	0.67 ms	1.76 ms	0.4x (GPU loses)
5,000	Normal Estimation	4,024 ms	2.08 ms	1,933x
10,000	Normal Estimation	15,487 ms	4.88 ms	3,171x
50,000	RANSAC Plane	1,518 ms	2.78 ms	546x
100,000	RANSAC Plane	3,077 ms	5.62 ms	547x
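
The RANSAC rows use a brute-force inlier count over all points for every hypothesis, which is the part that parallelizes well. A minimal CPU sketch of that structure is shown below. The names `Point`, `Plane`, and `ransac_plane`, the tolerance handling, and the sampling scheme are assumptions for illustration rather than the benchmark's actual code.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

using Point = std::array<double, 3>;

// Plane in the form n . p + d = 0, with unit normal n.
struct Plane {
    Point n{0.0, 0.0, 1.0};
    double d = 0.0;
    int inliers = 0;
};

// Brute-force RANSAC: sample 3 points, fit a plane, count points within tol.
Plane ransac_plane(const std::vector<Point>& pts, int iters, double tol, unsigned seed) {
    assert(pts.size() >= 3 && iters > 0 && tol > 0.0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);
    Plane best;
    for (int it = 0; it < iters; ++it) {
        const Point& a = pts[pick(rng)];
        const Point& b = pts[pick(rng)];
        const Point& c = pts[pick(rng)];
        const Point u{b[0] - a[0], b[1] - a[1], b[2] - a[2]};
        const Point v{c[0] - a[0], c[1] - a[1], c[2] - a[2]};
        Point n{u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0]};
        const double len = std::sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
        if (len < 1e-12) continue;  // repeated or collinear sample, skip it
        for (double& x : n) x /= len;
        const double d = -(n[0] * a[0] + n[1] * a[1] + n[2] * a[2]);
        int count = 0;
        for (const Point& p : pts) {  // inlier count over every point
            if (std::fabs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= tol) ++count;
        }
        if (count > best.inliers) best = Plane{n, d, count};
    }
    return best;
}
```

The inner loop is independent per point, so a GPU version can evaluate it with one thread per point (or per hypothesis) and a reduction. That is consistent with the roughly constant speedup between the 50,000 and 100,000 point rows.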