DDS / large-message tuning
LiDAR SLAM publishes large sensor_msgs/PointCloud2 messages (a single
Ouster OS0-128 scan is 6 MB+). Out of the box, ROS 2's default DDS settings
are not tuned for messages this large, which can cause dropped scans and
node-to-node delivery delay. This page explains the failure mode, the default
that avoids it, and the config to set when you do need online streaming.
The failure mode
- A 6 MB+ PointCloud2 is fragmented across many UDP datagrams. With the default receive-buffer / fragment-reassembly limits, fragments are dropped under load and the subscriber never assembles a complete message.
- Symptoms: the front-end logs "Subscribed …" but never advances; scans are
silently dropped; nodes that should coexist (e.g. a streaming odometry node
alongside
graph_based_slam) starve. - This is a transport problem, not a SLAM problem — the same bag replays fine when read in-process.
Default that avoids it: the offline node reads the bag in-process
The recommended public path (RKO-LIO + graph_based_slam) uses the offline node, which reads the rosbag2 internally rather than subscribing to a streamed topic. There is no large-PointCloud2 DDS hop, so the drop/delay class above does not occur. This is why the benchmark and map authoring scripts drive the offline node — it is the robust default, not a limitation.
If you only run the documented benchmark / map-authoring flows, you do not need any of the DDS tuning below.
When you stream online: tune CycloneDDS + the kernel
If you must run a streaming pipeline (e.g. live sensor → online SLAM, or the
AWSIM × Autoware demo where AWSIM publishes the cloud over DDS), set both a
CycloneDDS profile and the kernel socket/fragment limits. The values below
are the ones validated in the AWSIM × Autoware demo
(docs/awsim-autonomous-driving-tutorial.md) and are a sane starting point for
loopback / single-host streaming.
CycloneDDS profile
cat > ~/cyclonedds.xml << 'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config">
<Domain Id="any">
<General>
<Interfaces>
<NetworkInterface name="lo" priority="default" multicast="default" />
</Interfaces>
<AllowMulticast>default</AllowMulticast>
<MaxMessageSize>65500B</MaxMessageSize>
</General>
<Internal>
<SocketReceiveBufferSize min="10MB"/>
<Watermarks><WhcHigh>500kB</WhcHigh></Watermarks>
</Internal>
</Domain>
</CycloneDDS>
EOF
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
export CYCLONEDDS_URI=file://$HOME/cyclonedds.xml
SocketReceiveBufferSize(10 MB) is the single most important knob for large PointCloud2 — it gives the kernel room to hold fragments until reassembly.- The
lointerface +AllowMulticastsettings target single-host / loopback streaming. For a real multi-host robot network, set the interface to your NIC and size the buffers to the link.
Kernel socket / fragment limits (requires sudo)
sudo sysctl -w net.core.rmem_max=2147483647
sudo sysctl -w net.ipv4.ipfrag_time=3
sudo sysctl -w net.ipv4.ipfrag_high_thresh=134217728
sudo ip link set lo multicast on
net.core.rmem_maxmust be ≥ the CycloneDDSSocketReceiveBufferSize, or the DDS request to enlarge the socket buffer is silently clamped.ipfrag_*raise the IP fragment-reassembly budget so 6 MB clouds reassemble under load instead of being dropped.
Persist these in /etc/sysctl.d/ for a real deployment rather than setting them
per shell.
Intra-process composition
Composing the SLAM nodes into one process with intra-process communication sidesteps DDS for in-process hops entirely (zero-copy pointer passing). This is the most robust answer for an online pipeline on a single host, but the public launch files currently run nodes as separate processes. Intra-process composition + FastDDS shared-memory / zero-copy transport is tracked as a future hardening item; until it ships, prefer the offline node (default) or the CycloneDDS + kernel tuning above for streaming.
Quick reference
| Situation | What to do |
|---|---|
| Benchmark / map authoring (documented flows) | Nothing — the offline node reads the bag in-process |
| Live sensor → online SLAM, single host | CycloneDDS profile + kernel sysctl above |
| Multi-host robot network | Same, but size SocketReceiveBufferSize / rmem_max to the link and bind the real NIC |
| Dropped scans despite tuning | Verify rmem_max ≥ CycloneDDS buffer; confirm RMW_IMPLEMENTATION / CYCLONEDDS_URI are exported in the node's environment |