← Back to Blog
Robotics2024-11-108 min read

Getting Started with ROS 2 for Robotics Development

#ROS2#Robotics#Python
Loading...

Comprehensive Guide to ROS 2 for Professional Robotics Development

After years of developing AMRs, AGVs, and robotic manipulators in production environments, I've learned that getting started with ROS 2 requires more than just following installation tutorials. This guide distills practical knowledge for building robust, production-ready robotic systems.

Why ROS 2? A Real-World Perspective

Having migrated multiple production systems from ROS 1, I can tell you that ROS 2 isn't just an upgrade—it's a fundamental reimagining of robot middleware. Here's what matters in practice:

The DDS Revolution

ROS 2 uses DDS (Data Distribution Service) as its middleware, replacing ROS 1's custom protocol with an industry-proven standard used in battleships and space systems. This means:

  • No single point of failure: ROS 1's master node is gone. Your robot won't grind to a halt if one process crashes.
  • Real-time capable: DDS provides Quality-of-Service (QoS) profiles for better real-time communication, scalability and security performance
  • Secure by design: DDS provides security guarantees that were not present in ROS 1, critical for commercial deployments
  • Cross-platform: Deploy on Linux, Windows, macOS, and embedded systems without code changes

What Changed from ROS 1

Architecture Improvements:

  • Multiple nodes can run in a single process, enabling flexible deployment-time composition
  • Launch files now use Python for complex logic and conditionals
  • Built-in lifecycle management for nodes
  • True parallel execution on multi-core processors

Communication Model:

  • Distributed discovery (no centralized master)
  • Configurable QoS policies per topic
  • Support for both reliable (TCP-like) and best-effort (UDP-like) communication

Why It Matters for AMRs/AGVs: From my work with warehouse robots, the deterministic real-time response and distributed architecture are game-changers. You can now reliably achieve 2-5ms response times for critical control loops, essential for safe human-robot collaboration.

Installation & Environment Setup

Ubuntu 22.04 (Recommended for Production)

# Set up locale sudo apt update && sudo apt install locales sudo locale-gen en_US en_US.UTF-8 sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 export LANG=en_US.UTF-8 # Add ROS 2 repository sudo apt install software-properties-common sudo add-apt-repository universe sudo apt update && sudo apt install curl -y sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" | sudo tee /etc/apt/sources.list.d/ros2.list > /dev/null # Install ROS 2 Humble (LTS) sudo apt update sudo apt install ros-humble-desktop # Install development tools sudo apt install python3-colcon-common-extensions python3-rosdep # Initialize rosdep sudo rosdep init rosdep update

Workspace Setup Best Practices

# Create workspace structure mkdir -p ~/ros2_ws/src cd ~/ros2_ws # Build with optimizations colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release # Add to .bashrc for automatic sourcing echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc echo "source ~/ros2_ws/install/setup.bash" >> ~/.bashrc

Pro Tip: Use --symlink-install during development. It creates symbolic links instead of copying Python files, so you can edit scripts without rebuilding.

Understanding DDS Middleware: A Critical Choice

The default DDS implementation since ROS 2 Galactic is CycloneDDS, but FastDDS and RTI Connext are also supported. Your choice significantly impacts performance.

DDS Middleware Comparison

CycloneDDS (Default):

  • Generally offers better performance in high-throughput scenarios with lower latency
  • Cleaner codebase but less feature-rich
  • Best for: Standard mobile robots, research applications
  • Install: sudo apt install ros-humble-rmw-cyclonedds-cpp

FastDDS:

  • Demonstrated faster performance on Raspberry Pi compared to other middlewares
  • More configuration options and features
  • Native support for 5G network flows and zero-copy delivery
  • Best for: Industrial applications, multi-robot systems, resource-constrained platforms
  • Install: sudo apt install ros-humble-rmw-fastrtps-cpp

Switching DDS Implementations:

# Set environment variable export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp # or export RMW_IMPLEMENTATION=rmw_fastrtps_cpp # Verify ros2 doctor --report | grep middleware

DDS Configuration for Production

For multi-interface robots (WiFi + Ethernet):

CycloneDDS XML:

<CycloneDDS> <Domain> <General> <NetworkInterfaceAddress>eth0,wlan0</NetworkInterfaceAddress> </General> </Domain> </CycloneDDS>

Set: export CYCLONEDDS_URI=file:///path/to/cyclonedds.xml

FastDDS XML:

<profiles> <transport_descriptors> <transport_descriptor> <transport_id>udp_transport</transport_id> <type>UDPv4</type> </transport_descriptor> </transport_descriptors> <participant profile_name="default"> <rtps> <userTransports> <transport_id>udp_transport</transport_id> </userTransports> </rtps> </participant> </profiles>

Set: export FASTRTPS_DEFAULT_PROFILES_FILE=/path/to/fastdds.xml

Quality of Service (QoS): The Secret to Reliable Communication

QoS profiles are where ROS 2 really shines. ROS 2 can be as reliable as TCP or as best-effort as UDP, with many possible states in between.

Essential QoS Policies

Reliability:

  • RELIABLE: Guarantees delivery (like TCP) - use for commands, maps
  • BEST_EFFORT: Fast, may lose messages (like UDP) - use for sensor streams

Durability:

  • TRANSIENT_LOCAL: Publisher persists samples for late-joining subscriptions
  • VOLATILE: No persistence, only current data

History:

  • KEEP_LAST(n): Store last n messages
  • KEEP_ALL: Store all messages (use carefully!)

Practical QoS Patterns

For Sensor Data (Camera, LiDAR):

from rclpy.qos import QoSProfile, ReliabilityPolicy, HistoryPolicy sensor_qos = QoSProfile( reliability=ReliabilityPolicy.BEST_EFFORT, history=HistoryPolicy.KEEP_LAST, depth=1 # Only latest frame matters ) subscription = self.create_subscription( Image, '/camera/image', callback, sensor_qos)

For Maps (SLAM):

map_qos = QoSProfile( reliability=ReliabilityPolicy.RELIABLE, durability=DurabilityPolicy.TRANSIENT_LOCAL, history=HistoryPolicy.KEEP_LAST, depth=1 )

For Control Commands:

cmd_qos = QoSProfile( reliability=ReliabilityPolicy.RELIABLE, history=HistoryPolicy.KEEP_LAST, depth=1 # Only latest command )

Critical Gotcha: Using Reliable QoS with deep history queues causes command backlog and latency. For teleoperation, always use BEST_EFFORT with depth=1.

Essential ROS 2 Packages

Navigation Stack (Nav2)

Nav2 is a production-grade navigation framework trusted by 100+ companies worldwide. It provides path planning, control, localization, and behavior orchestration.

Installation:

sudo apt install ros-humble-navigation2 ros-humble-nav2-bringup

Key Features:

  • Behavior trees for complex task orchestration
  • Support for holonomic, differential-drive, legged, and ackermann base types
  • Pluggable planners, controllers, and recovery behaviors
  • Dynamic obstacle avoidance
  • Waypoint following

Configuration Structure:

  • Global costmap for path planning
  • Local costmap for dynamic obstacle avoidance
  • Planner server (NavFn, Theta*, Smac Hybrid-A*)
  • Controller server (DWB, TEB, Regulated Pure Pursuit)
  • Recovery behaviors (spin, backup, wait)

Manipulation (MoveIt 2)

For robotic arms, MoveIt 2 is the industry standard:

sudo apt install ros-humble-moveit

Capabilities:

  • Motion planning (OMPL, Pilz, CHOMP)
  • Collision checking
  • Kinematics (KDL, IKFast, TRAC-IK)
  • Trajectory execution
  • Perception integration

SLAM Toolbox

For mapping and localization:

sudo apt install ros-humble-slam-toolbox

Modes:

  • Online async SLAM (real-time mapping)
  • Offline SLAM (post-processing)
  • Localization mode (map already exists)
  • Lifelong SLAM (continuous map updates)

Essential Debugging Tools

RViz2 - 3D visualization:

ros2 run rviz2 rviz2

RQt - GUI plugins:

ros2 run rqt_console rqt_console # Log viewer ros2 run rqt_graph rqt_graph # Node graph ros2 run rqt_plot rqt_plot # Real-time plotting

Command-line tools:

ros2 topic list # List all topics ros2 topic echo /topic_name # Print messages ros2 topic hz /topic_name # Check frequency ros2 node info /node_name # Node details ros2 param list # List parameters ros2 service list # List services ros2 bag record -a # Record all topics

Building Your First System: A Practical Example

Let's build a complete publisher-subscriber system with proper error handling and QoS:

Package Creation

cd ~/ros2_ws/src ros2 pkg create --build-type ament_python robot_control \ --dependencies rclpy std_msgs geometry_msgs sensor_msgs

Advanced Publisher Node

import rclpy from rclpy.node import Node from rclpy.qos import QoSProfile, ReliabilityPolicy, HistoryPolicy from geometry_msgs.msg import Twist from sensor_msgs.msg import LaserScan import math class VelocityController(Node): """ Production-ready velocity controller with obstacle avoidance """ def __init__(self): super().__init__('velocity_controller') # Declare parameters with defaults self.declare_parameter('max_linear_vel', 0.5) self.declare_parameter('max_angular_vel', 1.0) self.declare_parameter('obstacle_distance', 0.5) # Get parameters self.max_linear = self.get_parameter('max_linear_vel').value self.max_angular = self.get_parameter('max_angular_vel').value self.obstacle_dist = self.get_parameter('obstacle_distance').value # QoS for sensor data (best effort, latest only) sensor_qos = QoSProfile( reliability=ReliabilityPolicy.BEST_EFFORT, history=HistoryPolicy.KEEP_LAST, depth=1 ) # QoS for commands (reliable, latest only) cmd_qos = QoSProfile( reliability=ReliabilityPolicy.RELIABLE, history=HistoryPolicy.KEEP_LAST, depth=1 ) # Subscribers self.scan_sub = self.create_subscription( LaserScan, '/scan', self.scan_callback, sensor_qos) # Publishers self.cmd_pub = self.create_publisher( Twist, '/cmd_vel', cmd_qos) # State self.min_distance = float('inf') self.obstacle_detected = False # Control loop timer (10 Hz) self.timer = self.create_timer(0.1, self.control_loop) self.get_logger().info('Velocity controller initialized') def scan_callback(self, msg): """Process laser scan data""" try: # Find minimum distance in front sector (±30 degrees) front_angles = 30 start_idx = len(msg.ranges) // 2 - front_angles end_idx = len(msg.ranges) // 2 + front_angles front_ranges = msg.ranges[start_idx:end_idx] valid_ranges = [r for r in front_ranges if msg.range_min < r < msg.range_max] if valid_ranges: self.min_distance = min(valid_ranges) self.obstacle_detected = self.min_distance < self.obstacle_dist else: self.min_distance = float('inf') self.obstacle_detected = False except Exception as e: self.get_logger().error(f'Scan processing error: {str(e)}') def control_loop(self): """Main control loop""" cmd = Twist() if self.obstacle_detected: # Emergency stop cmd.linear.x = 0.0 cmd.angular.z = 0.0 self.get_logger().warn( f'Obstacle detected at {self.min_distance:.2f}m - STOPPING') else: # Normal operation - move forward cmd.linear.x = self.max_linear * 0.5 cmd.angular.z = 0.0 self.cmd_pub.publish(cmd) def main(args=None): rclpy.init(args=args) try: controller = VelocityController() rclpy.spin(controller) except KeyboardInterrupt: pass except Exception as e: print(f'Error: {e}') finally: controller.destroy_node() rclpy.shutdown() if __name__ == '__main__': main()

Launch File (Python)

from launch import LaunchDescription from launch.actions import DeclareLaunchArgument, ExecuteProcess from launch.substitutions import LaunchConfiguration from launch_ros.actions import Node def generate_launch_description(): return LaunchDescription([ # Declare arguments DeclareLaunchArgument( 'max_vel', default_value='0.5', description='Maximum linear velocity' ), # Launch controller node Node( package='robot_control', executable='velocity_controller', name='velocity_controller', output='screen', parameters=[{ 'max_linear_vel': LaunchConfiguration('max_vel'), 'obstacle_distance': 0.5 }], remappings=[ ('/scan', '/robot/scan'), ('/cmd_vel', '/robot/cmd_vel') ] ), # Launch RViz for visualization ExecuteProcess( cmd=['ros2', 'run', 'rviz2', 'rviz2'], output='screen' ) ])

Simulation with Gazebo

Gazebo is integrated with ROS 2 through the ros_gz_bridge package, enabling realistic physics simulation.

Installation

# Gazebo Garden (for Humble) sudo apt install ros-humble-ros-gz sudo apt install ros-humble-gazebo-ros-pkgs

Simple Mobile Robot Simulation

URDF/SDF Model (simplified):

<?xml version="1.0"?> <robot name="simple_agv"> <link name="base_link"> <visual> <geometry> <box size="0.5 0.3 0.1"/> </geometry> </visual> <collision> <geometry> <box size="0.5 0.3 0.1"/> </geometry> </collision> </link> <gazebo> <plugin name="differential_drive" filename="libgazebo_ros_diff_drive.so"> <update_rate>50</update_rate> <left_joint>left_wheel_joint</left_joint> <right_joint>right_wheel_joint</right_joint> <wheel_separation>0.3</wheel_separation> <wheel_diameter>0.1</wheel_diameter> <command_topic>/cmd_vel</command_topic> <odometry_topic>/odom</odometry_topic> </plugin> <plugin name="lidar" filename="libgazebo_ros_ray_sensor.so"> <topic>/scan</topic> <frame_name>lidar_link</frame_name> </plugin> </gazebo> </robot>

Launch Simulation:

# Start Gazebo world ros2 launch gazebo_ros gazebo.launch.py # Spawn robot ros2 run gazebo_ros spawn_entity.py \ -entity simple_agv -file simple_agv.urdf

System Architecture Best Practices

Domain IDs

DDS domains have participant limits (120 for RTI Connext v4.2e+). Use different domain IDs for:

  • Multiple robots: export ROS_DOMAIN_ID=1, export ROS_DOMAIN_ID=2
  • Development vs production: export ROS_DOMAIN_ID=10 (dev)
  • Network isolation: Separate domains prevent cross-talk

Node Composition

For performance-critical applications:

from rclpy.node import Node import rclpy.executors class SensorNode(Node): pass class ControlNode(Node): pass def main(): rclpy.init() executor = rclpy.executors.MultiThreadedExecutor() # Run multiple nodes in one process sensor = SensorNode() control = ControlNode() executor.add_node(sensor) executor.add_node(control) try: executor.spin() finally: executor.shutdown() sensor.destroy_node() control.destroy_node() rclpy.shutdown()

Lifecycle Management

For production systems requiring controlled startup/shutdown:

from rclpy.lifecycle import Node, State, TransitionCallbackReturn class LifecycleRobot(Node): def on_configure(self, state: State): self.get_logger().info('Configuring...') # Initialize resources return TransitionCallbackReturn.SUCCESS def on_activate(self, state: State): self.get_logger().info('Activating...') # Start operations return TransitionCallbackReturn.SUCCESS def on_deactivate(self, state: State): self.get_logger().info('Deactivating...') # Stop operations safely return TransitionCallbackReturn.SUCCESS def on_cleanup(self, state: State): self.get_logger().info('Cleaning up...') # Release resources return TransitionCallbackReturn.SUCCESS

VPN and Remote Operations

For fleet management across networks:

VPN Setup with WireGuard:

# Install WireGuard sudo apt install wireguard # Configure interface sudo nano /etc/wireguard/wg0.conf

Configuration:

[Interface] PrivateKey = <robot_private_key> Address = 10.0.0.2/24 ListenPort = 51820 [Peer] PublicKey = <base_station_public_key> Endpoint = base.example.com:51820 AllowedIPs = 10.0.0.0/24 PersistentKeepalive = 25

DDS Discovery Over VPN:

# Ensure DDS uses VPN interface export CYCLONEDDS_URI='<CycloneDDS><Domain><General><NetworkInterfaceAddress>wg0</NetworkInterfaceAddress></General></Domain></CycloneDDS>'

Production Deployment Checklist

Performance Optimization

  1. Build with optimizations:

    colcon build --cmake-args -DCMAKE_BUILD_TYPE=Release
  2. CPU pinning for real-time nodes:

    taskset -c 0,1 ros2 run my_pkg critical_node
  3. Memory locking:

    sudo setcap cap_ipc_lock+ep /path/to/node

Monitoring and Logging

# Configure logging self.get_logger().set_level(rclpy.logging.LoggingSeverity.INFO) # Use structured logging self.get_logger().info(f'Position: x={x:.3f}, y={y:.3f}') self.get_logger().warn(f'Battery low: {battery_pct}%') self.get_logger().error(f'Motor fault: {error_code}')

Testing

import unittest from launch import LaunchDescription from launch_ros.actions import Node import launch_testing class TestVelocityController(unittest.TestCase): def test_obstacle_avoidance(self): # Integration test pass

Security (SROS2)

# Generate security keys ros2 security create_keystore demo_keys ros2 security create_key demo_keys /my_robot_node # Run with security export ROS_SECURITY_KEYSTORE=demo_keys export ROS_SECURITY_ENABLE=true ros2 run my_pkg my_node

Common Pitfalls and Solutions

  1. QoS Incompatibility: Publishers and subscribers must have compatible QoS profiles. Check with ros2 topic info -v /topic.

  2. DDS Performance: Switch between CycloneDDS and FastDDS if experiencing network issues.

  3. Transform (TF) issues: Always publish at >10Hz. Use ros2 run tf2_tools view_frames to debug.

  4. Memory leaks: Always destroy nodes properly in shutdown handlers.

  5. Clock synchronization: Use use_sim_time:=true in simulation consistently.

Resources and Community

Conclusion

ROS 2 represents a mature, production-ready framework for modern robotics. The transition from ROS 1 requires rethinking your architecture, but the benefits—real-time performance, security, and scalability—make it essential for any serious robotics project.

Start with simulation, master QoS configuration, and build incrementally. The ecosystem is rich, the community is active, and the future of robotics runs on ROS 2.


About the Author: This guide reflects 7+ years of experience deploying ROS-based systems in warehouse automation, autonomous vehicles, and industrial manipulation. The lessons learned here come from real production environments where reliability, performance, and safety are non-negotiable.


Follow Me