Perception
Perceiving the robot's environment through sensors.
Introduction
The Perception stack is one of the five high-level components of the Autopilot Inference. It is responsible for processing sensor data and providing the robot with a tangible understanding of its environment. The stack takes raw sensor data and applies various filters and mathematical models to merge and enrich that data. Its output is meaningful information that can be used by other stacks of the robot, such as the Navigation stack and the Cognition stack.
The Perception stack is not responsible for perceiving the state of the robot itself, such as its position, nor for higher-level interpretation of the environment, such as complex reasoning. Instead, the stack focuses on providing a clear and accurate view of the environment through the processing of sensor data.
The line between the Perception stack and other stacks, such as the Navigation stack, can sometimes be blurred. For example, using landmarks to estimate the robot's position can be seen as part of the Navigation stack, yet it relies on the sensor-data processing done by the Perception stack. In this case, the Perception stack is responsible for providing the information about the landmarks, while the Navigation stack is responsible for using that information to estimate the robot's position.
Capabilities
The capabilities that the Perception stack provides have been designed to support the robot in pre-processing sensor data and in determining its whereabouts in the environment. Which capabilities are used often depends on what information is requested by the other stacks.
Detection of landmarks is a capability that is often needed to initialize and improve the position estimate of the robot for navigation. To this end, the Perception stack can detect ArUco markers and estimate the position of each marker relative to the robot.
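Reporting a marker's position relative to the robot amounts to chaining coordinate transforms: the detector gives the marker's pose in the camera frame, and the camera's mounting pose in the robot's base-link frame is known. A minimal sketch of that composition, using planar (yaw-only) homogeneous transforms and made-up mounting and detection values rather than the stack's real frames or parameters:

```python
import numpy as np

def pose_to_matrix(translation, yaw):
    """Build a 4x4 homogeneous transform from a translation and a yaw
    angle (rotation about z); enough for this planar illustration."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

# Hypothetical values: the camera sits 0.3 m ahead of and 0.2 m above
# base-link, and the detector reports the marker 2.0 m in front of the
# camera, 0.5 m to the left.
T_base_camera = pose_to_matrix([0.3, 0.0, 0.2], 0.0)    # camera in base-link
T_camera_marker = pose_to_matrix([2.0, 0.5, 0.0], 0.0)  # marker in camera

# Chain the transforms to express the marker pose in the base-link frame.
T_base_marker = T_base_camera @ T_camera_marker
marker_position = T_base_marker[:3, 3]
print(marker_position)  # → [2.3 0.5 0.2]
```

The same chaining extends to full 3D rotations; in a ROS setup this bookkeeping is typically delegated to a transform library rather than done by hand.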
Range measurements, either in 2D or in 3D, are often needed to estimate the robot's position on a pre-recorded map, for example an occupancy map, or for range-based SLAM, for example LiDAR odometry or LiDAR SLAM. The Perception stack has the capability to filter the data from its 3D scanning LiDAR, removing the body of the robot, the ground plane, and outliers, thereby producing a clean 3D pointcloud of the environment as well as a 2D laserscan.
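The three filtering steps can be sketched with plain numpy. This is an illustration only: the radius, height, and outlier thresholds below are assumed values, and the real stack may use more sophisticated methods (e.g. plane fitting for the ground, statistical outlier removal over local neighborhoods):

```python
import numpy as np

def clean_pointcloud(points, body_radius=0.4, ground_z=0.05, k_sigma=2.0):
    """Sketch of the LiDAR cleaning steps on an (N, 3) array of points
    in the sensor frame. All thresholds are assumed, not the stack's
    real parameters."""
    # 1. Remove the robot body: points within a cylinder around the sensor.
    xy_dist = np.linalg.norm(points[:, :2], axis=1)
    points = points[xy_dist > body_radius]
    # 2. Remove the ground plane: points below a height threshold.
    points = points[points[:, 2] > ground_z]
    # 3. Remove outliers: points whose range deviates strongly from the
    #    median range (a crude stand-in for statistical outlier removal).
    r = np.linalg.norm(points, axis=1)
    keep = np.abs(r - np.median(r)) < k_sigma * r.std()
    return points[keep]
```

A 2D laserscan can then be obtained from the cleaned cloud by projecting the remaining points into a horizontal slice and binning them by bearing angle.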
Dock tracking is a capability that is often needed to control the robot into and out of a dock. The Perception stack can continuously track the position of a dock relative to the robot, provided that a specific ArUco marker is fixed on the physical dock. The position of this marker is then continuously published on a goal topic that can be used for navigation. Note that you should add an offset between the marker's pose and the docking pose of the robot, i.e., the pose to which the robot will drive in order to dock. This is because the goal pose is defined for the base-link of the robot. For example, if you want the tip of the nose of the Origin robot to just touch the marker, this offset is 0.5 meter (roughly the distance between the base-link of the Origin robot and the tip of its nose).
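Applying the offset means shifting the goal pose away from the marker along the marker's outward normal, so that base-link stops short of the dock. A minimal planar sketch, assuming the marker's yaw points outward from the dock (the actual frame conventions and topic types of the stack are not specified here):

```python
import numpy as np

def docking_goal(marker_xy, marker_yaw, offset=0.5):
    """Offset the detected marker pose along its outward normal to obtain
    a goal pose for base-link. The 0.5 m default matches the base-link to
    nose distance of the Origin robot mentioned in the text; the outward
    yaw convention for the marker is an assumption."""
    outward = np.array([np.cos(marker_yaw), np.sin(marker_yaw)])
    goal_xy = np.asarray(marker_xy) + offset * outward
    # The robot should face the marker, i.e. opposite the outward normal.
    goal_yaw = marker_yaw + np.pi
    return goal_xy, goal_yaw

# Marker 3 m ahead of the robot, facing back toward it (yaw = pi):
goal_xy, goal_yaw = docking_goal([3.0, 0.0], np.pi)
print(goal_xy)  # → [2.5 0. ]
```

With the marker facing the robot, the goal lands 0.5 m in front of the dock and the goal heading points at the marker, so the nose just reaches it when base-link arrives at the goal.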