Perception

Components that give the robot the ability to become aware of itself and its environment through sensors.

Introduction

One of the five high-level components of the Autopilot Inference is the Perception stack. The Perception stack should ensure that the robot is able to perceive its environment using the sensor data that is available. We deliberately say “perceive its environment“: this stack is not responsible for perceiving the robot itself, such as its position, nor is it responsible for understanding its environment, for example by conducting complex reasoning. Instead, the Perception stack filters and merges sensor data by comparing different modes of data (range-based, vision-based, etc.) and by applying mathematical models. The output of the Perception stack is thus an enrichment of individual data streams, typically originating from sensors on the robot, into tangible information that can be used by the other stacks of the robot, such as the Navigation stack and the Cognition stack. There is of course a grey area as to where the Perception stack finishes its processing and another stack takes over, for example when landmarks are used to estimate one's position. Our line of reasoning on this matter will become clear in the subsequent sections of this page on the Perception stack.

Capabilities

The capabilities that the Perception stack provides have been designed to support the robot in pre-processing sensor data and in finding its whereabouts in the environment. The capabilities of the Perception stack therefore depend on what information is requested by the other stacks.

 

Landmark detection is a capability to detect specific landmarks (Aruco markers) and to estimate the position of the marker relative to the robot. An Aruco marker is a synthetic square marker that is easily detectable in any environment. Our robots are shipped with three Avular-branded Aruco markers, with IDs 0, 1 and 2, but you are free to print additional markers using this online generator. Markers should be printed at a size of 20x20 cm. This capability of the Perception stack proceeds in the following three stages (a minimal sketch follows after the list):

  • Firstly, the Aruco marker is detected with its corresponding ID.
  • Secondly, the position of the camera is estimated relative to the origin of the Aruco marker (its top-left corner).
  • Thirdly, the position of the Aruco marker is estimated within the camera frame (note that this frame is defined with the Y-axis pointing downwards and the Z-axis pointing forward).
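The sketch below illustrates these three stages using OpenCV's ArUco module (classic API; newer OpenCV releases wrap this in cv2.aruco.ArucoDetector). The dictionary choice and all names are assumptions for illustration, not the actual Avular implementation.

```python
# Minimal sketch, assuming OpenCV's ArUco module; dictionary and names are
# illustrative assumptions, not the actual Avular implementation.
import cv2
import numpy as np

MARKER_SIZE_M = 0.20  # markers are printed at 20x20 cm
DICTIONARY = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)  # assumed

def detect_marker_poses(image, camera_matrix, dist_coeffs):
    # Stage 1: detect markers and their IDs
    corners, ids, _ = cv2.aruco.detectMarkers(image, DICTIONARY)
    if ids is None:
        return {}
    # Stages 2 and 3: estimate the camera pose w.r.t. each marker and,
    # equivalently, the marker position in the camera frame
    # (Y-axis down, Z-axis forward, as stated above)
    rvecs, tvecs = cv2.aruco.estimatePoseSingleMarkers(
        corners, MARKER_SIZE_M, camera_matrix, dist_coeffs)[:2]
    # Note: OpenCV places the pose origin at the marker centre; shift by half
    # the marker size if the top-left-corner convention above is required.
    return {int(i): (r.ravel(), t.ravel())
            for i, r, t in zip(ids.flatten(), rvecs, tvecs)}
```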

[Figure: Aruco marker detection]

 

Filtered LiDAR is a capability to filter the points that are measured by the robot's 3D LiDAR (when this LiDAR is integrated on the robot). Normally, the 3D LiDAR system produces a pointcloud around the robot in every direction, i.e., in 360 degrees with an opening angle of either 45 or 90 degrees. Points are produced by a laser pulse that follows a particular scanning pattern within this omnidirectional 45x360 or 90x360 view, until the laser hits an object and a range measurement from sensor to object is returned. As the LiDAR is mounted on the robot, it is unavoidable that some of the laser pulses will hit the robot itself, or a component that is mounted on the robot, such as a GPS receiver or WiFi antennas. They will also hit the turning wheels of the robot, generating points that are scattered randomly in time because the laser is hitting a quickly turning wheel. All these points that are generated by the robot itself rather than by the environment should be removed from the pointcloud, as operators and developers are typically interested in the environment of the robot and not in the robot itself. Furthermore, localization methods exist both for a 3D pointcloud, in which the points span the omnidirectional view, and for a 2D laser scan, in which the points lie in a plane parallel to the ground floor. Therefore, the Perception stack has a pointcloud filter capability that produces two types of data (a sketch of both filters follows after the list):

  • A 3D pointcloud that is equal to the pointcloud produced by the LiDAR system (see the left image below), yet with all points generated by the robot itself removed.
  • A 2D laser scan in which points within a 360-degree volume around the robot, i.e., between a minimum and a maximum height, are projected onto a 2D plane (points belonging to the body of the robot are removed).
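The following sketch shows one way both filters could work, assuming a simple axis-aligned bounding box around the robot body; the box extents, height band and bin count are illustrative assumptions, not Avular's actual parameters.

```python
# Minimal sketch of the two pointcloud filters; the robot-body box and all
# thresholds are assumptions for illustration.
import numpy as np

ROBOT_BOX_MIN = np.array([-0.4, -0.3, -0.2])  # assumed robot body extent (m)
ROBOT_BOX_MAX = np.array([ 0.4,  0.3,  0.5])

def filter_self_hits(points: np.ndarray) -> np.ndarray:
    """3D output: remove points that fall inside the robot's bounding box."""
    inside = np.all((points >= ROBOT_BOX_MIN) & (points <= ROBOT_BOX_MAX), axis=1)
    return points[~inside]

def to_2d_scan(points: np.ndarray, z_min=0.1, z_max=0.5, n_bins=360):
    """2D output: project points between z_min and z_max onto a plane and
    keep the closest range per angular bin, like a single-plane laser scan."""
    band = filter_self_hits(points)
    band = band[(band[:, 2] >= z_min) & (band[:, 2] <= z_max)]
    angles = np.arctan2(band[:, 1], band[:, 0])
    ranges = np.hypot(band[:, 0], band[:, 1])
    scan = np.full(n_bins, np.inf)
    bins = ((angles + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    np.minimum.at(scan, bins, ranges)  # closest return per angular bin
    return scan
```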

[Figure: pointcloud filtering, with the filtered 3D pointcloud (left) and the 2D laser scan (right)]

 

Pose to dock is a capability for estimating the position of the dock with respect to the robot. Since every Avular robot has a built-in camera, and since there is already a capability to detect Aruco markers, we have chosen to fix the Aruco marker with ID 0 on the physical dock (note that the dock is not yet released to customers). Then, by calling a service to start publishing the dock pose as a goal topic (a navigation goal), this capability will start publishing the relative pose from the Aruco marker with ID 0 (which is assumed to be mounted on the dock) to the robot on a specific topic. A hypothetical sketch of this behaviour is given below.
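The sketch below outlines such a node in ROS 2: a Trigger service activates the capability, after which the pose of marker ID 0 is republished as a navigation goal. All topic and service names are assumptions, not Avular's actual interface.

```python
# Hypothetical ROS 2 sketch of the pose-to-dock behaviour; topic and
# service names are illustrative assumptions.
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PoseStamped
from std_srvs.srv import Trigger

class DockPosePublisher(Node):
    def __init__(self):
        super().__init__('dock_pose_publisher')
        self.active = False
        # marker poses as produced by the landmark-detection capability
        # (topic name is an assumption)
        self.create_subscription(
            PoseStamped, '/perception/marker_0/pose', self.on_marker_pose, 10)
        self.goal_pub = self.create_publisher(
            PoseStamped, '/navigation/goal_pose', 10)
        # service that starts publishing the dock pose as a goal
        self.create_service(
            Trigger, '/perception/start_dock_tracking', self.on_start)

    def on_start(self, request, response):
        self.active = True
        response.success = True
        response.message = 'dock tracking started'
        return response

    def on_marker_pose(self, msg: PoseStamped):
        if self.active:
            self.goal_pub.publish(msg)  # forward the relative dock pose

def main():
    rclpy.init()
    rclpy.spin(DockPosePublisher())

if __name__ == '__main__':
    main()
```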

[Figure: dock tracking]



Info

Avular collaborators are authorized to read the details of these developments, which can be found on the Development-Perception page.