Heuristic vision based terrain recognition for lower limb exoskeletons

For lower limb exoskeletons to be used in daily activities, they must understand the diverse nature of their operating terrain. Computer vision offers an excellent avenue for terrain recognition in such applications. In this paper, we propose a heuristic vision-based approach for terrain recognition in lower limb exoskeletons, using line fitting features and active sensor fusion with exoskeleton encoders to enhance recognition performance. Our approach recognizes more terrain classes than traditional methods, including obstacles and gaps, enabling better terrain traversal. We use sensor fusion to estimate contextual features that help pre-process the point cloud and extract more discriminative terrain features for recognition. The recognized terrain and the distance to it are used to modify the gait strategy of the exoskeleton. The proposed approach has been validated with offline and online experiments, which show successful integration with exoskeletons for traversal of different terrains.


INTRODUCTION
Lower-limb exoskeletons and prosthetics aim to help patients who have reduced mobility due to factors such as stroke and spinal cord injuries. However, exoskeletons are largely restricted to controlled environments and not adopted for daily usage due to their weight, power consumption, and limited adaptability to various terrains. To ensure safe traversal, the exoskeleton must recognize and understand its operating terrain and change its gait strategy accordingly. Terrain recognition refers to classifying the operating environment as ground, stairs, slopes, etc., while understanding refers to estimating terrain characteristics such as distance to the terrain, terrain height, width, and angle.
While recent progress in deep learning has led to vision-based terrain recognition methods for exoskeletons using RGB images [4,3] and point clouds [10], these methods often have limitations. They may detect only a limited number of terrain classes or lack the accuracy needed to work with actual exoskeletons. The methods of [9,6] achieved higher terrain recognition performance but were tested in simple settings only.
Existing terrain recognition methods typically consider only 5 terrain types, namely level ground, ascending stairs, descending stairs, up slopes, and down slopes, limiting their applicability to day-to-day situations. Other terrains such as obstacles [5,8] and negative obstacles (gaps) are treated as separate problems, and most available methods do not handle negative obstacles at all. In our work, we develop a heuristic vision-based approach that accounts for seven terrain classes, including obstacles and gaps. Based on [10], we pre-process 3D point clouds into 2D binary images and use line fitting to identify key terrain features. Based on these features, we classify the observed terrain, estimate its distance from the user, and modify the exoskeleton gait parameters accordingly.
Previous works [4,3,10] on terrain recognition do not use any robot information to improve recognition performance. The joint encoders of an exoskeleton can provide useful contextual information for vision-based terrain recognition. In this work, we develop active sensor fusion between the exoskeleton's joint encoders and the perception module to extract contextual features. Similar to [1], which used an IMU to determine the ground or camera height, we use exoskeleton encoder data as context to improve the above-mentioned pre-processing step and ground the binary image. This helps us implicitly extract better features for terrain recognition, so the robot information improves the accuracy of the perception module. Visual perception in turn provides the recognized terrain and its distance from the robot, so that the robot can modify its gait. We integrate the proposed terrain recognition framework with the Phoenix SuitX exoskeleton [7] and test it on various terrains. In this paper, we develop a vision-based terrain recognition algorithm for a lower-limb exoskeleton where one leg is healthy.

HEURISTIC APPROACH FOR VISION-BASED TERRAIN RECOGNITION
In this section, we introduce our heuristic approach for vision-based terrain recognition for lower limb exoskeletons. We consider seven terrain classes, namely Level Ground, Ascending Stairs, Descending Stairs, Up Slopes, Down Slopes, Obstacles, and Gaps. Our proposed approach can be divided into three components: point cloud pre-processing for dimensionality reduction, contextual features from the exoskeleton, and line fitting with terrain feature extraction. We elaborate on these components in the following subsections.
Point cloud pre-processing for dimensionality reduction
[10] proposed a pre-processing step that converts 3D point clouds (from camera and IMU) to a 2D binary image for dimensionality reduction and used deep learning on the resultant binary images of the terrain. The resolution of these images was 100 × 100, and they implicitly encoded the distance observed in the camera's field of view by imposing limits on the x, y, z axes; refer to [10] for the limits imposed on each axis. This step converts each point in the point cloud to a pixel and produces 2D binary images showing a side view of the terrain ahead.
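A minimal sketch of this projection step is shown below. The axis limits, the axis labels (x for forward distance, y for height), and the function name are our own illustrative choices; the exact limits used by [10] differ.

```python
import numpy as np

def cloud_to_binary_image(points, x_lim=(0.0, 2.0), y_lim=(-1.0, 1.0), size=100):
    """Project an N x 3 point cloud (columns: x forward, y height, z lateral)
    onto a side-view binary image; the lateral axis is discarded."""
    x, y = points[:, 0], points[:, 1]
    # keep only points inside the configured field-of-view box
    keep = (x >= x_lim[0]) & (x <= x_lim[1]) & (y >= y_lim[0]) & (y <= y_lim[1])
    x, y = x[keep], y[keep]
    img = np.zeros((size, size), dtype=np.uint8)
    # forward distance maps to columns, height to rows (row 0 is the top)
    cols = ((x - x_lim[0]) / (x_lim[1] - x_lim[0]) * (size - 1)).astype(int)
    rows = (size - 1 - (y - y_lim[0]) / (y_lim[1] - y_lim[0]) * (size - 1)).astype(int)
    img[rows, cols] = 1
    return img
```

Because the limits fix the field of view, each pixel column corresponds to a known forward distance, which is what later allows the distance to the terrain to be read off the image.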
There are two main limitations in this pre-processing step. Firstly, while imposing the limits on the forward x and height y axes, all points are grounded with respect to the observed minima x_min and y_min. Due to this, the actual ground on which the user is standing is lost, and including such context could help differentiate the 7 classes better. For instance, when the user is very close to a single elevated step or curb, the only observable terrain might be the curb, so the minimum height y_min might come from the step alone. Secondly, depending on the position of the camera, artefacts such as the user's leg and foot might appear in the image, which could confuse terrain recognition. We introduce contextual features from the exoskeleton to handle these limitations and improve terrain feature extraction.

Contextual features from exoskeleton
To handle the above-mentioned limitations, we use sensor fusion to obtain robot information from the exoskeleton's joint encoders as contextual features. In our proposed approach, the camera is attached to the exoskeleton via a mount at its hip joint. Using the corresponding joint encoder, we can obtain the hip height of the user, which can be converted to the camera height from the ground (h_c). To implicitly encode h_c into the binary images, we change the height limits imposed by [10] and consider only points with h_c − 0.5 ≤ y ≤ h_c + 0.5.
Similarly, from the encoders we can also obtain the distance of both the right and left foot from the hip, which can be used to estimate the distance of each foot from the camera. Using the obtained distance (d_f), we impose a condition on the x-axis points so that the terrain observed up to the user's foot is removed from the binary image (x > d_f). This removes artefacts, improving feature extraction and terrain recognition performance. These contextual features form the backbone of the proposed approach: the perception module provides the terrain information to the exoskeleton and in turn receives the contextual features h_c and d_f from it.
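The two contextual filters can be sketched as a single pre-filter applied to the (x, y) points before projection. The function name is ours, and we assume a convention in which y is measured so that the ground plane lies near y = h_c; the 0.5 m window follows the limits stated above.

```python
import numpy as np

def apply_context(points, h_c, d_f, half_window=0.5):
    """Filter an N x 2 array of (x forward, y height) points using exoskeleton
    context: h_c is the camera/ground height from the hip encoder and d_f the
    forward distance of the user's foot from the camera."""
    # keep only heights within +/- half_window of the estimated ground level
    y_ok = (points[:, 1] >= h_c - half_window) & (points[:, 1] <= h_c + half_window)
    # drop points nearer than the foot to remove leg/foot artefacts
    x_ok = points[:, 0] > d_f
    return points[y_ok & x_ok]
```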

Line fitting and Terrain Feature Extraction
As mentioned earlier, the dimensionality reduction step does not ground the 3D points with respect to the actual ground; instead it is based on the minimum observed height. Due to this, points in the lower rows of the image correspond to points at a lower height in the point cloud. To implicitly encode the ground/camera height into the binary image, we assume that the ground is always at the 50th row of the image; that is, all points with y = h_c are mapped to the 50th row (the midpoint of the image). The above-mentioned height limits then automatically encode the ground and the terrain above and below it. The 2D images produced for each of the seven classes can be seen in Figure 1.
To recognize the terrain, we extract features using RANSAC line fitting [2]. Due to the implicit grounding, we can clearly differentiate whether the terrain is above the ground (obstacle, ascending stairs, up slope) or below it (gap, descending stairs, down slope). We divide the image row-wise into bands of 15 rows each, start with the ground band, and move either upwards (above-ground terrain) or downwards (below-ground terrain). In each band, RANSAC fits a line model from which we extract the following features.
• Starting point and width of the line.
• Minimum and maximum height of the line.
• Angle of the fitted line.
• Horizontal, vertical, or angled line, determined by imposing thresholds on the angle of the fitted line.
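Per-band line fitting and the feature list above can be sketched with a minimal RANSAC loop over (row, col) pixel coordinates. The iteration count, inlier tolerance, and angle thresholds here are illustrative choices, not the paper's tuned values.

```python
import numpy as np

def line_features(pixels, iters=200, tol=1.5, seed=0):
    """Fit a line to N x 2 (row, col) pixels with a small RANSAC loop and
    return the features listed above. Thresholds are illustrative."""
    rng = np.random.default_rng(seed)
    best = pixels[:0]
    for _ in range(iters):
        i, j = rng.choice(len(pixels), size=2, replace=False)
        p, d = pixels[i], pixels[j] - pixels[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0:
            continue
        # perpendicular distance of every pixel to the candidate line
        dist = np.abs(d[0] * (pixels[:, 1] - p[1]) - d[1] * (pixels[:, 0] - p[0])) / norm
        inliers = pixels[dist < tol]
        if len(inliers) > len(best):
            best = inliers
    span = best.max(axis=0) - best.min(axis=0)        # (row extent, col extent)
    angle = np.degrees(np.arctan2(span[0], span[1]))  # 0 = horizontal, 90 = vertical
    kind = "horizontal" if angle < 10 else "vertical" if angle > 80 else "angled"
    return {"start": best[:, 1].min(), "width": span[1],
            "row_min": best[:, 0].min(), "row_max": best[:, 0].max(),
            "angle": angle, "kind": kind}
```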
From the identified set of lines, we determine the ground-level line as the one closest to the ground height and within the ground band of the image. We use the above-mentioned features of this ground line as a reference. The ground width tells us whether the ground is visible beyond the other terrain lines, which is particularly useful for differentiating obstacles, stairs, and slopes. Since obstacles are small, the ground beyond an obstacle remains visible even as we get closer to it, so the ground width stays large. For ascending stairs and up slopes, the terrain is larger and the ground is not visible, reducing the width of the ground line. The angle of the terrain line then differentiates whether the terrain is a stair or a slope. Depending on whether terrain exists above or below the ground level, we can also differentiate ascending from descending stairs, up from down slopes, and obstacle from gap. But as mentioned earlier, it is possible that the ground line is not visible and cannot be identified by RANSAC. In these cases, we assume a virtual ground line of zero width at the 50th row, which lets us handle even cases where the ground is not visible. Once the terrain lines are identified, since the 2D image encodes the distance from the camera/exoskeleton, we can use the start point of the terrain line to know how far the terrain is from the user and adjust the gait accordingly.
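The decision logic in this paragraph can be summarised as a small rule set. The feature dictionaries, the "kind" labels, and the ground-width threshold below are our own illustrative names and values, not the paper's tuned parameters.

```python
def classify_terrain(ground, terrain, img_width=100):
    """Heuristic classification from the ground line and the nearest terrain
    line. 'terrain' is None when only the (possibly virtual) ground line is
    seen; terrain['above'] says whether the terrain lies above the ground."""
    if terrain is None:
        return "level ground"
    above = terrain["above"]
    if ground["width"] > 0.4 * img_width:
        # ground still visible beyond the terrain line: a small positive or
        # negative obstacle rather than stairs or a slope
        return "obstacle" if above else "gap"
    if terrain["kind"] == "angled":
        return "up slope" if above else "down slope"
    return "ascending stairs" if above else "descending stairs"
```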

INTEGRATION TO LOWER LIMB EXOSKELETON
In this section, we discuss how we integrate the proposed terrain recognition module with the Phoenix SuitX exoskeleton [7]. The exoskeleton is fitted with an NVIDIA Jetson Nano board powered by a battery supply and voltage regulator, which allows the user to move around freely without being tethered to a power socket. All the modules, including the proposed terrain recognition and the control module, run on the Jetson Nano, which is connected to a network via a WiFi adapter to allow remote access if necessary; the modules are linked via ROS. To test our proposed approach, an Intel RealSense D435i camera is mounted to the hip joint via a shaft of known size such that the camera faces downwards. The exoskeleton SDK provides the current status of the various joint encoders and the IMU in the exoskeleton. The contextual features, namely the camera/ground height (h_c) and the distance of the feet from the camera (d_f), are estimated using the robot information (from the SDK), the shaft size, and the camera inclination angle (from the camera IMU). Using these contextual features, our heuristic approach recognizes the terrain and provides the control module with the recognized terrain and the distance to it. We use a control strategy similar to [8] to approach and traverse the terrain. Before every new step is taken, the control approach queries the perception module for the terrain. Based on the recognized terrain, the control strategy modifies the gait parameters to successfully traverse each of the seven terrain classes detected.
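The per-step interaction between the control and perception modules described above reduces to a simple loop. The four callables below stand in for the actual ROS/SDK interfaces, which we do not reproduce here.

```python
def walk_course(perceive, set_gait, take_step, course_done):
    """Before every new step, query the perception module and adapt the gait.
    perceive() returns (terrain class, distance to terrain) as in Section 2."""
    while not course_done():
        terrain, distance = perceive()   # recognized terrain + distance to it
        set_gait(terrain, distance)      # e.g. shorten the step near a gap
        take_step()
```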

EXPERIMENTAL RESULTS AND DISCUSSION
The proposed framework includes point cloud pre-processing, contextual features, and terrain feature extraction using RANSAC. To evaluate the effectiveness of each of these components, we conducted two experiments. The first was conducted on simulated point cloud data to test the point cloud pre-processing and terrain features. In the second, the proposed framework was integrated with a lower limb exoskeleton and we tested whether healthy subjects could walk on different terrains with it.

Effectiveness of Terrain Features
To evaluate the proposed terrain features, we conducted experiments on simulated point cloud data that we generated ourselves. In these cases, we assume that the ground/camera height is available and hence can easily center the ground line on the center row. Figure 1 shows the 2D binary image generated for each class. The terrain characteristics, such as distance to the terrain, terrain width, height, and angle, were varied, as was the amount of terrain visible. Random noise was added to the point cloud distribution to simulate real-world scenarios. Table 1 reports the number of images generated for each class and the performance of the proposed terrain features on them. From Table 1, we can see that our method handles all of the terrain classes very well, with slopes being the most difficult for the system. For slopes, RANSAC estimates the slope angle, and we decide whether the line is horizontal, vertical, or angled based on angle thresholds we set. In the simulated data, the slope angle is varied from as low as 5 degrees up to 15 degrees; if the angle of the fitted line is estimated incorrectly, the slope cannot be identified. Based on this evaluation, we fixed the angle thresholds to allow detection of slopes as shallow as 5 degrees in the following set of experiments.

Integration with Exoskeleton
The proposed vision-based approach was integrated with the exoskeleton as described in Section 3 and tested with healthy subjects. In our experiments, we assume that only the right leg is affected. Subjects were required to complete a full course of ground, stairs, slopes, obstacles, and gaps. We had to cover our walking course, made of aluminium and steel, with non-reflective mats because the reflective nature of these surfaces affected the point cloud and consequently the proposed method. The objective of the experiment was to determine whether the proposed method can work while ensuring the balance and safety of the users. Using the proposed approach, the users were able to complete the course correctly without any issues. From an algorithmic point of view, we wanted to study the effect of the contextual parameters h_c and d_f. We observed slight variation between the h_c provided by the exoskeleton and by the camera; factors such as camera inclination and movement of the exoskeleton joints affected h_c and made it challenging to obtain ground truth. But since h_c is only used as guidance for centering the image, these slight variations did not affect the performance of our method. We plan to do a quantitative analysis as part of our future work. When d_f was not considered, the subject's foot could appear in the image, resulting in wrong terrain detection; by including the contextual feature d_f, we can successfully remove such artefacts and detect the terrain correctly. Figure 2 shows subjects walking with exoskeletons running our proposed heuristic vision-based terrain recognition method. Overall, 88.41% of real terrain images were correctly classified during our trials with the exoskeleton. In the future, we plan to include more contextual features from the exoskeleton to handle combined terrains, such as obstacles or gaps on slopes. Apart from the distance to the terrain, other characteristics such as height, width, and orientation would also have to be estimated. Instead of using line features, which might work well for obstacles like wooden blocks, we plan to extend to other shapes and incorporate shape characteristics for better terrain recognition and characteristic estimation.

CONCLUSION
In this paper, we introduce a computer vision-based terrain recognition method for lower limb exoskeletons, using heuristic features obtained from line fitting to detect more terrain classes than existing approaches. Our approach relies on active sensor fusion with exoskeleton encoders to obtain contextual features, such as the camera/ground height and the forward distance of the feet from the camera, to pre-process the point cloud. The contextual features help us implicitly extract discriminative features for terrain recognition. The identified terrain and its characteristics are used to modify the exoskeleton's gait, with good performance demonstrated in both offline and online tests.

Figure 1 :
Figure 1: Terrain images obtained from the simulated point cloud for each class. The implicit encoding of the ground helps to easily differentiate the terrains in the top row from those in the bottom row.

Figure 2 :
Figure 2: Testing the proposed terrain recognition integrated with the exoskeleton. All 7 terrain classes were successfully detected by the camera, shown in a red circle. Obstacle and ascending stairs are shown here.

Table 1 :
Performance of the proposed method on simulated data for the seven terrain classes