Tuesday, June 16, 2015

[HEX] Come on and SLAM

This is the third post on my Hexapod Project series. The point of this project is to build a robot that allows me to try out a few robotics concepts. For a listing of each post in this series, click here. In this post, I'll talk about using laser ranging to let the hexapod keep track of where it is relative to its environment.

Robots are Dumb

It's not hard to find examples of robots failing to perform tasks that a human might deem trivial. Even the entrants to the DARPA Robotic Challenge--while amazingly sophisticated in their design--sometimes fail to turn handles, walk through doors, or even just stand upright for very long. If you are unfamiliar with the subtle challenges of robotics, these failures might make you wonder why it takes so much money and time to create such a flawed system. If a computer can calculate the N-millionth digit of pi in a few milliseconds, why can't it handle walking in a straight line? The general answer to this is that the fundamental differences between the human mind and a computer cpu (neurons vs transistors, programming vs evolution, etc) create vast differences in the intrinsic difficulty of many tasks.

One such task is spatial awareness and localization. Humans are capable of integrating various senses (sight, touch, balance) into a concept of movement and location relative to an environment. To make my hexapod robot capable of autonomous navigation, it also needs to have a sense of location so that it can decide where it needs to go and where it should avoid going.

Arguably the most common way of letting a robot know where it is in the world is GPS. Satellites in orbit around the earth broadcast their position and the current time, and a GPS receiver receives these broadcasts and triangulates its position. A GPS-enabled robot can figure out its location on Earth to within a few feet or better (depending on how much money you spend on a receiver). The biggest issue with GPS is that a robot using GPS needs a clear view of the sky so that it can receive the signals being beamed down from the GPS satellites. GPS also doesn't give you any information about your surroundings, so it's impossible to navigate around obstacles.

For the hexapod, I wanted to avoid using GPS altogether. I chose to use a LIDAR unit for indoor localization, mostly because it seemed like an interesting challenge. LIDAR uses visible light pulses to measure the distance to objects just like RADAR uses radio waves bouncing off objects. A LIDAR unit contains a laser emitter/detector pair that can be swept across a scene to make measurements at different angles. At each angle, the unit emits a pulse of laser light and looks for a reflected pulse with the detector. The delay between emission and detection (and the speed of light) gives the distance to the reflecting object at that angle. High-quality LIDAR units (like those used in self-driving vehicles) can quickly give an accurate 3D scan of the surrounding environment.

More Lasers

The LIDAR unit I picked is taken from the XV-11 robotic vacuum cleaner from Neato Robotics. You can find just the LIDAR unit by itself as a replacement item on various sites; I got mine off eBay for around $80. The XV-11 LIDAR unit has a bit of a following in the hacking community, as it offers 360-degree laser ranging at 5 Hz for much cheaper than anyone else. While the unit isn't open source, there are some nice resources online that provide enough documentation to get started. It only scans in a 2D plane, but what you lose in dimensionality you gain in monetary savings.

Laser module in the lower left, sensor and optics on the upper right.

The LIDAR unit has just 6 wires to control the whole thing. Two are connected directly to the motor that spins the unit; two provide power to the laser and other electronics that do the ranging; and the last two provide a serial connection to the processor. Upon powering the main processor, the following text barrels down the data line at 115200 baud:

Piccolo Laser Distance Scanner
Copyright (c) 2009-2011 Neato Robotics, Inc.
All Rights Reserved

Loader V2.5.14010
CPU F2802x/c000
Serial WTD38511AA-0056881
LastCal [5371726C]
Runtime V2.6.15295

It's hard not to trust any device that sends a smiley face as an opening line. The introduction message isn't too informative, but the fact that it is sent as plain ASCII is comforting. The unit doesn't send any laser ranging data until it is spinning at close to 5 Hz, and the processor has no way of controlling the spin motor. By applying an average of around 3 V to the motor (I did PWM from a 12 V line), the unit spins up and raw data starts flooding down the line. The resources link above provides documentation for how the data packets are formatted, but the key points are that they contain some number of laser ranging measurements, error codes, and the current spin rate of the unit. This measured spin rate can be fed into a feedback loop for the motor controller so that it stays spinning at a nice constant 5 Hz.

I decided to have the Arduino Due on the hexapod handle communication with the LIDAR unit and keeping the motor spinning at the correct rate. The Due already handles communication between the main cpu and the Arbotix-M, so what's one more device? I soldered up a simple board that included an N-channel MOSFET for PWM control of the motor, and a LM317 voltage regulator to provide the LIDAR processor with around 3.3 V.

Motor connects on the left, LIDAR controller on the right. Bottom connects to Due.

The hexapod kit came with a mounting board for adding accessory hardware, but the mounting holes on the LIDAR didn't match up. I 3D-printed a little bracket to both attach the unit to the hexapod body and provide a little space for the board I had just made.

Credit to my Single Pixel Camera project for making me buy a better printer.

Attached to the top mounting panel of the hexapod.

With the small interface board mounted below the LIDAR unit, I connected everything up to the Due. A PWM line from the Due is used to control the speed of the motor, and the Serial2 port is used to receive data from the LIDAR processor. The 12 V needed to power the motor and processor come from whatever source already powers the main UDOO board. In testing, this was either an AC adapter or a 3-cell lipo battery.

I have no idea what I'm testing, but damn does it look technical.

The stream of data from the LIDAR unit is unfortunately pretty non-stop, so I had to write a code that parses little bits at a time to make sure none would get lost in the small input buffer of the Due. The Due bundles up bits of the data into packets and sends them off to the main cpu for further processing. To start out, I just plotted the results in gnuplot:

Rounded object at (50,100) is my head.

Each point is a single laser ranging measurement, and they span the full 360-degrees around the unit. A map like this can be made five times a second, allowing for a pretty good update rate.

A Sense of Self

At this point, the hexapod has the ability to continually scan its surroundings with lasers and accurately determine its distance from obstacles in any direction. But we still haven't solved the problem of letting the hexapod know where it is. By looking at the plot above, we can clearly see that it was sitting near the corner of a rectangular room. If we moved the robot a few feet in any direction and looked at the new map, we would be able to see that the robot had moved, and by comparing the two plots in detail we could even measure how far it moved. As humans, we are able to do this by matching similarities between the before and after plots and spotting the differences. This is one of those tasks that is relatively easy for a human to do and very tricky for a computer to do.

Using only the LIDAR scans, we want the hexapod to be able to track its movement within its environment. By matching new scans to previous ones, we can both infer movement relative to the measured obstacles and integrate new information about obstacles measured from the new location. The process of doing so is called Simultaneous Localization and Mapping (SLAM). There are many ways of solving this problem using measurements like the LIDAR scans I have access to. Some methods involve big point clouds, some involve grids. Some are 3D, some are 2D. One of the most common traits of any SLAM algorithm that I've found is that it is complicated enough to scare away amateur robotics enthusiasts. So in keeping to my goal of writing most (if not all) of the software for my hexapod, I set out to write my own algorithm.

My algorithm is not great, but it kind of works. I decided to do a 2D grid-based SLAM algorithm because a) my LIDAR scans are only in 2D, and b) point clouds are hard to work with. As the name suggests, a SLAM algorithm involves solving two problems simultaneously: localization and mapping. My algorithm keeps a map of the surroundings in memory, and given a new LIDAR scan performs two steps: matching the new scan to the existing map and infering where the scan was measured from; and then adding the new scan to the map to update it with any changes. As the Wikipedia article on the subject suggests, we have a bit of a chicken-and-egg problem, in that you can't localize without an existing map and you can't map without knowing the location. To solve this problem, I let the hexapod know its initial coordinates and let it collect a few scans while standing still to create an initial map. Then, it is allowed to step through the full SLAM algorithm with a map already set up.


For testing, I typically use a map with 128 pixels on a side, and each pixel represents 10 cm. After the initial set up where the hexapod is allowed to create a map in a known location, we might end up with a map like this:

My 350 by 400 cm workroom. My head is still at (50,100).

The value in each pixel can vary between [0,1], and roughly represents how confident we are that the pixel contains an obstacle. I'll go into how this map is made using LIDAR scans in the next section, so just assume it's representative of the environment. A new scan is made and needs to be matched onto this map with some shift in horizontal position and angle that represent the new position and heading of the hexapod. To do this, I've expressed it as an optimization problem. The algorithm tries to find the horizontal shift ($x$,$y$) and angle ($\theta$) that minimize the distance between each new scan point ($x_i'$,$y_i'$) and an occupied map pixel ($M(x,y)$):

\[ \Psi = \sum_i^N D(\left \lfloor{\tilde{x}_i}\right \rfloor, \left \lfloor{\tilde{y}_i}\right \rfloor) + a (x-x_g)^2 + b (y-y_g)^2 + c (\theta-\theta_g)^2 \]
\[ \tilde{x}_i = x_i' \cos \theta - y_i' \sin \theta + x \]
\[ \tilde{y}_i = x_i' \sin \theta + y_i' \cos \theta + y \]
\[ D(x,y) = \sum_{x'} \sum_{y'} M(x',y') \sqrt{(x-x')^2 + (y-y')^2} \]

Here, we project each LIDAR measurement ($x_i'$,$y_i'$) on to the SLAM map, adjusting for the current best guess of ($x$,$y$,$\theta$). At each projected point, we sum up the distance from that point to every occupied pixel of the SLAM map. This gives us an estimate for how 'far away' the projected scan is from matching the existing map. The three extra terms on the $\Psi$ equation are to bias the solution towards guess values for ($x$,$y$,$\theta$).

In this way, we are finding the location of the hexapod so that the new LIDAR scan looks most like the existing map. The assumption being made here is that the new scan is similar enough to the existing map that is can be matched with some confidence. Solving the above equation is a problem of non-linear optimization, similar to the inverse kinematics solved in the previous post. The code to solve this problem is a little dense, so I won't try to explain all of the details here. The relevant code is here, and the relevant method is slam::step(...);.

In words, we compute the $\Psi$ equation above and how it changes if we modify each of the parameters ($x$,$y$,$\theta$) by a small amount. Using this information, we can nudge each parameter by an amount that should get us to a lower value of $\Psi$. Since the problem is non-linear, we aren't guaranteed that this gets us to the lowest possible value, or even a lower one than before. To help make sure we end up in the right place, we initialize the solver with a guess position based on how the hexapod legs have moved recently. Since we went through so much trouble in the previous post to plan how the feet move, we might as well use that knowledge to help the localization solver. From there we iterate the nudging step over and over again with a smaller nudge until we find there is no way of nudging it to a lower value of $\Psi$. This is when we stop and say we have found the optimal values of ($x$,$y$,$\theta$). With that, the localization step is done!

For computational efficiency, I keep three versions of the SLAM map other than the normal one shown above. The first extra one is the original map convolved with a distance kernel, which at any position gives us an approximate distance to occupied pixels. The next two are the gradient of this distance map, one for the x-component and one for the y-component. These maps allow us to quickly evaluate both the $\Psi$ function and its derivatives with respect to ($x$,$y$,$\theta$). The distance map is computed in Fourier space using the convolution theorem, using threaded FFTW for computational speed. This method doesn't actually give us the correct distance measure for $\Psi$, but it's close enough for this basic algorithm.


The companion to localization is mapping. Once we have a solution to where the new scan was measured from, we need to add it to the existing SLAM map. While we have assumed the new scan is close enough to the existing map to be matched, it will have small differences due to the new measurement location that need to be incorporated so that the following scan is still similar enough to the map. In my SLAM code, the method that does the mapping step is slam::integrate(...);.

Each new laser ranging measurement from the new scan is projected on to the SLAM map given the estimated hexapod location from the localization step. The pixel below each point is set to 1.0, meaning we are fully confident that there is some object there. We then scan through every other pixel in the map and determine whether it is closer or farther away from the hexapod than the new scan measurements. If it is closer, we decrease the map value because the new scan measured something behind it, meaning it must be free of obstacles. If the pixel is farther, we leave it alone because we don't have any new information there; the new scan was blocked by something in front of it.

Once this mapping step is done, we have completed the two-part SLAM algorithm and are ready for another LIDAR scan. It's not the best or most accurate method, but it is easy to understand and can run on fairly low-level hardware in real-time. I've written the algorithm to run asynchronously from the main hexapod code, so new scans can be submitted and the hexapod can still walk around while the SLAM algorithm figures out where it is. On the UDOO's Cortex-A9, I can step a 1024x1024 map in around 2-3 seconds. With a 10 cm resolution, this gives over 100 meters of mapping. In practice, I've found that 10 cm is about the coarsest you can go in an indoor environment, but anything less than 3 cm is a waste of computing time.


To demonstrate graphically how my SLAM algorithm works, I've written a little JavaScript app to show how LIDAR scans relate to the SLAM map. It doesn't actually go through the whole algorithm from above, but it does show roughly what happens. The GLOBAL view shows the simulated robot moving through an environment, measuring distances to obstacles with a simulated LIDAR scanner. The ROBOT view shows what these LIDAR measurements look like from the minds-eye of the robot. It doesn't know that it is moving; instead, it looks like the world is moving around it. The SLAM view shows a simulated SLAM map and the approximate location of the robot. Move the ERROR slider back and forth to simulate varying amounts of noise in the measurements. These reduce the accuracy of both the localization and mapping methods.

Canvas not working!

I've also tested this algorithm out in real life with the hexapod. The following SLAM maps were collected by driving the hexapod around my apartment with a remote control. I started the hexapod out in one room, and it was able to walk into a different room and keep track of its position. The maps are pretty messy, but acceptable considering the simplicity of the algorithm being used.

Noisy map of my apartment.

It's not the best SLAM algorithm in the world, but it's relatively easy to understand and compute in an embedded setting. It seems to do best when the hexapod is inside a closed room and can see at least two of the walls. It has some issues keeping track of position when transitioning between rooms, mostly due to the sharp changes in the LIDAR scans when passing through a doorway. Still, it does a reasonable job at keeping track of the hexapod location within my apartment.

In the next post, I'll sort out how to get the hexapod to navigate autonomously. With the algorithms presented so far in this hexapod series, it becomes a straightforward procedure of using the SLAM maps to find optimal paths to pre-determined waypoints.