Sunday, October 19, 2014

Single Pixel Camera

Please forgive me for the lack of new content over the last five or so months. After my most recent post about neural networks and quadcopters, I decided to work on a couple smaller projects. First I replaced the bulb in a standing lamp with RGB LEDs, added an Arduino Pro Mini and an RTC, and programmed the lamp to change color temperature based on the time of day. This wasn't really worth a whole blog post by itself, so I moved on to the next project. I then taught myself to sew and reupholstered my couch with white canvas. No electronics, no programming, just lots of sewing. Again, not really worth a blog post. In all, I had spent most of the summer on small projects that couldn't really top the last few posts I've done.

But I would be remiss if I did not provide something to show for the last five months. So here is a picture of our cats, Fitzcarraldo (left) and Alan "Bonesaw" Turing (right). Alan was born June 2014, so I guess he happened too.

An offering of cats to please the internet gods.

But to the topic at hand: I've been interested in cameras and image processing for many, many years. I've written all kinds of code to filter, sort, encrypt, convolve, and de-convolve images, and I've built gadgets that were made to be photographed. But one thing I hadn't done until this project was build a camera.

What's the easiest way to build a camera? My guess would be to build a simple pinhole camera with real film. All you would need is something like a shoebox and some film. Assuming that you could figure out how to handle the film in a dark enough room so it didn't get exposed to light before you wanted it to, and that you had a place you could take the film to get processed, this might be the simplest way of building a camera.

But I'm picky, and wanted a digital color camera. Most modern digital cameras work on the same principle: replacing the sheet of film from our simple pinhole with a dense array of light sensors, and replacing the pinhole with a glass lens (or a set of glass lenses). The lenses are used to gather more light than the pinhole could allow and focus the light on to the sensor array. The array contains millions of sensors (pixels), each with either a red, green, or blue filter in front. You point the camera at the object you wish to take a picture of, focus the lenses, and tell the pixel array to gather light for a little while. When you read out the values each pixel has recorded, and use a bit of software to arrange the colors and pixels on a screen, you see the image you want. Simple enough!

But let's say I didn't want to cough up the money for a pixel array or even lenses. In fact, all I could afford was a single color light sensor, some basic electronic components, and whatever I could print on a 3D printer. How do we make the MacGyver of digital cameras? First we need a hand-wavy theory of how to build an image with only one sensor.

Hand-Wavy Theory

Imagine you are standing outside in a field on a clear day with the sun shining in front of you. Why are you in this field? I don't know, maybe you are having a picnic. When you stare forward, your eyes help your brain form an image of the scene in front of you. But your eye has a lens and a dense array of sensors (your retina), and we don't want that luxury. So you close your eyes. Your entire retina is reduced to picking up just the light that hits your eyelids, giving you just one measure of the scene in front of you: the total integrated light hitting your face. In this single-pixel scenario, you measure the light from every object in the scene all at once, but you don't know where each object is or how much of the total light came from it. But you can start to build a mental image of where things sit by using your hand. Still standing in a field with your eyes closed, you hold your hand out to block light and wave your arm around (all good hand-wavy theories involve actual hand-waving). As your hand blocks light from different angles, the total light hitting your face is reduced somewhat. By knowing where you've placed your hand in front of you (assuming decent spatial awareness), you can get an estimate of how much light was coming from that direction. For example, you can figure out what direction the sun is by waving your arm around until you notice most of the light has disappeared. Then you know that the sun is in the direction connecting your face to your hand, and extending off into the sky.

Using this principle, we can build a digital camera with a single pixel as long as we can also build some kind of 'arm' that can be moved around in front of the sensor. The sensor continuously records the amount of light hitting it while the arm moves around in front. A computer reads the sensor measurement, considered the arm placement, and builds up an image in memory. The next question is: how well can this method actually produce an image?

Testing the Theory

A good first step to any potentially crazy idea is to simulate it on a computer. If it doesn't work there, either your simulation is crap or the idea is bad. If it does work, either your simulation is crap or the idea might work in the real world. While these options don't make the simulation seem very worthwhile, they are often much simpler than trying things in real life, and can save a lot of money and tears.

Below, I've provided a couple little programs that demonstrate the theory of producing an image by blocking light in a systematic way. The first simulates a light-blocking arm sweeping across the field of view of a sensor, along with a plot of the total light collected as a function of arm position. The panel near the top right shows the color corresponding to the given combination of red, green, and blue at any given time.

This gives us an idea of the data we can record by moving a blocking arm back and forth across the field of view. Next, we not only vary the position of the blocking arm, but also the angle at which it moves. By sweeping the arm across at many different angles, we can build up a two-dimensional plot of the total integrated light hitting the sensor as a function of position and angle. This is shown in the next program as a color map.

The resulting image is fascinating, but doesn't look like the original scene at all. At each arm angle (plotted along the x-axis), the same object in the scene will be 'recorded' at a different arm position (plotted along the y-axis). This causes each object to become a stretched out and curved object in the recorded image. Luckily, there is some clever math that can get us from the stretched out recorded image to a nice reconstruction of the original scene.

Radon Transforms

When I first looked at one of the simulated recordings from this simulation, I was ecstatic. I immediately recognized it as the Radon Transform of the original scene. Admittedly, this is not a normal thing to immediately recognize. A significant portion of my PhD work is spent thinking about similar transforms, so I have things like this on my mind.

Why care about transforms like this? An integral transform is essentially a method for taking a signal and mushing it up in a particular way that makes the result more understandable. The most common example is the Fourier Transform, which is useful for breaking down a signal into waves. An example of when a Fourier Transform might be useful is when analyzing an audio source. This transform will separate the audio into its component frequencies and tell you how much 'power' is in each separately. The Radon Transform is a little more obscure, but one example of its usefulness is detecting hard edges in an image. An important feature of many integral transforms is that they can be inverted to give you back the original signal. The inverse of the Radon Transform (also called a filtered back projection) is most commonly used in CT scans, where a sensor array measures the projected image of someone's insides and the final image is reconstructed from many projections.

Our simulated sensor-and-arm camera has given us the negative Radon Transform of the scene it tried to image. All we need to do is perform the inverse of the transform to the recorded image and we should get an image of the original scene. One issue with this procedure is resolution. When making the camera, we need to pick the width of the blocking-arm, how finely we can move it across the scene, and at how many angles we choose to do this at. All three of these choices determine the resolution of both the recorded transform and the final reconstructed image. After a bit of playing around with some test images, I settled on a resolution that would keep the total scan time for the camera reasonable.

With a solid theory in place with fancy enough math behind it to look impressive, I could now continue on and build the single pixel Radon Transform camera in real life.

Constructing the Camera

The main components I used to build this camera were:
 - Arduino Pro Mini
 - Color Light Sensor
 - MicroSD Card Board
 - 2 Servos
 - Battery Pack

The SD card board and battery pack added a bit to the cost, but were added to make the camera portable, with the hope I could take the camera outdoors and take some pictures of mountains and fields and things. They were not necessary for indoor, tethered use of the camera.

The first big hurdle was designing the parts to be printed. I'm not an engineer, I'm a scientist. I don't even enjoy saying that physicists are good at doing an engineer's job. Making things that don't immediately self-destruct is non-trivial. This was probably the most complicated thing I've had to design, and the number of revisions I had to go through shows that. About half-way through, my 3D printer died an ungraceful death (after a year of use and a few pounds of filament), so I upgraded to a better printer and enjoyed a marked increase in print quality by the end.

Printed in 8 parts, in case you were wondering.

Left: old dead printer. Right: new not-dead printer.

Various attempts at being an engineer.

Glam shot with all electronics magically completed.

MicroSD card reader on the side.

Attached to my tripod, ready for testing.

The circuit to control the camera was as simple as I could make it. I soldered the Arduino board to a protoboard and added connections for the servos, sensor, battery pack, and SD card board. Once I confirmed that everything could be controlled by the Arduino, I moved on to the code.

The main problem I encountered while building the camera was the fact that the servos would not rotate the full 180 degrees I expected to see. In fact, using the default Servo library for Arduino and the standard servo.write(); command, I only saw about 100 degrees of rotation. As it turns out, different servos have different standards for how to make them turn to various positions. The Servo library assumes that a pulse-width of 1000 us corresponds to 0 degrees and a pulse-width of 2000 us corresponds to 180 degrees. In the servos I bought, these pulse-widths corresponded to about 70 degrees and 170 degrees, respectively. By manually sending a pulse-width of 2100 us, I could get the servos to rotate close enough to 180 degrees, but getting the low-angle end was trickier. At a pulse-width of 450 us, I was getting down to around 30 degrees, but any lower than that caused the servo to swing to zero and bounce around. My guess is that the internal servo electronics aren't capable of handling pulse-widths shorter than about 450 us.

So in the end, I could only get around 150 degrees of rotation out of the servos. This wasn't too much of an issue for the servo that moved the blocking arm in front of the sensor, since the sensor probably couldn't tell if light was hitting it from those extreme angles anyway. But the limited range was a significant problem for the other servo that rotated the whole arm contraption around the axis of the sensor. This limitation is like chopping off the right end of the simulated Radon Transform images in the above animations. Without information from some band of angles, the image reconstruction is unconstrained and will show significant banding. I thought about this problem for a few days before running out of ideas and searching the internet for a solution. The good news is that this problem is not uncommon and there are research groups around the world thinking about it. The bad is that there isn't some magical way to infer the information lost at those angles, only methods to mitigate the artifacts introduced in the inverse transform. The best solution is to record as many angles as possible and be aware of the limitations of what you've recorded.

Programming the Processor

I decided to save the inverse Radon Transform computation for post-processing on my workstation, so that all the camera had to do was record the sensor values and store them in a sensible way. This led to a fairly simple code flow for the on-board controller:

I've excluded some bits of code that do things like set the servo position and write data out, mostly to simplify the code above. These extra bits aren't too important and can be done in a variety of ways.

Here is the flow of how the camera works: once the camera initializes, it determines the correct sensor gain and begins scanning the scene. Each measurement is saved to a file on the SD card. I collect this data and move it to a file on my laptop for post-processing. Before the raw data can be transformed from Radon-space into real-space, it needs to be adjusted.

One issue with the camera is the long exposure time. During the ten or so minutes it takes to record a single image, the lighting in the scene can change dramatically. The biggest cause of lighting changes I ran into was clouds. This kind of change in lighting would cause the recorded data to vary in brightness over time, which would imprint into the final image as a variation with position. To ameliorate this, I computed a weighted mean of the total brightness at each arm angle and used it to compute the negative image. This way, the light could change slowly over time and the moving average would account for it.


The first test scene was of a lamp and a red flashlight. This provided a few simple diagnostics, like checking to see if a bright point source would get smeared at all in the final reconstruction.

The first panel shows the raw values from the camera mapped as a function of arm angle and position and scaled between 0 and 255 in each color channel. The second panel shows the negative scan, where I've gone from measuring the total light at every arm position to measuring the light blocked at every arm position. The third panel shows the inverse transformed scan, which is ideally the reconstruction of the actual scene. The final panel is a comparison image taken with a normal camera.

The raw scan and negative show exactly what we would expect from the Radon transform. The two main objects show up as smooth curves as a function of arm angle. The resulting image isn't quite as clear. You can tell that it picked up on the bright light and a bit of the red light, but there is a huge amount of aberration present, particularly around the bright light. This butterfly-looking pattern shows up when the full 180 degrees of arm rotation aren't recorded. The angled lines radiating from the light show you the limits of angle the arm could achieve. With this limitation in mind, I moved on to recording more complex scenes.

Close enough?

Next up was a few objects that were sitting around on my desk. I wanted to know how well the camera could record different colors, so I placed some yellow sticky notes and red wire next to the blue painters tape I had on my printer. The raw scan and negative look pretty sensible, and the resulting image actually shows some resemblance to the real scene. You can pick out the blue region, the bright yellow spot on the right, the red blur on the bottom left, and even the red printer nozzle above the blue region. The whole image looks like it was taken underwater or something, but then again, it's an image made by a single-pixel camera. I'm surprised it works at all.
View from my balcony.

Next up, I took the camera out to my balcony and took a picture of my morning view. No simple point sources to make things easy, but the result looks pretty good. The raw scan and negative look nice and smooth, and the result has most of the main features present in the real scene. You can see the horizon, the tree right in the center, and even a little of the parking lot in the bottom right. Of course, I spent time and money making the camera portable, so next up I needed to go on a little photo trip.
Living in Colorado has its benefits.

One of the nice things about living in Boulder, CO is that you are never far away from beautiful scenery. I took the camera up the Front Range and let it record a view of the Rocky Mountains in the distance. The raw scan and negative look pretty good, but have a lot of sharp jumps as a function of arm angle. I'm really not sure where these came from, but I suspect either my battery pack was dying or my servos were giving out. Even with these problems, the resulting image looks great.

So there you have it, a single pixel camera that takes color images. It may not produce the highest quality photographs, but it's certainly enough of a proof-of-concept to be worth a post. Given some more time and money, a better version of this camera could be made to take better pictures. But I'll leave that as an exercise for the reader.

This project took much longer for me to complete than it would have a year ago. As I inch closer to finishing my PhD work, I can afford less and less time to projects outside of work. I think it's time to say that I won't be updating this blog any more, or at least until I'm done with school. Of course, I say this, but who knows what kind of small projects I'll be able to do in a day later on. Fortunately, I think this blog has had a good run, from Daft Punk helmets to LED Planets, from Neural Networks to Robots that Slap People. While this may just look like a collection of eccentric and technical projects, it has really helped me figure out what I might be good at (outside of science) and what I enjoy doing. I know my documentation here has helped a few people with their own projects, and I hope they continue to do so in the future. Thanks for reading!

Wednesday, May 21, 2014

Quadcopter Stability and Neural Networks

This is a fairly long post, so here's a TLDR: Quadcopters use an algorithm called a PID controller to not fall over in the air, but a more complicated algorithm involving neural networks may be just as stable if not more.

A few times now, I've mentioned my desire to write a post about the hexacopter I build long ago. The project started almost two years ago and never really finished. The goal was to not just build a hexacopter for my own personal use, but to build and program my own flight controller from scratch.

Building a hexacopter or quadcopter using a standard pre-programmed flight controller is not too hard. You need a frame, a battery, motors, propellers, speed controllers (motor drivers), a flight controller, and other miscellaneous electronics (radio, gps, etc). The flight controller tells the speed controllers how fast each motor and propeller should be spinning. It monitors the orientation of the craft using an accelerometer and gyroscope and continually adjusts the motor speeds to keep the platform stable. If for some reason the copter tips to the left, the flight controller needs to bump up the thrust on the left side to rotate the body back to a level position. The flight controller can also intentionally tip the body in some direction in response to commands received from the radio link.

Now, why have I used the word quadcopter (four propellers) in the title of this post instead of hexacopter (six propellers)? Because in terms of the flight controller, there is little difference. Sure, the process of assembling a hexacopter is different than that of a quadcopter, but only in that the frame is a little different and there are two more of a couple components. Quadcopter is a generally more recognized term, so I'll stick with what from now on.

1 - A Problem with a Solution

So what's the most important part about making your own flight controller? Giving it the tools to keep the quadcopter flying. Handling radio commands, GPS coordinates, data logging, and pathfinding are really all extraneous tasks compared to the ability to stay in the air without flipping over and nose-diving into the ground. Any amount of force from wind or otherwise will cause the quadcopter to tip in a random direction. It's up the flight controller to detect these forces and compensate for them by adjusting how fast each propeller turns.

To simplify my discussion on how a quadcopter can stabilize itself, I'll reduce the problem some. Imagine a solid rod with a motor and propeller pointed up at one end, and an unmoveable hinge at the other end. Gravity pulls the arm down, and the thrust produced by the motor and propeller pushes the arm up. A sensor can measure the current angle of the arm as well as a few other related quantities. The goal is to develop an algorithm that will cause the arm to spend most of it's time perfectly horizontal. The only way to influence the position of the arm is to adjust the motor speed, so you can think of the algorithm as having a single output.

This situation has many similarities to the problem of quadcopter stability, but also a few key differences. The primary difference is that the arm can only rotate itself around the hinge in one direction, while a quadcopter can leverage motors on opposite sides of the body to have controlled rotation in every direction. To make this simple problem more applicable, I'll allow the motor to spin either way, providing thrust either up or down.

Instead of attempting to derive an optimal algorithm to achieve this goal based on the underlying physics, I'll go ahead and jump to a well-accepted solution: the PID controller. This is an algorithm that takes sensor measurements of the arm and returns a motorspeed. This process of turning sensor measurements into motor speed is typically done a few hundred times per second. How does a PID controller work? The best way to explain it is to explain the name:

        P = Proportional
    Adjust the motor speed proportionally to how far away the arm is to the target position. The farther the arm is below level, the higher the motor speed will be set to. If the arm is perfectly level, the proportional term has nothing to add.
        I = Integral
    Adjust the motor speed to account for any systematic difference between the arm and the target. If the arm tends to droop below the target position, increase the motor speed. The quantity used to determine this is the integrated difference between the arm position and the target position.
        D = Derivative
    If the arm is rapidly approaching the target, slow it down to avoid overshooting. The derivative of the difference between the arm position and the target position is used here.

These three components of the algorithm are computed at every time step and added together to come up with the 'optimal' motor speed that will get the arm to the right position. Each component has a tuning parameter that can increase or decrease their relative importance. Given proper tuning parameters, the PID controller can be a very effective method of stabilization.

Why does the PID controller work? How do these three components make sense? To answer these questions, we start by describing the stability problem with math. The governing equation for the arm rotating about the base hinge is Newton's Second Law for Rotation:

\[ \sum_i \mathscr{T}_i = I \ddot{\theta} \]

where $\mathscr{T}_i$ is a torque exerted on the arm about the hinge, $I$ is the moment of inertia (has to do with the mass of the arm and it's size), $\theta$ is the position/angle of the arm, and the double dots indicate the second derivative with respect to time. The left hand side is a sum over many torques because there will be multiple forces at work. In this language, our goal is to have $\theta = 0$, and we can apply a torque $\mathscr{T}$ using the motor and propeller to get us there. Gravity and wind can also apply torques, throwing the arm away from the solution we want.

The first torque to include is gravity, which exerts itself the most when the arm is horizontal:

\[ \mathscr{T}_{grav} = -g m L \mathrm{cos} \theta \]

Here, we approximate the arm and motor as a mass $m$ at a length $L$ away from the hinge. I could elaborate this model and go back to express the moment of inertia $I$ in terms of the mass and length, but it adds little to the discussion. The next source of torque is that exerted by the motor and propeller. If we use the PID controller, the torque applied is a function of the arm angle along with the derivative and the integral:

\[ \mathscr{T}_{PID} = f(\theta, \dot{\theta}, \int \! \theta dt) = P \theta + I \int_0^{\infty} \! \theta dt + D \dot{\theta} \]

Combining these pieces together, we get an equation describing the battle between gravity and the PID controller to stabilize the arm:

\[ I \ddot{\theta} = -g m L \mathrm{cos} \theta + P \theta + I \int_0^{\infty} \! \theta dt + D \dot{\theta} \]

The $\mathrm{cos} \theta$ term makes things a little complicated, but if we assume the arm won't deviate very far from horizontal, we can approximate this term as constant. Rearranging terms and collecting constants for readability, we end up with a nice textbook-looking 2nd order inhomogeneous integro-differential equation:

\[ a \ddot{\theta} + b \dot{\theta} + c \theta + d \int_0^{\infty} \! \theta dt = e \]

If there's one thing I've learned in my many years of schooling about integro-differential equations (and there was in fact only one thing I learned), it's that they are a pain and should be solved numerically. But before we give up on mathematical beauty and resort to number crunching, we can gain a bit of intuition for how the system acts under various conditions.

If we turned off the PID controller completely, we would end up with a very simple equation to solve. Unfortunately, this solution involves the arm rotating a significant ways away from horizontal, which breaks our earlier assumption. In that case, the equation we would have solved would no longer be valid. Instead, we will start by turning off the I and D components of the PID controller. With that, we are left with:

\[ a \ddot{\theta} + c \theta = e \]

This is a simple inhomogenous second order differential equation that has a correspondingly simple solution:

\[ \theta(t) = A \mathrm{cos}(\sqrt{c/a} t + \tau) + e/c \]

This is a simple harmonic oscillator. Depending on the initial conditions, the arm will bounce endlessly around some angle a little below horizontal. Plotting the arm angle as a function of time for some reasonable values of the various coefficients, we can see that this solution is not exactly optimal:

The next step in understanding the full PID solution is to use the P and D terms, but keep the I term off. This produces a similarly simple equation that can be solved using standard freshman-level differential equation solving methods:

\[ a \ddot{\theta} + b \dot{\theta} + c \theta = e \]

\[ \theta(t) = A e^{C_- t} + B e^{-C_+ t} + e/c \]
\[ C_{\pm} = \frac{\sqrt{b^2 - 4 a c} \pm b}{2 a} \]

While this solution might seem much more complicated than the previous, it is primarily because I have decided to express what could be a simple sine wave as a complex exponential. Plotting this solution for a particular set of initial conditions demonstrates the character:

Including the D term in the PID controller has helped damp out the oscillations from the previous example. A lot is known about damped oscillators, including the fact that certain values of the PID coefficients P and D will cause the system to be 'critically damped'. This means that the system will achieve a steady state as fast as possible. Below is a plot showing three different cases: under-damped, critically damped, and over-damped.

This version of the algorithm does a decent job at stabilizing the arm, but still tends to settle on an angle slightly below horizontal. To fix this, we have to turn back on the I component of the PID controller. The full integro-differential equation can be solved for certain values of the coefficients, but it's difficult to gain a fundamental understanding of the system by looking at the solution. Instead, it's better to reason out what the final term does.

By looking at the previous solutions, we can see that even when the arm was stable, it would settle at an angle below horizontal (the solid black line). If we were to sum up how far negative the arm was over time, the sum would continue to grow as long as the arm sat below horizontal. The purpose of the I component of the PID controller is to bump up the motor speed the more this sum of how far the arm is from horizontal. If this additional term happens to bring the arm back to horizontal, the sum of how far the arm has been below horizontal will stay at whatever value it was before the arm was restored to the proper position. This way, the I component describes an offset that only has to be determined once by looking at how the arm tends to settle. Comparing a properly tuned PID controller to the critically damped PD controller from before, we can see how the new component affects the solution.

Finding the values of the PID coefficients that give an optimal solution such as the one shown requires either knowing the mechanical properties of the system to a high precision, or a decent amount of guesswork. The usual method for finding the appropriate coefficients for a real world system is the latter. The procedure is similar to what I have done here: turn on one component of the PID controller at a time, tuning them independently until the solution is as good as possible without the next component.

2 - Solution to the Wrong Problem

To show how the PID controller is useful for stabilizing a system, I have had to simplify the problem and ignore various effects. This has allowed me to solve the governing equations and explain why the PID controller works, but my 'solution' is then only truly applicable to the idealized scenario. What happens to the PID controller when we add some complications, like wind or noise? Wind will act as a random torque on the arm in either direction at any time. The algorithm can't predict the wind, and it can't even measure the wind directly. All it can do it notice that the system has suddenly and unexpectedly gone off course and corrections need to be made. Any sensor used to measure the arm angle and inform the algorithm will have noise, a random variation between the true answer and the reported answer. How well can the PID controller handle incorrect information? Finally, My model has assumed that the algorithm can assimilate sensor measurements and adjust the thrust instantaneously. In the real world, there will be a delay between the arm moving and the sensor picking up on it, a delay due to the computation of the motor speed correction, and a delay between telling the motor to change speed and the propeller actually generating a different amount of force.

These three non-ideal effects (wind, noise, delay) are difficult to model mathematically. It's certainly possible to do so, but the amount of fundamental insight gained from such an analytic solution is limited. Instead, we can turn to a numerical simulation of our system. I've written a simple Javascript code that simulates an arm and motor just as I have describes in the equations above, but can also add wind, noise, and delay in varying amounts to the system. I've initialized the PID coefficients to a reasonably stable solution so you can press RUN and see what happens. Clicking the circles next to WIND, NOISE, and DELAY will increase the amount of each present in the system, and clicking the POKE button on the left will nudge the arm in a random direction to test the stability. Clicking the circles next to P, I, and D will toggle each component. The sliders next to them determine the coefficients, but will only apply to the system if the circle next to them is green. Pressing RESET will put the arm back at horizontal, but will keep the PID coefficients the same.

If you are feeling adventurous, try setting the PID coefficients randomly and then tuning the PID controller from stratch. If the arm just flails about randomly, press RESET to calm it down. The buttons next to PID and NN determine which controller is being used to stabilize the arm. I'll describe what the NN controller is doing in the next section.

Canvas not working!

3 - A Smarter Solution

The PID controller does a decent job at stabilizing the arm under certain conditions. If it gets pushed around too much by noise, gets too many flawed measurements, has too large of a delay, or simply ends up outside it's comfort zone, it tends to have trouble. Is there another option?

I, like many others, have recently finished up with Andrew Ng's Coursera course on Machine Learning. A significant fraction of my PhD research has utilized large nonlinear optimization codes, so I found most of the details of machine learning presented in the course to be pretty straightforward. The biggest concept I was able to take away from the course was that of an artificial neural network. There seems to be a lot of mystery and awe when talking about artificial neural networks in a casual setting, but I think this is largely due to the name. If I were to rename neural networks so something a little more down-to-earth, I would call them Arbitrary Nonlinear Function Approximators. Not nearly as magical sounding, but a little more to the point. But until I am the king of naming things, I'll call them neural networks.

What does a neural network have to do with stability? The PID controller was attempting to model the 'optimal' response to a set of inputs that would stabilize a rotating arm. We know that the PID controller is not a perfect solution, but it seems to have some ability to mimic a perfect solution in some cases. We might imagine that the truly optimal solution would be something far more complex than the PID controller, but we don't really have a way of knowing what that solution is. Not saying we can never know, it's just really hard to figure out. A neural network is a useful tool for approximating an unknown nonlinear function, as long as we have some examples of what the function looks like at various points.

A neural network works by taking a set of inputs, combining them back and forth in all kinds of nonlinear ways, and producing a final output. The way in which inputs are combined with each other can be varied incrementally, allowing the network to 'learn' how to appropriately combine them to match a specified output. Given enough examples of the function it needs to approximate, the neural network will eventually (and hopefully) converge on the right answer. This is of course a gross simplification of how a neural network works, but the point of this post is not to delve into the specifics of sigmoid functions and backpropagation. If you want to know more, use the internet! That's what I did.

So let's say a neural network could replace the PID controller. How do we train the network to know what the right answer is? The only example we have so far to learn from is the PID controller, but if that's the only information we give it, we might as well just use the PID controller and avoid the confusion the neural network will add. We need a way of letting the neural network controller (NNC) explore possibilities and recognize when it has come across a good part of the solution. To do this, we use reinforcement learning. Specifically, I've generally followed the procedure put forth in this paper by Hasselt and Wiering for using neural networks to perform reinforcement learning in a continuous action space. It was possible to do this with a discrete action space, but I wanted to generalize the problem for other projects.

With the knowledge that an artificial neural network can be used to approximate a complicated and unknown function given examples of the function for different inputs, how can we apply this to the problem of learning stability? The idea behind reinforcement learning is that you are training an 'actor' (the part of the algorithm that looks at inputs and decides on the optimal action) by giving it rewards or punishments. The reward is a single number that is used to communicate the idea of how good the previous action that the actor performed was. For example, you could reward the actor in this situation one point for every second that it keeps the arm near horizontal, but take away a hundred points if the arm falls down. The actor is continually building up an idea of the appropriate function that translates the current state of the arm into an optimal action through trial and error. At every step, the actor either picks the action it thinks is optimal based on it's current neural network or picks a random action close to the optimal guess. Since the neural network can only provide an approximate answer for the optimal action while learning is in progress, the randomness allows for exploration of new possibilities that may be better. After performing an action, an appropriate reward is returned. If this reward is better than what the actor anticipated getting for a given action, then the action taken is reinforced. There are a few other details to this method that I'm not mentioning, but if you are interested, start by reading up on Temporal Difference Learning.

Given enough attempts to find the optimal action to achieve stability, the artificial neural network controller will (hopefully) converge on a function that can take the same inputs given to the PID controller and output an optimal motor speed to keep the arm stable.

4 - Learning vs Knowing

I wrote up a simple code to perform reinforcement learning on a simulation of the arm using the FANN library to simplify my use of artificial neural networks. As inputs, I provided the same quantities that the PID controller uses (current angle, time derivative of angle, integral of angle) and had the desired motor speed as the only output. To initialize the NN controller, I preconditioned the actor by teaching it to approximate a linear combination of the inputs, resulting in a standard PID controller.

The following animation shows the progression of learning over many iterations. Since the actor starts without knowing anything about the appropriate action to take, it spends a lot of time completely failing to keep the arm anywhere near horizontal. As more simulations are run and the actor explores new possibilities, it picks up on the rewards reaped for keeping the arm in the right place and starts to make a concerted effort to keep the arm in line.

It's hard to quantify the performance of this new controller relative to the PID controller in a way that is simple to understand and appropriate for a post like this. Instead of presenting you, the reader, with a definitive answer for which controller is better, I am providing you with the ability to try out a neural network controller that I have trained. If you scroll back up to the PID controller app and click the NN button, you will switch the algorithm from PID to NN (neural network). See how well it can keep the arm horizontal under various conditions. The neural network controller isn't doing any learning in this app; it is only applying lessons it has been taught beforehand. I trained it on a simulation almost identical to the one in the app here, and ran it through about 10k learning iterations to settle on an answer. Running on a single core of my desktop, the training procedure took around 5 minutes.

As you may find, the new controller performs just as well as the PID controller at keeping the arm horizontal. When the PID controller is exposed to signal noise, it tends to amplify this noise in the output motor speed. The NN controller seems to react a little nicer to noise, allowing less noise on the final output. The NN controller also seems to be a little lighter on the trigger for correcting sudden jumps in arm angle. The PID controller will respond to a sudden change in state with a sudden change in output, while the NN controller seems to have a smoother response curve.

It's hard to determine exactly why the NN controller acts the way it does, other than to say it has found an optimal solution to the training data I provided. While there is some mystery around how to interpret the non-linear solution the controller has found, I hope this post has cleared up some of the mystery around how the controller operates and how it is able to 'learn'. In the end, it's all about twisting and turning a complicated function into something that best matches the training data.

So what does all of this mean for quadcopters? Should they all implement neural network stability algorithms with the ability to learn from current conditions? The process of learning is fairly computationally intense and can involve quite a few failures, so should they instead use pre-trained networks like in my app? I'm not sure. Neural networks are much more difficult to deal with than standard linear controllers. Not only are they harder to implement and tune, they are harder to understand. I believe that one of the greatest things about the growing quadcopter / UAV field is the lowering investment cost, both monetarily and intellectually. I'm not advocating for everyone to start replacing simple controllers that work pretty well with much more complicated ones that work slightly better, I would leave that to the researchers who are working out how best to implement such complicated systems. Instead, I advocate people using this field as a testbed for their own experiments. There are many levels of complexity to an unmanned vehicle, and a lot can be gained from picking one and thoroughly examining it.

Wednesday, April 30, 2014

Automated Investigator

It's time for a new project post. What have I been doing since I finished the LED Planet Project last month? Mostly the things I actually get paid to do, but also a decent amount of slacking off. Without a project that has been handed to me by someone else as a challenge, I've had time to consider what types of projects I want to do for this blog. A while back, I designed and programmed my own flight controller for a hexacopter platform, so I guess I could say something about that. Unfortunately, that was a fairly lengthy project, so organizing my thoughts on the subject will take some time. I'd also like to do a few new experiments with flight controllers, which will add to the complexity of the write-up.

So while I've been trudging along producing interesting content about flight, I figured I could use a little weekend project to have some fun. Just like my project from last November where I used an accelerometer to measure muscle activation, I decided to limit my supplies to only the hardware I had sitting around my apartment. At the very least, this requirement would help reduce the electronic clutter I've accumulated.

As for the goal of the project, I took a suggestion from my brother. He was brainstorming for an engineering project in college that needed to use physical sensing coupled with advanced computation on a computer. He had the idea of making a lie detector (polygraph), where a set of sensors monitor a human subject and a computer determines whether or not the subject is lying based on the sensor measurements. I decided this was an idea worth stealing. So I chose to make a lie detector.

Now, I don't want you thinking that I would try to one-up my own brother in his college engineering project just because I have the resources and free time. My plan was to boil his idea down to something much simpler yet still retaining the core concept, then post my work on the internet to become famous. The polygraph was invented in the 1920's, so neither of us are doing anything particularly original anyways. But if the last two years of internet trends tells me anything, it's that original content is not required to get pageviews.


How does a polygraph work? The core concept is that a subject under interrogation will have some sort of uncontrollable bodily response when lying. The validity of this statement is debated in the scientific community, but we won't let a bunch of boring scientists stop us here. If the subject believes that there will be some kind of negative repercussion if they to the investigator, their body is likely to act differently when telling the truth or lying. The reactions commonly associated with lying are an increased heart rate, sweating, changes in breathing pattern, eyes moving in a particular direction, etc. A polygraph will measure one or more of these effects in order to determine whether or not the subject has just told a lie.

Heart rate sensors are easy to come by, and almost as easy to make. The transmission of infrared light through your finger or earlobe is modified by the amount of blood being pushed through. If you were to shine infrared light from one side of your finger to an infrared detector placed on the other side, the detector would see a series of pulses in the amount of light detected. The rate of these pulses is your heart rate. Unfortunately, I didn't have any infrared LEDs sitting around, so I didn't use heart rate sensing for my lie detector.

Your body is a decent conductor of electricity, as long as you can get past the skin. Sweat is a good conductor, and can help reduce the electrical resistance of your skin. A skin conductivity sensor applies a positive and negative voltage to two parts of your body and measures the amount of current passing between those two points. The more current that gets through, the less resistance (more conductance) your body has, the more sweat you are assumed to have recently produced. The locations of the electrical contacts can vary, but a common choice is on two adjacent fingers. There isn't a whole lot of specialized equipment needed to make a skin conductivity sensor, so I decided to make one for my lie detector.

There are of course other bodily signals that can be monitored and loosely associated with the act of lying, but many require specialized equipment. For a DIY lie detector, these other methods require significantly more effort than a skin conductivity sensor requires, which is significantly more than I was willing to provide.


I decided to go with the two-finger approach to the skin conductivity sensor, where metal pads are attached to two adjacent fingers. I started by taping some wires to pieces of aluminum foil that could be wrapped around a finger.

Starting real simple here.

That was easy.

Metal conducts electricity. If you attach two metal pads to your fingers with the intention of measuring the resistance between the pads (your hand), you need to make sure the pads on each finger don't touch each other. The current that would be sent through your hand would instead prefer to travel directly between the two touching pads, causing the sensor to see an extremely high conductivity. This would throw off the lie detector and make it think your hand has suddenly turned into metal, which is probably a sign of lying. I needed a way of covering the metal pads to make sure they couldn't touch. I considered using medical tape to keep the pads on, but wanted a less permanent and hair-removing solution.

When I said that I was going to limit my building to only use parts I had lying around my apartment, I was telling the (polygraph approved) truth. What I didn't mention is that I have a 3D printer in my apartment and a few kilograms of printing plastic, so I wasn't really hurting for custom parts. I decided the best way to make sure the pads didn't touch was to print a custom finger-holder with the metal pads taped to the inside. I got to work on the design.

Definitely the hardest project I've done. Yep.

Perfect fit.

I attached the metal pads to the inside walls of each finger and started on the sensor electronics. Some basic resistance measurements using a multimeter tell me that I should expect the resistance between the two pads to be anywhere from 30 to 300 kΩ. Since I was able to vary the measured resistance by more than 50% just by moving the contacts around, I decided to forego any amplifiers in the sensing circuit and just use a voltage divider to get a 0 to 5V signal from the sensor that a simple Arduino controller could read. It may not be the most elegant or sophisticated sensor setup, but that's lie detectors for you.

Data Collection

With an Arduino reading the voltage from the sensor voltage divider, I recorded some data with the finger brace on and tried to manipulate the results with sheer willpower.

On the vertical axis of the previous plot, I've shown the resistance between the finger pads inferred from the voltage measured from the voltage divider. The arrows indicate instances when I attempted to elicit a skin conductivity response without significant movement. Ideally, I would have tested this by lying to an interrogator. But since I had no imposing figures around at the time, I attempted to trigger a response by thinking about stressful things like paying taxes or that dream I had where I showed up at middle school in just my underwear. The plot above shows that I was indeed able to trigger a response.

A stressful event seems to correspond to a decrease in skin resistance that persists long after the stressful event has passed. This is likely due to sweat that accumulates in small amounts underneath the sensor pads and cannot evaporate easily. To allow the lie detector to automatically determine whether or not a lie is being told, I put a condition on the first derivative of the skin. When the resistance dips down at a specified amount per second or more, the code decides I am lying.

Show/Hide Code

So I guess that's it. I set out to build a simple lie detector and I did it using some cheap parts and an Arduino controller. It took maybe 3 hours of work, plus an hour or so of printing time for the fancy finger brace. I've basically taken a project that has been done quite a few times already on the internet and added a dash of 3D printing to make it trendy. Is this really something that deserves a write-up? No, I can do better.

Good Cop, Bad Cop

TV and movies have taught me that anytime a suspect is interrogated, there needs to be two people asking questions. One plays the part of the bad cop who tries to break the suspect's resolve and get to the bottom of things as quickly as possible. The other is the good cop who offers the suspect salvation from the bad cop if they would only give up the truth. In order for my lie detector to be an effective mechanism for divining the truth, it needs a set of interrogators.

But before I dive headfirst into building a pair of human-sized bipedal robots with mustaches, I should try to boil down the idea of the two interrogators to their basic concepts. For example, boiling someone down into their constituent parts might be something the bad cop threatens to do to the suspect. They don't necessarily need to be strong with words, just strong with their hands. In fact, the only thing the bad cop really needs is a hand. To slap people with.

To build the bad cop, I needed a motor, a motor driver, an arm, and a hand. I happened to have a set of decently powerful motors and an accompanying motor driver from an older project of mine where I tried to motorize a large wooden cart. The project was a failure, but I see no need for the motors to live in shame forever. I adapted the old Arduino controller I had used for that project to simultaneously control a motor and read skin conductivity measurements.

The driver could supply enough voltage and current to the motor to make it swing around at a good slapping speed, but I needed a way of attaching an arm. While I've had good luck finding a few motor adapters online to attach wheels and such to motors, I have yet to see a motor adapter for a slapper-arm. So I designed and printed one myself. The arm would be made of two thin plastic rods I had sitting around, so the adapter needed to have two mounting holes for the arm and a set of screw holes for the motor drive shaft.

The strange V-shape is partially due to the limitations of 3D printing. Prints need to be fairly simple, strong, and preferably without any significant overhangs. The two plastic rods for the arm slide into the sets of three holes in the vertical parts, and screws are used to secure the bottom to an existing metal motor adapter.

The wonders of 3D printing.

I then glued the plastic rods in the adapter and glued a glove filled with cotton balls at the end of the rods. Ideally the glove should be filled with sand or rocks or something, but this was just a prototype. I would attach the motor to the wooden frame I made for my LED Planet project, since I'm all about reusing materials. Also, I didn't want to spend time or money on this.

All it needs now is an awesome mustache.

The programming for the bad cop was fairly simple. It would monitor the suspect, wait for a lie to be detected, then slap the suspect. After waiting a second or two, it would repeat this process forever. Testing the speed and intensity of the slapper was easy since it doubled as a high-fiving machine.

Thoughtful Questions

The next step was to create the good cop. This half of the interrogation team needed to offer questions to the suspect in a non-accusing way. It didn't need to sound threatening, as the threat came from a glove being swung around on a wooden frame. Not only did it need a calm, unwavering voice to question the suspect, it also needed to come up with appropriate questions to ask. To accomplish these tasks, I let my laptop play the part of the good cop. It had the processing capacity to form sentences and play them through the built-in speakers. You can't make a powerful investigative duo without subtle non-verbal language, so I used a USB-to-FTDI cable to allow the good cop and bad cop to communicate.

The text-to-speech was done using the free and open source tool eSpeak. In a unix terminal, turning a string of text immediately into audio that plays through your default audio channel is simple:

> espeak "Hello world."

There is an API for eSpeak, so commands like this can be translated into C code. Unfortunately, it's so much easier to just do a system() call and be done with it:

With that, my good cop code could talk in a calm and controlled voice to the suspect. To help placate the suspect, I added a simple visual to the code.

The final step in constructing the good cop program was to give it a method for generating appropriate questions. First, I hand-coded a couple softball questions that the good cop could start out with just to get a baseline reading for the polygraph. Then, I handed the script off to my girlfriend who wrote a few more lines that I couldn't read beforehand. If I knew when the harder questions were coming and what they were, I might have been able to skew the polygraph readings in my favor. I wanted a truly authentic experience, so surprise was important.

Suspects need alibis. The interrogators need to ask for an alibi. It seems common to ask the suspect about their actions on a particular date, so I added an automatic alibi-asker. It picks a random date within some range of years and forms a sentence that asks the suspect about the date.

I'm not entirely sure what instance the interrogation team would be referring to, but you can't go wrong with a random number generator asking the questions. Next was to add a few more human-generated questions that I didn't have to come up with. I needed a large repository of good, thoughtful questions, and the answer was obvious: Yahoo Answers. This is a site that allows users to pose questions to the public and rate submitted answers. It's full of all kinds of wonderful questions, some of which are collected in aggregate sites. I wrote another block of code to load a specified page of Yahoo Answers, parse through it for questions, then ask them sequentially.

Show/Hide Code

With all of the questions queued up, I positioned the good and bad cops for optimal interrogative power. I bravely volunteered myself for questioning by my own creation and recorded the results:

As you can clearly see, the full interrogation system worked perfectly. The skin conductivity sensor provided accurate measurements for the low-level polygraph hardware and it was able to trigger the slapping response at appropriate times.

I think this project has shown that an effective polygraph can be constructed using everyday materials like aluminum foil and 3D printers. I fully support any hobbyist following my lead and using their home-made polygraph to seek out the truth. Just remember, a useful lie detector relies on the understood and sometimes fully-realized threat of physical or emotional harm.

Wednesday, March 19, 2014

LED Planet Software

In my last post, I discussed the process of building a spiral-sphere of LEDs that could be controlled from my laptop. The purpose was to create visuals for each movement of Gustav Holst's The Planets for an upcoming concert of the Boulder Symphony Orchestra. I wrote about the math that went into the design, the process of 3D printing the structure in many pieces, and powering the LED strip that was glued to the outside.

The work required to get this sphere ready for filming did not end there. In fact, building the sphere was fairly simple compared to what followed. While the sphere satisfied all of the physical requirements for the film, there was still a set of requirements on how the sphere would act. I've said that the LEDs would be controlled from my laptop, but in more concrete terms, the list of requirements was:
  • Independent control over each LED from laptop
  • Full-sphere refresh rate of at least 30 Hz
  • Pre-coded, modifiable effects (pulsing, twinkling, etc)
  • Image and video projection capabilities
  • Effect mixing
  • Keyframed effect attributes
Each task on its own was a formidable challenge. Implementing all of these together would require code written on multiple platforms, from a tiny 8-bit microcontroller running at 16 MHz to my workstation-laptop with a 64-bit quad-core processor running at 2.4 GHz. Luckily, every platform I had to work with could be programmed with C, so I could stick to one language throughout the whole project. While the entire code for this post (as well as my last 3 posts) is hosted on github, it's an ugly beast of a code. I'll try to include relevant code examples that have been isolated from the main code and cleaned up for legibility.

I considered splitting this software post into multiple sub-posts talking about the different levels of software I had to write to get everything to work. Instead of doing this, I've decided to lump it all into this post. I figure that it is the collaborative effort of all of these levels that make the software side of this project interesting. But to help organize this post, I've split it up into 5 sections:

Part 1 - Driving the LED Strip
Part 2 - Data Transfer
Part 3 - Parallel Protocol
Part 4 - Simple Effects
Part 5 - Sequenced Effects

As I step through the different levels of the software, I will be talking about how long it takes to run certain segments of code. Since I had a target frame rate that I needed to hit, timing was of utmost importance. A frame rate of 30 Hz means that every bit of software has to work together to produce a new frame in less than 33 milliseconds. To get across the idea of how long various bits of code take relative to 33 ms, I'll be using a basic timing diagram:

The idea here is to visualize how long different blocks of code take by displaying them as, well, blocks. The horizontal bar that contains the red, yellow, orange, and white blocks represents what a given processor is doing. As time passes from left to right, the processor performs different tasks, each colored differently. The lower bar that remains the same color represents the LED strip. It does not do any appreciable data processing, so I'll treat it as a passive item. The vertical arrows from the processor to the strip indicate the flow of data, and are positioned in the center of a code block that handles the transfer. Finally, the vertical lines in the background give the time-scale of things. I have placed these 33 ms apart, since that is our target update time. Their spacing on the screen may change between images as I zoom in and out of different features, but they will always represent the same time step. The goal is to transfer new data to the LED strip at least every 33 ms in order to get an update rate of 30 Hz, so in this kind of diagram we want to see one of those black arrows occur at least as often as the background lines. In the example above, the processor serving data to the LED strip is not fast enough.

I've measured the time needed to run various blocks of code by hooking up a Salea Logic probe to a pin of the relevant computing platform and triggering responses by inserting commands to toggle the pin to a 1 or 0 in the code. In most instances, the toggling takes around 0.0001 milliseconds. Since I'll probably be rounding most timing figures to the nearest millisecond, I've treated the toggling as instantaneous.

Part 1 - Driving the LED Strip

As with many of my projects, I used a flexible strip of WS2812s as my LEDs. These are surface-mount RGB LEDs with on-board memory and driving. Data for what color the LED should be is shifted down the strip to each LED through a single line. The datasheet for these LEDs gives details about how data must be formatted in order to properly communicate with them. The key to reliable data transfer to the strip is precise timing. Luckily, there are a few freely-available libraries on the internet that handle this timing. I happen to prefer the Adafruit Neopixel library, but I have heard that there are others just as good. This library includes sections of Assembly code that are hand-tuned for various architectures and processor speeds, all intended for use on an Arduino-compatible microcontroller. Since I'm just barely comfortable reading Assembly, I'm glad someone else has taken the time to work out this library for others.

For testing, I used my workhorse Arduino Mega. It runs at 16 MHz and has plenty of memory and 5V digital I/O pins. The interface for using the library is very simple:

Show/Hide Code

To reach the target frame rate, I didn't need to be concerned with the time it took to initialize things. All that mattered was how often I could call Since the timing of the data transfer is determined by the strip, and not the processor sending data, updating 233 pixels takes roughly 7 ms. In my timing diagram, the above code looks like:

If all we want to do is keep telling the strip to update, we can get a frame rate of over 130 Hz. Unfortunately, this doesn't allow for changing any of the LED colors ever. Hardly seems worth the frame rate if the image never changes. What we want is for a separate computer to do the complicated calculations that determine what image should be displayed, then the computer sends the LED values to the Arduino Mega via USB. The data is parsed on the Mega and then the LED strip is updated.

Part 2 - Data Transfer

I wrote a short C code that ran on my computer, calculated some LED values, and sent the values out to the Mega. It packaged the information for 233 LEDs in a 2048-byte packet, 5 bytes per LED and some padding on the end. The color for any given LED is specified by 3 bytes, but I included a 'pixel start' byte and a 'pixel index' byte to reduce visual glitches. The USB specification indicates that rounding up to the nearest 64 bytes for my package would have worked, but for some reason the nearest 1024 worked better. The Mega ran a code that waited for incoming serial data from the USB to serial converter, copied the incoming data to a processing buffer, parsed through the buffer to set LED values, then updated the LED strip. The additional code (neglecting some of the elements from the last snippet for brevity) looks like this:

Show/Hide Code

On my laptop, I wrote a short code to begin serial communications, package LED color values into a 2048-byte packet, and send them along. The code to open the serial port in Linux was ripped from the example code here:

Show/Hide Code

The data packaging code assumes it is given a struct called a 'strip' that contains arrays of length NUMPIXELS for the red, green, and blue channels:

Show/Hide Code

The timing diagram for the Mega code and laptop code working together is as follows:

I've added a new bar to show what my laptop was doing. The blue blocks are the USB transfer and the white blocks are delays introduced to keep things synchronized. On the Mega, the orange block still shows updating the LED strip, the yellow block is parsing the data buffer, and the red block is receiving the USB data. As you can see, the data transfer rate is slow compared to the background vertical lines. My Mega was only able to copy data in at around 55 kB/s, resulting in almost 38 ms for just the data transfer. This is close to the peak transfer speeds found in others' experiments. My data parsing code (yellow) also took around 20 ms. Not enough to push the frame rate past 33 ms by itself, but it certainly wasn't helping. At this point, I was updating the strip at 15 Hz. Too slow by a factor of two.

Luckily, I had a solution. If the Mega was too slow, I just had to use a faster microcontroller! I happened to have a few Teensy 3.1s fitting around, which boast USB transfer rates up to 20x faster than the Mega. The Teensy is Arduino-compatible, but uses a slightly different library for driving LED strips such as the one I was working with. Still, the interface was basically the same and it took the same amount of time to update the strip (again, determined by the strip, not the controller). I ported my Mega code over to the Teensy and ran it with the laptop still supplying USB data.

Blindingly fast! Well, at least compared to the sluggish Mega. The code on the Teensy is similar to that running on the Mega, but colored differently. Lime is the strip update, green is the data parsing, and light blue is the data receive. With the higher performance USB communication and 96 MHz clock (versus 16 MHz on the Mega), both the data transfer and the data parsing have sped up immensely. The frame rate here is about 77 Hz, more than twice what I need.

So that's it, right? I can run at 77 Hz, assuming I can get my laptop to churn out a new frame every 13 ms. Unfortunately, no. The Teensy runs at 3.3V as opposed to the Mega's 5V, so the data signal to update the LED strip needs to be buffered. I used an 74HTC245 octal bus transceiver as suggested by this site to bring the 3.3V digital signal up to 5V. It didn't work. For some reason, the LED strip did not like this voltage level. The only way I could get the strip to accept data was to drop the power supply 5V line to around 4.5V, but this was not something I could do continuously due to the nature of my power supply. Without an oscilloscope, it was nearly impossible to determine what the quality of the data line was like. It was entirely possible that the transceiver was giving me a crummy signal, but I had no way of knowing for sure. I was under a strict deadline to get the entire LED system up and running two days after this unfortunate discovery, so I had to consider my options:

 - the Mega could drive the strip, but couldn't get data fast enough
 - the Teensy could get data quickly, but couldn't drive the strip (for unknown reasons)

The answer was suddenly clear: use both! The Teensy would receive data from the laptop, pass it to the Mega, and the Mega would update the strip. The only catch was that I needed a way of passing data from the Teensy to the Mega faster than the Mega could handle USB data.

Part 3 - Parallel Protocol

Let's start by looking at how you can read in data manually to the Mega. Using some bits of non-Arduino AVR-C, the code to quickly read in a single bit of data from a digital I/O pin is:

Show/Hide Code

Not bad for two lines of code. The port values (DDRA, PINA) are found in tables listing the internal registers of the Mega and the Arduino pin mappings. I use these bitwise commands instead of the standard Arduino digitalRead() in order to speed up the process of reading pin values. The above code isn't too useful, because the data being read in one bit at a time is being written to the same place in memory before the previous bit is stored anywhere. But before I go into how to manage data as it arrives, we can look at how to speed up the existing code that reads in data. You may think, how can you possibly speed up a single line of code that probably only takes a handful of clock cycles? Easy:

Show/Hide Code

By removing some of the code, we've sped up data input by a factor of 8. What exactly did I do? I removed the mask that blocked out the other 7 pins available on PORTA, allowing a mapping of 8 digital I/O pins to 8 bits of internal memory. Now the code will read in 8 bits simultaneously every time that line of code is executed. Not a bad improvement. How does this compare to the serial USB communication from before? The average data rate I could get before was 55 KB/s, or roughly 284 clock cycles of the Mega for each byte. The act of grabbing a whole byte of data using this new method takes only a single clock cycle, leading to an impressive 15 MB/s. This is of course, completely unrealistic. The Mega doesn't only have to read the value of the port, but also store the value somewhere in memory. If I want to do something useful with the incoming data, the Mega also has to make sure each new byte of data gets stored in a unique location. Then there is the issue of synchronization with whatever is supplying the incoming data. You wouldn't want to miss an incoming byte of data, or even read the same byte twice before it is updated. Solving each of these issues creates overhead that will drastically slow down the data rate. The hope is that even with appropriate data management and synchronization, this method is still faster than using serial communication.

To manage where each new byte of data goes in the Mega, I used a 2048-byte buffer that could store the entire package sent for each update for the LED strip. As each byte is read in, it is placed in the buffer, and a counter is incremented to keep track of where in the buffer the next byte should go. To handle synchronization, I added two more data pins to the existing 8 pins of this parallel protocol. One pin is the Mega-Ready pin (RDY), which signals to the data source that it is prepared to accept data. The other new pin is the Data-Clock pin (CLK), which the data source uses to tell the Mega that a new byte has been made available and that it should read it in. I used an interrupt on the Mega triggered by the CLK signal to read in a byte. The Mega code to read in data and process it once there is enough looks like this:

Show/Hide Code

I found that the process of jumping to the interrupt routine and running the code within took around 6 microseconds. Assuming the data source can instantaneously provide a new byte once the Mega has finished reading the previous one, this allows for an overall data rate of 162 KB/s. Not nearly as good as the overly-ideal 15 MB/s, but much better (3x) than the original serial 55 KB/s.

Teensy on left, Mega on right. 8 data lines plus RDY and CLK.

The data source for the Mega is the Teensy controller. Since it can handle fast communication with the laptop and has a faster clock speed than the Mega, it handles reading in USB data and splitting each byte up into one bit per pin of the Mega.

Show/Hide Code

Going back to the timing diagrams, we can look at how the laptop, Teensy, and Mega work together to pass data along to the LED strip in an optimal way:

Starting with the top bar, the laptop spends its time computing what color each LED should be (black), sending the 2048-byte packet over USB to the Teensy (purple), and sitting around waiting for everyone else to catch up (white). Next, the Teensy waits for data from the laptop (white), receives and stores the data (blue), then starts arranging the data and sending it on to the Mega using the parallel protocol (green). The Mega receives the parallel data from the Teensy (red), parses it into LED values (yellow), then updates the LED strip (orange). The most important part of this figure is that by using the parallel protocol, the overall time taken to update the LED strip is less than the 33 ms vertical lines. This means the entire system can finally run at faster than the initial goal of 30 Hz.

It's interesting to see the relative time it takes to do various tasks on different computing platforms. The laptop can perform all sorts of complicated mathematical operations to determine what pattern will appear on the sphere in less time than it takes the Mega to just pass on the LED values to the strip. It's also neat to see how each platform independently handles its own share of the work, then joins with another platform to ensure a steady flow of data.

And with that, I had a reliable way of updating every LED on the spiral-sphere at right above 30 Hz. While it may not have been the simplest solution out there, I ended up very fond of this method. It was an excellent exercise in simple data transfer protocols and the limitations of different hardware platforms. For the rest of this post, I will move past the Mega+Teensy hardware and focus only on what the laptop has to do to produce the desired LED colors at each timestep.

Part 4 - Simple Effects

With the communication protocol worked out between each computing platform, my laptop had full control over the color of every LED and could update every value at 30 Hz. All it had to do was produce a properly-formatted 2048-byte packet containing the color values for each LED and send it off to the Teensy once every 33 ms. The next step in this project was to create a series of 'effects' that could be executed on the laptop and would determine what color values to send along at every update. These effects would later be the building blocks of the final visual product, so I needed a fairly general set that could be combined later.

I knew that while most effects would need a common set of parameters to determine their appearance (color, fade), they would often need their own specialized parameters based on the effect (pulsing rate, image location, etc). Instead of coding each effect to have a unique interface, I created a 'handle' struct in the code:

Show/Hide Code

The handle contained all of the parameters that any effect could need. It was then up the the individual effect to use the different values within a handle appropriately. The same handle could be applied to multiple effects, although I rarely did this. Simpler effects would use the fade and color variables, while more complicated ones would use the attr and file arrays for other effect attributes. The 'effect' struct contains miscellaneous variables related to a single effect (start time, which handle to use, unique pixel buffer, etc):

Show/Hide Code

When an effect is run, the appropriate function is called at every time step and passed an effect struct, a handle struct, and the current time so that the effect can know how far along it is.With this fairly straightforward interface worked out, I started creating effects.

Effect: Solid
Show/Hide Code

Set every LED to the same value based on color1 and fade.

Effect: Image
Show/Hide Code

Use projection mapping to display an image from file. The projection method (planar, cylindrical, spherical) and the projection direction are specified in attr.

Effect: Pulse
Show/Hide Code

Smoothly pulse the value of every LED between color1 and color2 at a rate determined by attr.

Effect: Circle
Show/Hide Code

Set LEDs within a certain distance of a specified location on the sphere to color1. The location and size of the circle are specified in attr, as well as a tapering parameter. This effect loads information on the 3D location of each LED from elsewhere.

Effect: Flicker
Show/Hide Code

Every time step, tells some number of LEDs to begin lighting up to either color1 or color2. The number of LEDs to light up per second, the fade in time, and the fade out time are all specified in attr. I've used a Poisson distribution to calculate how many LEDs to light up after every time step.

Effect: Packet
Show/Hide Code

Send a packet of light up or down the LED strip, ignoring the spherical geometry. The values in attr give the packet velocity, origin, and width.

Effect: Ring
Show/Hide Code

Given a start and end LED, lights up LEDs between them with a travelling wave of color. The start LED, end LED, wave speed, and wave length are specified in attr.

I wrote three more effects that were mostly variations on the ones above. There is a Marching effect that is similar to Flicker, but synchronizes the LEDs into two groups instead of letting each be random. The Video effect uses the same projection mapping as Image, but projects a frame of a video based on the elapsed time of the effect. The Single-Line Packet is a slight variation on Packet, where the packet of light is constrained to move along a single loop of the spiral-sphere. These effects were all introduced with the specific intention of using them in in the final video production.

As mentioned at the beginning of this section, each effect that needs to be running is called upon at each time step. If multiple effects are called, their unique pixel buffers are added together (additive blending) to create the final frame that is sent out to the electronics. For the Image and Video effects, I had the code pre-load the appropriate files before starting into the main loop. This reduced the amount of time spent calculating these effects during playback.

Part 5 - Sequenced Effects

With a complete set of effects written up, the next step of this project was to sort out how to create a sequence of effects that could be run as a unit. I not only needed multiple effects to run simultaneously, but to start and stop independent of each other. I also needed to be able to change the attributes within each handle as the sequence progressed. The simplest example of this would be a fade, where the fade attribute in a handle varies smoothly from 0 to 1 over the course of a few seconds, causing an effect to fade into view.

Ideally, I wanted key-framing. Every attribute of every handle would be able to change at any moment, and the changes would be defined by a set of key frames. Effects would be launched at specified times and produce visuals based on these key framed attributes. Since I had two days to set up the framework for this, I went with text-based key framing instead of a GUI. This required creating a 'language' that my code would interpret and turn into key frames. To build the syntax for this language, I wrote out some psuedo-code for how I wanted it to work:

# allow comments starting with a "#"
at 0:00, set fade of handle 1 to 0.0
at 0:00, set color1 of handle 1 to (255, 0, 0)
at 0:00, launch effect "solid" using handle 1
at 0:10, set fade of handle 1 to 1.0

The above example would create a handle with color1 set to red and fade set to 0. The effect Solid would be linked to this handle and launched at the beginning, but since the fade would be set to 0, it would not appear. Over the first ten seconds of the sequence, the fade would gradually increase until it equaled 1.0 (full on) at the end.

This example shows the two main commands I needed in my language; adding a key frame to an attribute of a handle and launching an effect linked to a handle. Shortening the above example to the final syntax I chose:

# comments still work
0:00 h1 fade 0.0
0:00 h1 color1 255 0 0
0:00 effect solid 1
0:10 h1 fade 1.0

Writing the code to parse these commands was fairly straightforward. Lines in a sequence file would start either with a #, indicating a comment, or a time stamp. In the latter case, the next bit of text after the space would either be "effect", indicating the launching of an effect at that time stamp, or "h*", indicating the addition of a key frame for an attribute of a handle. Since string parsing is an ugly matter in C, I'll save you the burden of looking at the code I wrote for this.

An array of empty handles and effects would be created at the start of the main program and filled with information as the sequence file was parsed. To keep track of key frames, the handle struct was modified to include arrays of keys:

Show/Hide Code

At each time step during the execution of the sequence, every attribute of every handle would be updated to reflect the appropriate value based on the nearest key frames:

Show/Hide Code

While this might not be the most efficient method of key framing, I only ever dealt with around 30 handles and key frames. If I had a need for hundreds or thousands, I would have worked some on optimizing this algorithm. I wrote up a quick testing sequence and recorded the result:

Show/Hide Code

Finally, with all of the effects playing nicely inside the sequencer, I began writing up the sequences for each movement of The Planets. I began by translating storyboards made by my collaborator on this project, then filled in gaps and introduced fades between different effects. After about 20 hours of translating pictures to text, I had the sequences written down in my made-up language:

In retrospect, maybe a GUI would have been worth an extra few days of work..

For filming the sphere in action, I would tell the main code to execute one of the sequences. It would pre-load any images and videos, then wait for my command to begin playback. With fabric draped around the sphere and cameras rolling, I would tell the code to play through the sequence. The visuals were completely determined by the written sequences, so we could do multiple takes and get identical results. I could start the sequences at any time stamp and fast-forward through the playback to get a quick preview before we committed to anything.

With around 55 minutes of content to produce, filming took quite a while. Writing sequences for the first time was a burden, but going through and editing them during the filming process was a nightmare. Even answering questions like "why is the sphere blue at 4:31" was difficult due to the nature of the sequencing language. The text-based key framing was a simple solution to a problem I had to solve in a short amount of time, but spending an extra few days implementing a graphical interface would have saved me quite a few headaches.

When the final full-length video is posted online, I will link to it here. For now, I can give you two still shots.

Venus, Bringer of Peace 

In all, I'm happy with the way this project turned out. Some of the technical bits may have been a little rough around the edges, but I've learned that's what happens when you have only a few weekends to get things working and an ideal budget of $0.

Uranus, the Magician

While the Planets project is over, the LED sphere and accompanying electronics are still fully-functional. I'll have to see if I can incorporate them in some other personal project down the road. It might even make for a nice hanging lamp in my apartment.