Skip to main content

The Kalman Filter. Helping Chickens Cross the Road.

 https://mathvoices.ams.org/featurecolumn/2022/02/01/the-kalman-filter-helping-chickens-cross-the-road/

 

 

The Kalman Filter. Helping Chickens Cross the Road.

David Austin
Grand Valley State University

I was running some errands recently when I spied the Chicken Robot ambling down the sidewalk, then veering into the bike lane. A local grocery chain needs to transport rotisserie chickens from one of its larger stores to its downtown market. Their solution? A semi-autonomous delivery vehicle that makes the three-mile journey navigating with GPS.

I was curious how this worked. The locations provided by most GPS systems are only accurate to within a meter or so. How could those juicy chickens know how to stay so carefully in the center of the bike lane?

A typical solution to problems like this is an elegant algorithm, known as Kalman filtering, that’s embedded in a wide range of technology that most of us frequently either use or benefit from in some way. The algorithm allows us to efficiently combine our expectations about the state of a system, the chickens’ physical location, for instance, with imperfect measurements about the system to develop a highly accurate picture of the system.

This column will describe some simple versions of Kalman filtering, the main observation that makes it work, and why it’s such a great idea.


A first example

Let’s begin with a simple example. Suppose we would like to determine the weight of some object. Of course, any scale is imperfect so we might weigh it repeatedly on the same scale and find the following weights, perhaps in grams.

50.4 46.0 46.1 45.1 48.9 42.6 50.7 45.7 47.8 46.7

As the result of each weighing is recorded, we update our estimate of the weight as the average of all the measurements we’ve seen so far. That is, after obtaining

, the weight, we could estimate the object’s weight by finding the average

The result is shown below.

As the next weight

is recorded, we update our estimate:
This expression tells us how each new measurement influences the next estimate. More specifically, the new estimate is obtained from the previous estimate by adding in . Notice that this term is proportional to the difference between the new measurement and the previous estimate with the proportionality constant being . We call the Kalman gain; the fact that it decreases as we include more measurements reflects our increasing confidence in the estimates so that, as

increases, new measurements have less effect on the updated estimates.

Suppose now that we have some additional information about the accuracy of our measurements

. For example, if is the chickens’ location reported by a GPS system, the true location is most likely within a meter or so. More generally, we might imagine that the true value is normally distributed about with standard deviation

.

Returning to our weighings, we could perhaps estimate

by repeatedly weighing an object of known weight. The probability that the true value is near is given by the distribution
Let’s also suppose that we have some estimate of the uncertainty of our estimates . In particular, we’ll assume the probability that the true value is near is given by the normal distribution having mean and standard deviation :

We’ll describe how we can estimate

a bit later.

These distributions could be as shown below.

Now here’s the beautiful idea that underlies Kalman’s filtering algorithm. Both distributions

and describe the location of the true value we seek. We can combine them to obtain a better estimate of the true value by multiplying them. That is, we combine or “fuse” these two distributions using the product

Since the product of two Gaussians is a new Gaussian, we obtain, after normalizing, a new normal distribution that better reflects the true value. In our example, the product

is shown in green.

More specifically, the product of the two distributions is


Expanding the quadratics in the exponents and recombining leads to the new normal distribution

where

Notice that this expression is similar to the one we found when we updated our estimates by simply taking the average of all the measurements we have seen up to this point. The Kalman gain is now
In addition, the product of the Gaussians leads to the new standard deviation
Notice that the Kalman gain satisfies . In the case that , we feel much more confident in our estimate than our new measurement . Therefore, and so

, reflecting the fact that the new measurement has little influence on our next estimate.

However, if

, we feel much more confident in our new measurement than in our estimate . Then and

.

In both cases, the uncertainty in our new estimate, as measured by

, has decreased.

This leads to the following algorithm:

  1. Initialize the first estimate

and the uncertainty arrives, update the estimate nd uncertainty

as follows:

  • Find the Kalman gain

. Since
    • , the uncertainty will continually decrease, which makes the algorithm relatively insensitive to the initialization.

To illustrate, suppose that we make many measurements of a known weight with our scale and determine that the standard deviation of its measurements is constant

. Initializing so that and

, we have the following sequence of estimates along with the decreasing
sequence of uncertainties.


Tracking a dynamic system

In our first example, the quantity we’re tracking, the weight of some object, doesn’t change. Let’s now consider a dynamic process where the quantities are changing. For instance, suppose we’re tracking the height and vertical velocity of a moving object. The true height of the object could look like this, although
that true height is not known to us.

In addition to recording the height

, we will also track the vertical velocity so that the state of the object at some time is given by the vector


The reason for writing the subscript in this particular way will become clear momentarily.

Moreover, we will assume that there is some uncertainty in both the position and the velocity. Initially, these uncertainties may be uncorrelated with one another


so that we arrive at a Gaussian


that describes the multivariate normal distribution of the true state. A more convenient expression for this Gaussian uses the covariance matrix

so that the Gaussian is defined in terms of the quadratic form associated to :


The following figure represents the value of this distribution by shading more likely regions more darkly.

Now if we know

, the state at some time, we can predict the state at a later time using some assumption about the system. For instance, we might assume, after time units have passed, that the new state is where

That is, we assume that the height has increased at the constant velocity given by the state vector and that the velocity has remained constant. More succinctly, if

we can write

The subscript

reflects the fact that we are extrapolating the next state using information only from the previous state.

This is a particular model we are choosing; in other situations, another model may be more appropriate. For instance, if we have information about the object’s acceleration, we may want to incorporate it. In any case, we assume there is a matrix

from which we extrapolate the next state:
Of course, uncertainty in the state will translate into uncertainty in the extrapolated state . It is straightforward to verify that the covariance matrix is transformed as


For instance, if the uncertainties in position and velocity are initially uncorrelated, they may become correlated after the transformation, which is to be expected.


These two transformations form the extrapolation phase of the algorithm:
Next, suppose we have a new measurement whose uncertainty is described by the covariance matrix . We imagine that the normal distribution centered on and with covariance matrix

describes the distribution of the true state.

As before, we will fuse our predicted state

with the measured state by multiplying the two normal distributions and rewriting as a single Gaussian. With some work, one finds the expression for the Kalman gain

which should be compared to our earlier expression.

We also obtain the updated state and covariance matrix

So now we arrive at a new version of the algorithm:

  1. Initialize the initial state

and covariance matrix

Use the result to find the Kalman gain

Use the Kalman gain to fuse the predicted state with the measured state and update the covariance matrix.

Let’s see how this plays out in an example. Imagine we are tracking the height and vertical velocity of an object whose true height as is shown.

Remember that we don’t know the true height, but we do have some noisy measurements that reflect the object’s height and its velocity.

Applying the Kalman filtering algorithm naively, we obtain the green curve describing the object’s height. Notice how there is a time lag between the filtered height and the true height.

What’s the problem here? Clearly, the object is experiencing a significant amount of acceleration, which is not built into the model. As we saw in our earlier static example, the uncertainty in the estimated state

continually decreases, which means the confidence we have in our estimates grows and causes us to downplay the new measurements.

There are several ways to deal with this. If we have measurements about the acceleration, we could rebuild our model so that the state vector

includes the acceleration in addition to the position and velocity. Alternatively, if we have information about how an operator is controlling the object we’re tracking, we could build it into the extrapolation phase using the update

where is a vector describing some additional control and

is a matrix describing how this control feeds into the extrapolated state. Clearly, we need to know more about the system to incorporate a term like this.

Finally, we can simply build additional uncertainty into the model by adding it into the extrapolation phase. For instance, we could define


where is a covariance matrix known as process noise and represents our way of saying there are additional influences not incorporated in the extrapolation model:

.

Adding some process noise into the extrapolation phase prevents us from becoming overly confident in the extrapolated states and continuing to give sufficient weight to new measurements. This leads to the filtered state as shown below, which is clearly much better than the measured signal.



Summary

Kalman developed this algorithm in 1960, though it seems to have appeared earlier in other guises, and found a significant early use in the Apollo guidance computer. Indeed, the algorithm is well suited for this application. While the guidance computer was a marvel of both hardware and software engineering, its memory and processing power were modest by our current standards. As the algorithm only relies on our current estimate of the state, its demands on memory are slight, and the computational complexity is similarly small.

In addition to being fast and efficient, the algorithm is also optimal in the sense that, under certain assumptions on the system being modeled, the algorithm has the smallest possible expected error obtained from a given set of measurements.

Kalman filtering is now ubiquitous in navigation and guidance applications. In fact, it is used to smooth the motion of computer trackpads so you may have used it while reading this article. If you are driving while navigating with an app like Google Maps, you may notice the effect of the algorithm when you come to a stop at a traffic light. It sometimes happens that your location continues with constant speed into the intersection and then quickly snaps back to your actual location. The extrapolation phase of the algorithm would lead us to believe that we continue with constant speed into the intersection before new measurements pull the extrapolated locations back to our true location.

Finally, the Chicken Robot needs your help to find a new name.
  • . We could use the first measurement as the first estimate or we could simply make a guess for the estimate and choose a large uncertainty.

  • Each time a new measurement

  • .

  • Update our estimate

  • .

  • Update our uncertainty

  • .

  • As new measurements become available, repeat the following steps:

    • Form the extrapolated state and covariance:

  •  

    Comments

    Popular posts from this blog

    The Difference Between LEGO MINDSTORMS EV3 Home Edition (#31313) and LEGO MINDSTORMS Education EV3 (#45544)

    http://robotsquare.com/2013/11/25/difference-between-ev3-home-edition-and-education-ev3/ This article covers the difference between the LEGO MINDSTORMS EV3 Home Edition and LEGO MINDSTORMS Education EV3 products. Other articles in the ‘difference between’ series: * The difference and compatibility between EV3 and NXT ( link ) * The difference between NXT Home Edition and NXT Education products ( link ) One robotics platform, two targets The LEGO MINDSTORMS EV3 robotics platform has been developed for two different target audiences. We have home users (children and hobbyists) and educational users (students and teachers). LEGO has designed a base set for each group, as well as several add on sets. There isn’t a clear line between home users and educational users, though. It’s fine to use the Education set at home, and it’s fine to use the Home Edition set at school. This article aims to clarify the differences between the two product lines so you can decide which...

    Let’s ban PowerPoint in lectures – it makes students more stupid and professors more boring

    https://theconversation.com/lets-ban-powerpoint-in-lectures-it-makes-students-more-stupid-and-professors-more-boring-36183 Reading bullet points off a screen doesn't teach anyone anything. Author Bent Meier Sørensen Professor in Philosophy and Business at Copenhagen Business School Disclosure Statement Bent Meier Sørensen does not work for, consult to, own shares in or receive funding from any company or organisation that would benefit from this article, and has no relevant affiliations. The Conversation is funded by CSIRO, Melbourne, Monash, RMIT, UTS, UWA, ACU, ANU, ASB, Baker IDI, Canberra, CDU, Curtin, Deakin, ECU, Flinders, Griffith, the Harry Perkins Institute, JCU, La Trobe, Massey, Murdoch, Newcastle, UQ, QUT, SAHMRI, Swinburne, Sydney, UNDA, UNE, UniSA, UNSW, USC, USQ, UTAS, UWS, VU and Wollongong. ...

    Logic Analyzer with STM32 Boards

    https://sysprogs.com/w/how-we-turned-8-popular-stm32-boards-into-powerful-logic-analyzers/ How We Turned 8 Popular STM32 Boards into Powerful Logic Analyzers March 23, 2017 Ivan Shcherbakov The idea of making a “soft logic analyzer” that will run on top of popular prototyping boards has been crossing my mind since we first got acquainted with the STM32 Discovery and Nucleo boards. The STM32 GPIO is blazingly fast and the built-in DMA controller looks powerful enough to handle high bandwidths. So having that in mind, we spent several months perfecting both software and firmware side and here is what we got in the end. Capturing the signals The main challenge when using a microcontroller like STM32 as a core of a logic analyzer is dealing with sampling irregularities. Unlike FPGA-based analyzers, the microcontroller has to share the same resources to load instructions from memory, read/write th...