Say It With Me: Aliasing

Suppose you take a few measurements of a time-varying signal. Let’s say for concreteness that you have a microcontroller that reads some voltage 100 times per second. Collecting a bunch of data points together, you plot them out — this must surely have come from a sine wave at 35 Hz, you say. Just connect up the dots with a sine wave! It’s as plain as the nose on your face.

And then some spoil-sport comes along and draws in a version of your sine wave at -65 Hz, and then another at 135 Hz. And then more at -165 Hz and 235 Hz or -265 Hz and 335 Hz. And then an arbitrary number of potential sine waves that fit the very same data, all spaced apart at positive and negative integer multiples of your 100 Hz sampling frequency. Soon, your very pretty picture is looking a bit more complicated than you’d bargained for, and you have no idea which of these frequencies generated your data. It seems hopeless! You go home in tears.

But then you realize that this phenomenon gives you super powers — the power to resolve frequencies that are significantly higher than your sampling frequency. Just as the 235 Hz wave leaves an apparent 35 Hz waveform in the data when sampled at 100 Hz, a 237 Hz signal will look like 37 Hz. You can tell them apart even though they’re well beyond your ability to sample that fast. You’re pulling in information from beyond the Nyquist limit!

This essential ambiguity in sampling — that all frequencies offset by an integer multiple of the sampling frequency produce the same data — is called “aliasing”. And understanding aliasing is the first step toward really understanding sampling, and that’s the first step into the big wide world of digital signal processing.

Whether aliasing corrupts your pristine data or provides you with super powers hinges on your understanding of the effect, and maybe some judicious pre-sampling filtering, so let’s get some knowledge.

Aliasing

In some sense, aliasing is all in your mind. When you took your 100 Hz samples, you were probably looking for some relatively low frequency. Otherwise you would have sampled faster, right? But how is Mother Nature supposed to know which frequencies you want to measure? She just hands you the instantaneous sum of all voltage signals at all frequencies from DC to daylight, and leaves you to sort it out.

If you add together 35 Hz and 135 Hz waveforms, the resulting analog waveform will have twice the amplitude at the sampling points that correspond to 100 Hz. And when you sample, you get exactly the right value. Why did you expect otherwise?

The why is because when you think of the sampled values, you’re fooling yourself into thinking that you’ve seen the whole picture rather than just a few tiny points in time, with no data in between. But in principle, anything can be happening to the signal between samples. We just choose to use the simplest (lowest frequency) interpretation that will fit.

Not only does this seem reasonable, but this is also deeply ingrained in human physiology, so there’s no use fighting it. When you watch a Western, and the stagecoach accelerates so that the spokes in its wheels just match the film’s frame rate, you see the wheels as stopped because each frame was taken when the next spoke was in the same position as in the previous frame. As it speeds up further, you even think you see the wheel turning backwards. The illusion that sampled data comes from the “obvious” underlying signal is strong.