One Sided Derivative Estimation

Estimation using ad-hoc smoothers

In the previous section, derivative estimation strategies using classic "stateful" frequency-selective filters were found to fail, primarily because of excessive "phase shifts" that led to poor accuracy.

There are other sorts of filters that do not suffer from the excessive delays. Notable among these is the classic Kalman Filter. However, to set up a Kalman Filter you must already know the derivatives. [1] After that, calculating derivative estimates would be pointless.

There is another type of filtering that does not require developing a detailed state equation model or having advance knowledge of the derivative values. It could be described as "cheap rather than good." However, plenty of set-up effort is still required. That means it is applicable only under the following conditions:

  1. As with other "stateful" filtering options, it is feasible to process the data in sequential order.
  2. There must be enough subsequent opportunity to apply the analysis so that all of the initial effort is justified.
  3. Performance guarantees of "it might be good enough" are sufficient.


The filtering strategy: first cut (not too smart)

We will think of the model as a mathematical filtering operation that takes in the sequence of observed function values, contaminated with noise, and generates as an output a corresponding sequence of values that match the original input sequence except with the random noise removed. That purpose is only approximately achievable even under ideal conditions.

We can think of the sequence of noisy input values as being a time sequence. Even if the data do not arrive from a true time-dependent process, the processing is done one value at a time as if at regular time intervals.

Let us start with a highly oversimplified dynamic model for the filter. This first model says "the dynamic process that generates the data sequence is constant." The internal position state of the model determines the output value that the filter will generate.

    position prediction = model's position state 

The severely limited capability of this dynamic model (lacking any dynamics) should be obvious. The model does not have any way to know which output level to predict before observing some actual data. After seeing an input value, it will be possible to observe the discrepancy between the constant model predictions and the input value. This comparison provides information about what the prediction should have been. Adjusting the supposedly-constant "position" state of the dynamic model can make it more like the input value, less like the previous prediction, and this should result in a better prediction during the next step. However, because the input data sequence is noisy, no individual input value conclusively determines what the model's true internal "position" state should be.

Let the discrepancy between the model's prediction and the actual data sequence be

    Δ == (input function value) - (model prediction) 

We can correct the model's discrepancy by adjusting the internal model state as follows:

    revised prediction = (previous position state) + α · Δ  

where the α parameter is to be determined. If a parameter value of 0.0 is chosen, the model remains oblivious to what the data sequence is doing, which is useless. If a parameter value of 1.0 is chosen, the discrepancy is corrected completely, but all of the noise of the input data is regenerated as well, so this accomplishes no useful purpose.

A positive parameter value between 0.0 and 1.0 provides a partial correction. Several cycles of inputs and updates are necessary for the model predictions to arrive close to the correct values, but the noise disturbances are significantly diminished, because the outputs depend much less on each individual value and much more on the accumulated state information. This is the tradeoff: how close you want to track the data sequence, versus how much noise to you want to tolerate.


A more reasonable dynamic model

A "dynamic model" with a constant output isn't particularly relevant to the case of estimating non-zero derivatives. So the model needs to expand to account for input sequences that change.

In a second version of the dynamic model, assume that the output sequence produced by the filter model is a combination of a constant "current position" state plus an adjustment coming from a second model state that we can describe as velocity, predicting that position changes are approximately constant over time. For this new model, the output prediction is:

    constant model state <== constant model state + velocity model state
    initial predicted output = constant model state 

In other words, the value of the "constant" model state is incrementally adjusted before each new prediction is generated.

But this is also insufficient. The actual data sequence is highly unlikely to change at the same constant rate forever. Once again there will be a discrepancy between the model predictions and the actual data sequence. Tracking can be improved by adjusting both the "position" and the "velocity" states of the model.

    new velocity state = (previous velocity state) + β · Δ 
    new position state = (previous position state) + α · Δ
    adjusted output = new position state 

where now both α and β are parameters to be determined.

The dynamic system should now track the input data sequence much better, but adjustments will still be somewhat noisy.


Finally, a somewhat reasonable dynamic model

The same idea can be applied again, to allow for a derivative that is expected to change, introducing a third dynamic model scheme with a new acceleration state. The model's "velocity" and "position" states are both incrementally adjusted before predicting the next output value.

    velocity model state <== velocity model state + acceleration model state
    position model state <== position model state + velocity model state
    initial predicted value = position model state

Then, the state values are adjusted incrementally according to the amount of prediction error observed.

    Δ <== (initial predicted value) - (model prediction) 
    new position state = (previous position state) + α · Δ
    new velocity state = (previous velocity state) + β · Δ
    new acceleration state = (previous acceleration state) + γ · Δ
    adjusted output = new position state 

This filtering strategy is called Alpha-Beta-Gamma filtering,[2] named for the three artificial parameter values to be determined in the dynamic model.


How do you determine model parameters?

Well, there's the rub. There is no process guaranteed to lead you efficiently to a set of "best parameter values." And there are no rules about what "best" means, except for what you like, and that is highly subjective.

The best tuning strategy is "cut and try," since you really don't have any other options. Set the alpha, beta, and gamma variables all equal to zero. Start with the following guesses for the filter's three internal state variables.

  1. Position state: set equal to the first value in the input data set.
  2. Velocity state: set equal to 1/4 of the net value change across the first five values of the input data set.
  3. Acceleration state: set to zero.

Now pump your input data stream through the A-B-G filter equations and record the sequence of output predictions. Here is an example of what the processing script might look like.

% Calculate smoothed function and derivative estimate values
for k = 1:nstep
  % Initial projection by model
  velocity = velocity + accel;
  position = position + velocity;

  % State correction by A-B-G strategy
  delta = funcvalue(k)-position;
  accel = accel + pgamma*delta;
  velocity = velocity + pbeta*delta;
  position = position + palpha*delta;

  % Save history of position and velocity values
  preval(k) = position;
  prederv(k)= velocity;

Make a plot superimposing the predicted values on top of the noisy input values. Initially, there is almost no relationship.

Now, start adjusting the parameter in small increments, one at a time. Look for the output sequence to become more and more like the input sequence except with obvious noise artifacts removed. After each adjustment, run your data sequence through the A-B-G filter again. You might try adjustments of 0.02 for the α parameter, adjustments of 0.01 for the β parameter, adjustments of 0.002 for the γ parameter. When a change seems to be helpful, try the same adjustment again. Plot the sequence of "position" and "velocity" estimates to verify that they are not too noisy. Continue until you find a parameter set that you think works the best.

You will need to "tune" your parameter set in this manner for each new class of curves that you need to track.


And the reason for all of this is...?

If you are able to find a dynamic model that adequately tracks the original input data sequence you need to analyze, and is suitably insensitivity to noise, then the sequence of velocity state values within the dynamic model is the corresponding sequence of estimates for the derivative of the input sequence.

This is a computationally very efficient derivative estimator, but with no guarantees about how good or bad your results are going to be.


Let's try it

This idea is now applied to the same sequence of noisy data used for experiments with central derivative estimators. The following parameters were selected for the alpha-beta-gamma filter. You might prefer a different tradeoff between accuracy and noise rejection.

  palpha =  0.20000
  pbeta  =  0.16000
  pgamma =  0.014000

Using this set of parameters, here was the tracking of the noisy input data sequence. The blue curve is filtered values, while green is the noise-free mathematically ideal values.

noisy original and smoothed data sequences

And now, by extracting the sequence of internal "velocity" state values from the smoothing filter, here is the sequence of derivative estimates (shown in blue), compared to "mathematically exact" derivative values uncorrupted by noise (shown in red).

estimated and ideal derivative sequences

You can see an initial transient response from the filter. Several initial updates are required for this to settle out. After that, there is very little phase error in the tracking. Tracking is not particularly good at points of rapid change in the derivative value, indicating that the response bandwidth is limited. If these problems are not onerous, and the derivative estimates seem acceptable to you, you will obtain results with minimal computation and minimal delay.



[1] For background information about Kalman Filters, you can study the Wikipedia article Kalman Filters, or explore Developing Models for Kalman Filters on this site.

[2] For more information about alpha-beta and alpha-beta-gamma filters, check Wikipedia article Alpha Beta Filter, or explore Minimalist Observer: the Alpha Beta Filter on this site.


Contents of the "Numerical Estimation of Derivatives" section of the website, including this page, are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise specifically noted. For complete information about the terms of this license, see The license allows usage and derivative works for individual or commercial purposes under certain restrictions. For permissions beyond the scope of this license, see the contact information page for this site.