# One Sided Derivative Estimation

## Estimation using ad-hoc smoothers

In the previous section, derivative estimation strategies using classic "stateful" frequency-selective filters were found to fail, primarily because of excessive "phase shifts" that led to poor accuracy.

There are other sorts of filters that do not suffer from these
excessive delays. Notable among them is the classic Kalman Filter.
However, to set up a Kalman Filter you must already know a state model
for the process, including the very derivatives you are trying to
estimate.^{[1]} At that point, calculating derivative
estimates would be pointless.

There is another type of filtering that does not require developing a detailed state equation model or having advance knowledge of the derivative values. It could be described as "cheap rather than good." However, plenty of set-up effort is still required. That means it is applicable only under the following conditions:

- As with other "stateful" filtering options, it is feasible to process the data in sequential order.
- There must be enough subsequent opportunity to apply the analysis so that all of the initial effort is justified.
- Performance guarantees of "it might be good enough" are sufficient.

## The filtering strategy: first cut (not too smart)

We will think of *the model* as a mathematical filtering operation
that takes in the sequence of observed function values, contaminated with
noise, and generates as an output a corresponding sequence of values that
match the original input sequence except with the random noise removed.
That purpose is only approximately achievable even under
ideal conditions.

We can think of the sequence of noisy *input* values
as being a *time sequence*. Even if the data do not arrive from
a true time-dependent process, the processing is done one value at
a time as if at regular time intervals.

Let us start with a highly oversimplified dynamic model for the filter.
This first model says "the dynamic process that generates the data sequence
is constant." The internal *position state* of the model determines
the output value that the filter will generate.

    position prediction = model's position state

The severely limited capability of this dynamic model (lacking any
dynamics) should be obvious. The model does not have any way to know
**which** output level to predict before observing some actual data.
After seeing an input value, it will be possible to observe the
discrepancy between the constant model predictions and the input value.
This comparison provides information about what the prediction
*should have been*. Adjusting the supposedly-constant
"position" state of the dynamic model can make it more like the input
value, less like the previous prediction, and this should result in a
better prediction during the next step. However, because the input
data sequence is noisy, no individual input value *conclusively*
determines what the model's true internal "position" state should be.

Let the discrepancy between the model's prediction and the actual data sequence be

    Δ = (input function value) - (model prediction)

We can correct the model's discrepancy by adjusting the internal model state as follows:

    new position state = (previous position state) + α · Δ

where the α parameter is *to be determined*. If a parameter
value of `0.0` is chosen, the model remains oblivious to
what the data sequence is doing, which is useless. If a parameter value of
`1.0` is chosen, the discrepancy is corrected completely, but
all of the noise in the input data is reproduced as well, so this
accomplishes no useful purpose.

A positive parameter value between `0.0` and `1.0`
provides a partial correction. Several cycles of inputs and updates
are necessary for the model predictions to arrive close to the correct
values, but the noise disturbances are significantly diminished, because
the outputs depend much less on each individual value and much more on
the accumulated state information. This is the tradeoff: how closely
you want to track the data sequence versus how much noise you are willing
to tolerate.
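
To make that tradeoff concrete, here is a minimal Python sketch of the constant-model filter. (The article's own script, shown later, is MATLAB-style; the function and variable names here are purely illustrative.)

```python
import random

def constant_model_filter(inputs, alpha):
    """Smooth a noisy sequence using the constant-position model.

    Each step the filter predicts its current position state, observes
    the discrepancy against the input, and applies a partial correction
    of alpha times that discrepancy.
    """
    position = inputs[0]              # crude initial guess for the state
    outputs = []
    for value in inputs:
        delta = value - position      # discrepancy: input minus prediction
        position = position + alpha * delta   # partial correction
        outputs.append(position)
    return outputs

# A constant level of 5.0 buried in uniform noise: a small alpha removes
# much of the noise at the cost of responding slowly to real changes.
random.seed(1)
data = [5.0 + random.uniform(-0.5, 0.5) for _ in range(200)]
smoothed = constant_model_filter(data, alpha=0.1)
```

With `alpha=0.1` the smoothed output hugs the true level of 5.0 far more tightly than the raw data, illustrating why a partial correction is preferable to either extreme.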

## A more reasonable dynamic model

A "dynamic model" with a constant output isn't particularly relevant to the case of estimating non-zero derivatives. So the model needs to expand to account for input sequences that change.

In a second version of the dynamic model, assume
that the output sequence produced by the filter model is a combination
of a constant "current position" state **plus** an adjustment coming
from a second model state that we can describe as *velocity*,
predicting that position changes are approximately constant over time.
For this new model, the output prediction is:


    constant model state <== constant model state + velocity model state
    initial predicted output = constant model state

In other words, the value of the "constant" model state is incrementally adjusted before each new prediction is generated.

But this is also insufficient. The actual data sequence is highly unlikely to change at the same constant rate forever. Once again there will be a discrepancy between the model predictions and the actual data sequence. Tracking can be improved by adjusting both the "position" and the "velocity" states of the model.

    new velocity state = (previous velocity state) + β · Δ
    new position state = (previous position state) + α · Δ
    adjusted output = new position state

where now both α and β are parameters *to be determined*.

The dynamic system should now track the input data sequence
*much better*, but adjustments will still be somewhat noisy.
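
Under the same illustrative naming assumptions as before, this two-state (alpha-beta) scheme can be sketched in Python. On a noise-free ramp, the velocity state should settle at the true slope once the transient dies out:

```python
def alpha_beta_filter(inputs, alpha, beta):
    """Track a sequence with a position state plus a velocity state."""
    position = inputs[0]
    velocity = 0.0
    positions, velocities = [], []
    for value in inputs:
        position += velocity           # predict: position advances by velocity
        delta = value - position       # discrepancy against the new input
        velocity += beta * delta       # partial corrections to both states
        position += alpha * delta
        positions.append(position)
        velocities.append(velocity)
    return positions, velocities

# A noise-free ramp rising 0.5 per step: after a start-up transient,
# the velocity state converges to the slope and the tracking lag dies out.
ramp = [0.5 * k for k in range(300)]
positions, velocities = alpha_beta_filter(ramp, alpha=0.2, beta=0.05)
```

Because the model now carries a velocity state, it tracks a steadily changing input with no steady-state lag, which the constant-only model could never do.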

## Finally, a somewhat reasonable dynamic model

The same idea can be applied again, to allow for a derivative that
is expected to change, introducing a third dynamic model
scheme with a new *acceleration state*. The model's
"velocity" and "position" states are both incrementally adjusted before
predicting the next output value.

    velocity model state <== velocity model state + acceleration model state
    position model state <== position model state + velocity model state
    initial predicted value = position model state

Then, the state values are adjusted incrementally according to the amount of prediction error observed.

    Δ = (input function value) - (initial predicted value)
    new position state = (previous position state) + α · Δ
    new velocity state = (previous velocity state) + β · Δ
    new acceleration state = (previous acceleration state) + γ · Δ
    adjusted output = new position state

This filtering strategy is called *Alpha-Beta-Gamma*
filtering,^{[2]} named for the three artificial
parameter values *to be determined* in the dynamic model.

## How do you determine model parameters?

Well, there's the rub. There is no process guaranteed to lead you efficiently to a set of "best parameter values." And there are no rules about what "best" means, except for what you like, and that is highly subjective.

The best tuning strategy is "cut and try," since you really don't have any other options. Set the alpha, beta, and gamma variables all equal to zero. Start with the following guesses for the filter's three internal state variables.

- Position state: set equal to the first value in the input data set.
- Velocity state: set equal to 1/4 of the net value change across the first five values of the input data set.
- Acceleration state: set to zero.
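
Those starting guesses can be written out as a small Python helper (the name `init_states` is made up for this sketch):

```python
def init_states(data):
    """Initial state guesses for the A-B-G filter, per the recipe above."""
    position = data[0]                     # first value in the data set
    velocity = (data[4] - data[0]) / 4.0   # five values span four steps
    accel = 0.0
    return position, velocity, accel

# For example, a sequence starting 2.0, 2.5, 3.1, 3.4, 4.0 yields an
# initial velocity guess of (4.0 - 2.0) / 4 = 0.5 per step.
states = init_states([2.0, 2.5, 3.1, 3.4, 4.0])
```

Dividing by 4 rather than 5 is deliberate: five consecutive values span only four sampling intervals, so this gives the average change per step.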

Now pump your input data stream through the A-B-G filter equations and record the sequence of output predictions. Here is an example of what the processing script might look like.

    % Calculate smoothed function and derivative estimate values
    for k = 1:nstep
        % Initial projection by model
        velocity = velocity + accel;
        position = position + velocity;
        % State correction by A-B-G strategy
        delta = funcvalue(k) - position;
        accel = accel + pgamma*delta;
        velocity = velocity + pbeta*delta;
        position = position + palpha*delta;
        % Save history of position and velocity values
        preval(k)  = position;
        prederv(k) = velocity;
    end
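
As a cross-check, here is a Python rendering of the same update loop (a sketch, with variable names mirroring the script above). Fed a noise-free ramp, the recorded velocity history should converge to the ramp's slope:

```python
def abg_filter(funcvalue, palpha, pbeta, pgamma, position, velocity, accel):
    """Alpha-Beta-Gamma filter loop; returns position and velocity histories."""
    preval, prederv = [], []
    for value in funcvalue:
        # Initial projection by model
        velocity += accel
        position += velocity
        # State correction by A-B-G strategy
        delta = value - position
        accel += pgamma * delta
        velocity += pbeta * delta
        position += palpha * delta
        # Save history of position and velocity values
        preval.append(position)
        prederv.append(velocity)
    return preval, prederv

# Noise-free ramp rising 0.5 per step, with deliberately poor initial
# velocity and acceleration guesses:
ramp = [0.5 * k for k in range(400)]
preval, prederv = abg_filter(ramp, 0.20, 0.16, 0.014, ramp[0], 0.0, 0.0)
```
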

Make a plot superimposing the predicted values on top of the noisy input values. Initially, there is almost no relationship.

Now, start adjusting the parameters in small increments, one at a time. Look for the output sequence to become more and more like the input sequence, except with the obvious noise artifacts removed. After each adjustment, run your data sequence through the A-B-G filter again. You might try increments of 0.02 for the α parameter, 0.01 for the β parameter, and 0.002 for the γ parameter. When a change seems helpful, try the same adjustment again. Plot the sequences of "position" and "velocity" estimates to verify that they are not too noisy. Continue until you find the parameter set that you think works best.

You will need to "tune" your parameter set in this manner for each new class of curves that you need to track.

## And the reason for all of this is...?

If you are able to find a dynamic model that adequately tracks the
original input data sequence you need to analyze, and is suitably
insensitive to noise, then the sequence of *velocity state*
values within the dynamic model is the corresponding sequence of
estimates for the derivative of the input sequence.

This is a computationally very efficient derivative estimator, but with no guarantees about how good or bad your results are going to be.
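
To see this in action on synthetic data (not the article's data set), here is a small self-contained Python experiment: a slow, noise-free sinusoid whose true per-step derivative is known in closed form, filtered with parameter values of the same order used later in this article. The function name `abg_derivative` is invented for this sketch.

```python
import math

def abg_derivative(samples, palpha, pbeta, pgamma):
    """Run the A-B-G filter and return its velocity-state history,
    which serves as the sequence of derivative estimates."""
    position, velocity, accel = samples[0], 0.0, 0.0
    derivs = []
    for value in samples:
        velocity += accel              # project the model forward
        position += velocity
        delta = value - position       # prediction error
        accel += pgamma * delta        # incremental state corrections
        velocity += pbeta * delta
        position += palpha * delta
        derivs.append(velocity)
    return derivs

# Slow sinusoid sampled once per step; its true per-step derivative is
# approximately 0.01 * cos(0.01 * k).
samples = [math.sin(0.01 * k) for k in range(1000)]
derivs = abg_derivative(samples, 0.20, 0.16, 0.014)
```

After the start-up transient, the velocity history tracks the cosine-shaped true derivative closely; no separate differencing step is ever performed.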

## Let's try it

This idea is now applied to the same sequence of noisy data used for experiments with central derivative estimators. The following parameters were selected for the alpha-beta-gamma filter. You might prefer a different tradeoff between accuracy and noise rejection.

    palpha = 0.20000
    pbeta  = 0.16000
    pgamma = 0.014000

Using this set of parameters, here is how the filter tracked the noisy input data sequence. The blue curve shows the filtered values, while green shows the noise-free, mathematically ideal values.

And now, by extracting the sequence of internal "velocity" state values from the smoothing filter, here is the sequence of derivative estimates (shown in blue), compared to "mathematically exact" derivative values uncorrupted by noise (shown in red).

You can see an initial transient response from the filter. Several initial updates are required for this to settle out. After that, there is very little phase error in the tracking. Tracking is not particularly good at points of rapid change in the derivative value, indicating that the response bandwidth is limited. If these problems are not onerous, and the derivative estimates seem acceptable to you, you will obtain results with minimal computation and minimal delay.

Footnotes:

[1] For background information about Kalman Filters, you can study the Wikipedia article Kalman Filters, or explore Developing Models for Kalman Filters on this site.

[2] For more information about alpha-beta and alpha-beta-gamma filters, check Wikipedia article Alpha Beta Filter, or explore Minimalist Observer: the Alpha Beta Filter on this site.