One Sided Derivative Estimation

Filter-based estimators

A one-sided convolution estimator buffers a history of past function values as it progresses through time. This can only apply to an analysis performed sequentially, not "on demand" at an arbitrary location in the data set. However, buffering past input values is not the only way of collecting past history. Perhaps it is more effective to extract and maintain information in some kind of distilled form.

There are some well-known "digital filter" formulas that do this kind of thing for the purpose of rejecting high frequencies from an input data sequence. Perhaps that kind of "stateful" processing can be applied both to derivative estimation and high frequency filtering at the same time, producing a better estimator.

This seems like such an obviously good idea that it is worth examining how it fails.


How can filtering help?

There are two ways that additional filtering could help the derivative estimator do its work.

  1. Help to detect and reject constant or nearly-constant patterns in the data.
  2. Help to reject high-frequency chatter.


Differencing filter

It isn't possible to construct central differences, at least not without introducing extra delays. However, it is easy to construct a sequence of single differences. This loses no information (except for the initial constant level, to which the derivative estimator should not respond anyway). The main advantage is that difference values go asymptotically to zero as the function's derivative goes to zero. [1]
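
As a small illustration of the claim that single differences lose nothing except the initial level, the following Python sketch forms the differences of a short made-up sequence and then rebuilds the sequence from its first value plus a running sum. The helper name single_differences and the sample values are hypothetical, chosen only for this example.

```python
from itertools import accumulate

def single_differences(x):
    """Return the backward differences x[n] - x[n-1] for n = 1 .. len(x)-1."""
    return [cur - prev for prev, cur in zip(x, x[1:])]

# Made-up sample values, chosen to be exactly representable in binary.
x = [3.0, 3.5, 4.5, 4.5, 4.0]
df = single_differences(x)

# The initial level plus a running sum of the differences recovers the input.
rebuilt = list(accumulate(df, initial=x[0]))

print(df)       # [0.5, 1.0, 0.0, -0.5]
print(rebuilt)  # [3.0, 3.5, 4.5, 4.5, 4.0]
```

Note that the flat stretch in the input (4.5 followed by 4.5) shows up as an exact zero in the difference sequence.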

Here is a diagram of a filter[2] that can calculate and lowpass-smooth the differences sequence.

One-sided estimator state filter

The D notation indicates that the value propagates through storage in one time interval.

  • Differencing.
    Subtracting a delayed input value from the current new value produces a difference value, df, which is then available for subsequent processing.

  • Lowpass filtering.
    The difference values propagate through a number of delays. These delayed values are then available to be combined by filter processing for purposes of removing extraneous high frequencies. The b coefficients, applied to the history of differences, contribute to the filtered outputs. Those outputs are processed through a second chain of delays, so that prior filtered outputs help to predict future output values, using the a coefficients. The lowpass filtering constitutes the majority of the processing shown in the diagram.

  • Derivative estimation.
    Given the history of filtered, noise-reduced differences, a best combination of those values using the c coefficients estimates the derivative value at the point of the latest new input.
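
The three stages above can be sketched in Python roughly as follows. This is an interpretation of the description, not the author's code: the class name DifferenceFilterEstimator is hypothetical, the a coefficients are assumed to be added as feedback on prior filtered outputs, and the c coefficients are assumed to apply to the newest filtered values, most recent first. All histories are assumed to start at zero.

```python
from collections import deque

class DifferenceFilterEstimator:
    """Sketch of the three stages: differencing, recursive lowpass, c taps."""

    def __init__(self, b, a, c):
        self.b, self.a, self.c = list(b), list(a), list(c)
        self.prev_x = None
        # Histories are stored newest first and start out at zero (assumed).
        self.df_hist = deque([0.0] * len(self.b), maxlen=len(self.b))
        n_y = max(len(self.a), len(self.c), 1)
        self.y_hist = deque([0.0] * n_y, maxlen=n_y)

    def step(self, x):
        """Accept one new input sample and return one derivative estimate."""
        # Differencing: the first sample has no predecessor, so use zero.
        df = 0.0 if self.prev_x is None else x - self.prev_x
        self.prev_x = x
        self.df_hist.appendleft(df)

        # Lowpass: b taps on the difference history, a taps on prior outputs
        # (feedback assumed to be added, not subtracted).
        y = sum(bk * d for bk, d in zip(self.b, self.df_hist))
        y += sum(ak * yp for ak, yp in zip(self.a, self.y_hist))
        self.y_hist.appendleft(y)

        # Estimation: c taps on the newest filtered values, most recent first.
        return sum(ck * yk for ck, yk in zip(self.c, self.y_hist))

# Degenerate check: with b = [1], no feedback, and c = [1],
# the output is just the raw difference sequence.
est = DifferenceFilterEstimator(b=[1.0], a=[], c=[1.0])
print([est.step(x) for x in [1.0, 3.0, 6.0]])   # [0.0, 2.0, 3.0]
```

The state is kept in two short delay chains rather than in a buffer of raw inputs, which is the "distilled form" of history that motivates this approach.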


Obtaining estimator parameters

I will skip the details of trying to design this filter. The lowpass filter is based on a canonical Butterworth filter, and the estimator coefficients are obtained by a weighted least-squares best fit process. For reference, here are the coefficients for the filter.

  b coefficients ---   1.282581e-03   6.412905e-03   1.282581e-02 ...
                       1.282581e-02   6.412905e-03   1.282581e-03 

  a coefficients ---   2.9754221     -3.8060181      2.5452529   ...
                      -0.8811301      0.1254306

  c coefficients ---   5.085767      -5.896902       1.582392
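
The listing does not say whether the a coefficients are added or subtracted in the feedback path. One way to check, sketched below in Python, relies on the fact that a Butterworth lowpass has unity gain at zero frequency: the steady-state gain can be computed under each sign convention and compared to 1. The variable names are assumptions for illustration.

```python
# Coefficients copied from the listing above.
b = [1.282581e-03, 6.412905e-03, 1.282581e-02,
     1.282581e-02, 6.412905e-03, 1.282581e-03]
a = [2.9754221, -3.8060181, 2.5452529, -0.8811301, 0.1254306]

# Steady-state (zero-frequency) gain if y[n] = sum(b * df) + sum(a * y_prior).
dc_gain_added = sum(b) / (1.0 - sum(a))

# The same calculation if the feedback terms were subtracted instead.
dc_gain_subtracted = sum(b) / (1.0 + sum(a))

print(f"feedback added:      {dc_gain_added:.6f}")
print(f"feedback subtracted: {dc_gain_subtracted:.6f}")
```

Only the "added" convention gives a gain close to 1, which suggests the a values are listed in the form that is summed directly with the feed-forward terms.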

The resulting filter was applied to a swept sine waveform sequence. The input sequence looks like the following.

Input data sequence

This is not a frequency spectrum; it is a time sequence going from left to right. However, the frequency is arranged to go increasingly higher as time progresses.
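
The exact sweep parameters are not given here. As a rough sketch, the Python below assumes a linear sweep starting at zero frequency and scaled so that sample 200 sits at 20% of the Nyquist limit, which matches the axis scaling described for the comparison plot. The names N_NYQUIST, sweep_value, and sweep_derivative, and the sequence length, are hypothetical. Derivatives are expressed per sample interval.

```python
import math

N_NYQUIST = 1000   # assumed: sample index at which the sweep would reach Nyquist

def sweep_phase(t):
    """Phase of a linear sweep whose frequency is pi * t / N_NYQUIST rad/sample."""
    return 0.5 * math.pi * t * t / N_NYQUIST

def sweep_value(t):
    """Swept sine value at (possibly fractional) sample position t."""
    return math.sin(sweep_phase(t))

def sweep_derivative(t):
    """Analytic derivative of sweep_value with respect to t (chain rule)."""
    return (math.pi * t / N_NYQUIST) * math.cos(sweep_phase(t))

n_samples = 500    # assumed length of the test sequence
x = [sweep_value(n) for n in range(n_samples)]
dx_true = [sweep_derivative(n) for n in range(n_samples)]

# Instantaneous frequency at sample 200 as a fraction of Nyquist (pi rad/sample).
print((math.pi * 200 / N_NYQUIST) / math.pi)
```

Feeding x one sample at a time into an estimator and comparing its output against dx_true gives the kind of comparison discussed next.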

Here is a plot comparing the estimated derivative values (in blue) to the analytically-derived "true" derivative values (in green). The position "200" along the horizontal axis corresponds to 20% of the Nyquist limit.

Ideal and actual estimator sequence


This isn't very good, is it?

The one good feature is that high frequencies are nicely attenuated: the higher the frequency, the better the attenuation.

Every other feature of this response is disappointing. The accuracy at low frequencies is poor. The response in the middle-band frequencies is pretty well bounded but not much of an improvement over other one-sided estimator formulas. There is a troublesome amount of phase shift even at relatively low frequencies.

It is the phase shift problem that really causes the damage. It impairs the accuracy at lower frequencies, and it causes reinforcement rather than cancellation of undesirable responses in the middle frequencies.
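
To put rough numbers on the phase shift, the Python sketch below evaluates the frequency response of the whole chain and compares it to an ideal differentiator, whose response at w radians per sample is j*w. The chain structure is an assumption based on the description of the filter: a single difference, then the lowpass with the a coefficients added as feedback, then the c coefficients applied to the newest filtered values, most recent first. The function names are hypothetical.

```python
import cmath
import math

# Coefficients copied from the listing in "Obtaining estimator parameters".
b = [1.282581e-03, 6.412905e-03, 1.282581e-02,
     1.282581e-02, 6.412905e-03, 1.282581e-03]
a = [2.9754221, -3.8060181, 2.5452529, -0.8811301, 0.1254306]
c = [5.085767, -5.896902, 1.582392]

def taps(coeffs, w, first_delay=0):
    """Evaluate sum(coeffs[k] * exp(-j * w * (k + first_delay)))."""
    return sum(ck * cmath.exp(-1j * w * (k + first_delay))
               for k, ck in enumerate(coeffs))

def estimator_response(w):
    """Response of the assumed chain at w radians per sample."""
    differencer = 1.0 - cmath.exp(-1j * w)
    lowpass = taps(b, w) / (1.0 - taps(a, w, first_delay=1))
    return differencer * lowpass * taps(c, w)

for fraction in (0.01, 0.05, 0.10, 0.20):
    w = fraction * math.pi
    ratio = estimator_response(w) / (1j * w)   # ideal differentiator is j*w
    print(f"{fraction:4.2f} of Nyquist: gain ratio {abs(ratio):.3f}, "
          f"phase error {math.degrees(cmath.phase(ratio)):+.1f} deg")
```

A gain ratio of 1 and a phase error of 0 would indicate a perfect estimate at that frequency. A negative phase error means the estimate lags the true derivative.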

In short, despite the more complicated processing, this approach provides no apparent advantages. As plausible as it seems, it should be avoided.

However, there is another kind of filtering that directly addresses delays, and the next section will cover that topic.



[1] If you compare to the lowest-order Central Differences Estimator formula (see the page CentralDifferences.html on this site), you can observe that a difference value is in itself an approximator (though a relatively inaccurate and noisy one) of the function derivative, but delayed by one half time interval.

[2] This is one of the common "canonical" filter architectures. You can read more in the Wikipedia article Digital Filter, in the Direct Form 1 sub-section.


Contents of the "Numerical Estimation of Derivatives" section of the website, including this page, are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise specifically noted. For complete information about the terms of this license, see The license allows usage and derivative works for individual or commercial purposes under certain restrictions. For permissions beyond the scope of this license, see the contact information page for this site.