# Experimental Least Squares Derivative Estimator

Numerical Methods for Derivative Estimation

The idea suggested at the end of the previous section was to try fitting a polynomial model to the data sequence using a different strategy. To quote notes that previously appeared on this site:

> The classical polynomial-based estimators become sensitive to noise
> because their approximation curves exactly match every measurement
> point, noise and all. A "standard" noise suppression technique is to
> *fit a smooth curve* to the data set, without attempting to match the
> values exactly. What if this alternative idea is used to fit a
> polynomial to the data for purposes of a derivative approximation?

It was observed that low-order polynomial approximations have limited curvature, and hence a natural tendency not to track the irrelevant high frequencies. The extra data terms used in the least-squares approximation provide additional degrees of freedom available to resolve and reject noise.

**Spoiler alert.** This idea doesn't fully live up to its promise.
However, it is worth taking a look to see what it does offer.

## Outline of least squares polynomial method

Instead of making the polynomials conform to the data, the data are made to conform to the polynomials. That is, a polynomial can be considered to consist of a combination of simple one-term polynomials:

1, x, x^2, x^3, x^4, x^5, etc.

These polynomials can be evaluated over a number of steps of length `h`. For the case of normalized steps, the value of `h` is set to 1. Given the desired length of the estimator filter (typically 9 to 21), each basis polynomial can be evaluated at each of the step locations. The polynomial *basis vectors* obtained in this way can then be collected as the columns of a matrix. So, for example, the basis polynomials of order 0 through 5 can be evaluated at 11 points centered on location 0, establishing the following matrix.

```
MMat =
   1  -5  25  -125  625  -3125
   1  -4  16   -64  256  -1024
   1  -3   9   -27   81   -243
   1  -2   4    -8   16    -32
   1  -1   1    -1    1     -1
   1   0   0     0    0      0
   1   1   1     1    1      1
   1   2   4     8   16     32
   1   3   9    27   81    243
   1   4  16    64  256   1024
   1   5  25   125  625   3125
```
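As a quick sketch, this matrix can be constructed as a Vandermonde matrix with increasing powers; the variable name `MMat` below simply follows the article's notation.

```python
import numpy as np

# 11 normalized step locations centered on 0, basis 1, x, ..., x^5.
x = np.arange(-5, 6)
MMat = np.vander(x, N=6, increasing=True)  # columns are x^0 .. x^5

print(MMat)
```

Each row corresponds to one step location; each column is one basis polynomial evaluated at all 11 locations.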

Given a vector of function values `y` to be matched, in the ideal situation where the underlying function really was a polynomial of sufficiently low order, it would be possible to select the polynomial coefficients `f` to produce a perfect match to the sequence of data.

MMat · f = y

However, in general, the data are not guaranteed to be produced by a low-order polynomial curve, and a perfect solution does not exist. The best that can be done is to identify a "best fit" solution `f` that matches the constraint conditions approximately.

MMat · f ≈ y

Solving the normal equations yields the *least squares best fit* solution

f = (MMat' MMat)^{-1} MMat' y
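A minimal sketch of this fitting step, using illustrative noisy data (not from the article): the normal-equations form written in the text should agree with NumPy's library least squares routine.

```python
import numpy as np

# 11-point, order-5 setup as in the matrix above.
x = np.arange(-5, 6)
MMat = np.vander(x, N=6, increasing=True)

# Illustrative noisy samples of a smooth function (assumed data).
rng = np.random.default_rng(0)
y = np.sin(0.4 * x) + 0.01 * rng.standard_normal(x.size)

# Normal-equations form, f = (M'M)^{-1} M'y, as in the text...
f_normal = np.linalg.solve(MMat.T @ MMat, MMat.T @ y)

# ...and the numerically preferred library routine; they should agree.
f_lstsq, *_ = np.linalg.lstsq(MMat, y, rcond=None)

print(f_normal)
```

In production code `np.linalg.lstsq` (or a QR factorization) is preferred over forming `M'M` explicitly, since the normal equations square the condition number of the problem.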

So, now knowing the coefficients for the polynomial with the best fit, the derivative polynomial for this best fit can also be determined. When this derivative polynomial is evaluated at `x=0`, all of the terms containing a factor of `x` go to zero and drop out, leaving only the single coefficient `f_1`. That reduces the solution to the desired estimator formula: the row of the matrix `(MMat' MMat)^{-1} MMat'` that produces `f_1`.

This is all intentionally sketchy, but given where the idea leads, there is no reason to spend more time on the details.

## Why is this a dead end?

Here is an example of the filter produced by a fifth-order polynomial estimator using 13 function value terms.

```
Derivative filter by least squares polynomial fit - 13 terms, order 5

  -3.306938e-02
   9.287330e-02
  -4.113534e-05
  -1.148739e-01
  -1.567976e-01
  -1.075689e-01
   0.000000e+00
   1.075689e-01
   1.567976e-01
   1.148739e-01
   4.113534e-05
  -9.287330e-02
   3.306938e-02
```
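A filter of this kind can be recomputed directly from the construction described above: the estimator is the `x^1` coefficient row of the solution matrix, since the fitted derivative at `x=0` is `f_1`. The sketch below assumes that construction and checks two properties any such filter must have: antisymmetry, and exact differentiation at 0 of any polynomial within the fitted order.

```python
import numpy as np

# 13 step locations, basis polynomials 1, x, ..., x^5.
x = np.arange(-6, 7)
MMat = np.vander(x, N=6, increasing=True)

# Solution matrix (M'M)^{-1} M'; its second row extracts f_1,
# the derivative of the best-fit polynomial at x = 0.
soln = np.linalg.solve(MMat.T @ MMat, MMat.T)
filt = soln[1]

# Antisymmetric, as a derivative filter should be.
assert np.allclose(filt, -filt[::-1])

# Exact on polynomials of degree <= 5: p(x) = x^3 + 2x has p'(0) = 2.
y = x**3 + 2.0 * x
assert np.isclose(filt @ y, 2.0)

print(filt)
```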

Initially, this seems promising. Advantages:

- High frequency rejection superior to Central Differences method
- Low band accuracy generally superior to maxflat designs
- Asymptotic accuracy near zero frequency almost as good as maxflat designs
- Response well attenuated in the middle frequencies

However, the noise rejection is not as good as it should be in the high frequency range. There is really nothing that can be done about this. The least-squares data fitting process has only two adjustable parameters, the length of the desired filter and the desired order of the polynomial approximation. Neither of these provides any direct control over the high-frequency noise exposure. The next section will consider a different approach to the least squares fitting process that allows a certain degree of control.
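The frequency-domain behavior described above can be examined numerically. This sketch (my own illustration, not from the article) evaluates the filter's frequency response `H(w) = sum_k c_k e^{-jwk}` across the band; an ideal differentiator has response magnitude `w`, so the deviation at high frequencies shows the noise exposure being discussed.

```python
import numpy as np

# Rebuild the 13-term, order-5 least squares derivative filter.
x = np.arange(-6, 7)
MMat = np.vander(x, N=6, increasing=True)
filt = np.linalg.solve(MMat.T @ MMat, MMat.T)[1]

# Frequency response H(w) = sum_k c_k exp(-j*w*k), k = -6..6.
w = np.linspace(0.0, np.pi, 512)
H = np.exp(-1j * np.outer(w, x)) @ filt

# An antisymmetric filter has a purely imaginary response.
assert np.allclose(H.real, 0.0, atol=1e-12)

# Near zero frequency the magnitude tracks the ideal response |H| = w.
assert np.allclose(np.abs(H[:10]), w[:10], atol=1e-3)

# Residual response in the upper half band (the noise exposure region).
print(np.abs(H[w > np.pi / 2]).max())
```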
