ARMA Models

The model fit obtained in the previous installment tries to determine a linear relationship between past input and output data to predict future outputs. It showed some distinct promise. But it is not finished. There are two important issues to resolve.

The dynamic state transition model used by the Kalman filter uses only current input values and state from the previous step. The Least Squares model fit used data from several past steps. We must resolve this structural inconsistency.
When the model is applied, it does not have the advantage of knowing what the "real" system output is, the way that the Least Squares fitting process knows. The most the model can know about is the past values of its predictions of output variables. Substituting predicted values for real values is always tricky. But is it merely tricky, or disastrous?

This installment addresses the first of these challenges.

ARMA models

In the field of time series analysis, the sort of model where the next output projection is based on a set of past inputs and outputs is known as an ARMA model. ARMA stands for Autoregressive Moving Average.

The autoregressive part means "the next output is determined from a weighted sum of some number of prior outputs."
The moving average part means "the next output is determined from a weighted sum of some number of prior inputs."

Put them together and you get...

y_{1} = \overset{0}{\sum_{i = - M}} a_{i} x_{i} + \overset{0}{\sum_{i = - N}} b_{i} y_{i}

This has little resemblance to a Kalman Filter state transition model. So the immediate task is to reorganize the model into a state transition form.

Shift Register construct

A special and artificial kind of state transition matrix called a "shift register" looks something like the following.

A = (\begin{matrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{matrix})

Notice how the 1.0 values appear on the first sub-diagonal line and nowhere else. For the model fit determined in the previous installment, there are five values of output history required to calculate the next state predictions, so in this particular case, the matrix has five rows and five columns.

Let's treat this like a regular dynamic state transition matrix and see what happens. We will define a shift register vector v that contains a list of past values of the output variable y. We transform vector v by multiplying with the special shift register matrix A.

  v¹ = A v⁰

If you look at the details of the matrix multiply operation, you can see that it has the effect of shifting the terms in the previous v vector by one position downward, leaving the first location vacant. The previous final value in the list drops away. We just need a way to inject the latest new data into the list at the first position.

Define a coupling matrix B having the following structure.

B = (\begin{matrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{matrix})

Let the new input value u be the latest new output term to be introduced. Add the additional product term B · y⁰ into the previous shift register relationship. This injects the new value into the vacant first location without affecting any other terms.

  v¹ = A v⁰ + B y⁰

The result is an artificial "state transition equation" representation for the shift register. It is not very compact, but it has the right form for dynamic state transition equations.

That takes care of past outputs.

A similar but separate shift register construct is needed to manage the history of input values. There is one small difference. Since this particular model uses one current input terms plus 4 more terms from past history, the shift register only needs to maintain the 4 past input terms.

The next few steps finish the model:

combine the two shift registers to form a single matrix,
introduce the newest terms in the appropriate locations,
incorporate the model calculations for the next output.

Unifying the shift registers

Let A_y and A_u be shift register matrices of size 5x5 and 4x4 for the output history and input history data, respectively. Let B_y and B_u be input coupling matrices with one nonzero term each, as described above, for introducing new output observation and input terms into the front of the respective shift registers. Then stack up copies of these two matrices, and fill the rest of the matrix terms with zeroes, to build up a 9x9 matrix in the following manner.

A = (\begin{matrix} Ay & 0 \\ 0 & Au \end{matrix})

B = (\begin{matrix} By & 0 \\ 0 & Bu \end{matrix})

u = (\begin{matrix} y_{0} \\ u_{0} \end{matrix})

Now, matrix operations can be used at each time step to update the history information. The most current history information is stored in the state vector v.

   v  =  A v  +  B u

The next adjustment introduces the model fit parameter values obtained in the previous installment. The new value of the output variable does not come in from some kind of external input, as represented currently. Instead, this prediction must be calculated using the existing elements of the history vector v. Last time, this was the best fit formula we derived:

 
  y^N+1 = [y^N+0   y^N-1   y^N-2   y^N-3   y^N-4   u^N+0   u^N-1   u^N-2   u^N-3   u^N-4   ] · 
        [ b₀    b_-1    b_-2    b_-3    b_-4    a₀    a_-1    a_-2    a_-3    a_-4 ]^T

   0.9000531       Multipliers 'b' for output history terms
   0.0698574
   0.0249757
  -0.0734740
   0.0148904
  -0.3757921       Multipliers 'a' for input history terms
  -0.1939514
  -0.0527318
   0.0051917
   0.0230597

To apply the model parameters, make the following adjustments to the shift register equations.

In the first row of the composite shift register matrix, replace the first five terms (all zero) with the 'b' parameter values from the model, associated with prior output history terms.
In the first term of the second column of the B coupling matrix, store the 'a' parameter term corresponding to the most recent input value.
The output variable is no longer used as a direct input. Remove the first term from the u input vector, and remove the first column of the composite B matrix, leaving only one column in that matrix.
In the first row of the composite shift register matrix, replace the four final zero terms with the remaining 4 model parameter values associated with past input history.
Set all terms in the matrix column 6 except for the first term to zero values, since the first input terms is no longer present.

Listing 1 shows Octave code to set up the dynamic model.

Listing 1

% Construct the state transition matrix model
AMat = zeros(9,9);
BMat = zeros(9,1);
CMat = zeros(1,9);

% Output history shift
AMat(2,1) = 1.0;
AMat(3,2) = 1.0;
AMat(4,3) = 1.0;
AMat(5,4) = 1.0;

% Input history shift
AMat(7,6) = 1.0;
AMat(8,7) = 1.0;
AMat(9,8) = 1.0;

% Model terms for next output prediction
AMat(1,1) = 0.9000531;
AMat(1,2) = 0.0698574;
AMat(1,3) = 0.0249757;
AMat(1,4) = -0.0734740;
AMat(1,5) = 0.014890;
AMat(1,6) = -0.1939514;
AMat(1,7) = -0.0527318;
AMat(1,8) =  0.0051917;
AMat(1,9) = 0.0230597

% Current input feeds next output prediction and input shift
BMat = [ -0.3757921, 0, 0, 0, 0, 1.0, 0, 0, 0 ]'

% Next output prediction is the first state variable
CMat(1)=1.0

In listing 2, verify that Octave has received the correct matrix configuration.

Listing 2

AMat =
 Columns 1 through 5:
   0.90005   0.06986   0.02498  -0.07347   0.01489  
   1.00000   0.00000   0.00000   0.00000   0.00000   
   0.00000   1.00000   0.00000   0.00000   0.00000  
   0.00000   0.00000   1.00000   0.00000   0.00000   
   0.00000   0.00000   0.00000   1.00000   0.00000   
   0.00000   0.00000   0.00000   0.00000   0.00000  
   0.00000   0.00000   0.00000   0.00000   0.00000   
   0.00000   0.00000   0.00000   0.00000   0.00000  
   0.00000   0.00000   0.00000   0.00000   0.00000  

 Columns 6 through  9:
  -0.19395  -0.05273   0.00519   0.02306
   0.00000   0.00000   0.00000   0.00000
   0.00000   0.00000   0.00000   0.00000
   0.00000   0.00000   0.00000   0.00000
   0.00000   0.00000   0.00000   0.00000
   0.00000   0.00000   0.00000   0.00000
   1.00000   0.00000   0.00000   0.00000
   0.00000   1.00000   0.00000   0.00000
   0.00000   0.00000   1.00000   0.00000

BMat =
  -0.37579
   0.00000
   0.00000
   0.00000
   0.00000
   1.00000
   0.00000
   0.00000
   0.00000

CMat =
   1   0   0   0   0   0   0   0   0

Now it is left as an exercise for you to verify that the following things will occur when a new input value is provided and the matrix multiply operation generates the next state.

The previous output history values are shifted down one position, dropping the last value out.
The new value calculated for the first shift register state variable is the predicted output value for the next time step.
The previous input history values are shifted down one position, all except for the last, which is dropped.
The newest input value, in addition to being used for the next output state calculations, is copied into the empty location at the front of the list of input history terms.
The new value calculated for the first shift register state variable is the predicted output value for the next time step.

The shift register formulation is a bit artificial, but at least it has the right kind of state transition form to be used for Kalman Filter purposes. The only thing missing is testing whether it does anything useful. That's where we will continue next time.

Converting ARMA Models

ARMA models

Shift Register construct

Unifying the shift registers