Converting ARMA Models
Developing Models for Kalman Filters
The model fit obtained in the previous installment tries to determine a linear relationship between past input and output data to predict future outputs. It showed some distinct promise. But it is not finished. There are two important issues to resolve.
- The dynamic state transition model used by the Kalman filter uses only current input values and state from the previous step. The Least Squares model fit used data from several past steps. We must resolve this structural inconsistency. 
- When the model is applied, it does not have the advantage of knowing what the "real" system output is, the way that the Least Squares fitting process knows. The most the model can know about is the past values of its predictions of output variables. Substituting predicted values for real values is always tricky. But is it merely tricky, or disastrous? 
This installment addresses the first of these challenges.
ARMA models
In the field of time series analysis, the sort of model where the next output projection is based on a set of past inputs and outputs is known as an ARMA model. ARMA stands for Autoregressive Moving Average.
- The autoregressive part means "the next output is determined from a weighted sum of some number of prior outputs."
- The moving average part means "the next output is determined from a weighted sum of some number of prior inputs."
Put them together and you get...
This has little resemblance to a Kalman Filter state transition model. So the immediate task is to reorganize the model into a state transition form.
Shift Register construct
A special and artificial kind of state transition matrix called a "shift register" looks something like the following.
 Notice how the 1.0 values appear on the first 
sub-diagonal line and nowhere else. For the model fit determined
in the previous installment, there are five values of output history
required to calculate the next state predictions, so in this particular
case, the matrix has five rows and five columns. 
 Let's treat this like a regular dynamic state
transition matrix and see what happens. We will define a shift register
vector v that contains a list of past values of the 
output variable y. We transform vector v
by multiplying with the special shift register matrix  A. 
  v1 = A v0  If you look at the details of the matrix multiply operation,
you can see that it has the effect of shifting the terms in the previous 
v vector by one position downward, leaving the first 
location vacant. The previous final value in the list drops away.
We just need a way to inject the latest new data into the 
list at the first position.
 Define a coupling matrix B having the following structure.  
 Let the new input value u be the latest 
new output term to be introduced. Add the additional product
term  B · y0  into the previous shift 
register relationship. This injects the new value into the vacant 
first location without affecting any other terms. 
  v1 = A v0 + B y0 The result is an artificial "state transition equation" representation for the shift register. It is not very compact, but it has the right form for dynamic state transition equations.
That takes care of past outputs.
A similar but separate shift register construct is needed to manage the history of input values. There is one small difference. Since this particular model uses one current input terms plus 4 more terms from past history, the shift register only needs to maintain the 4 past input terms.
The next few steps finish the model:
- combine the two shift registers to form a single matrix,
- introduce the newest terms in the appropriate locations,
- incorporate the model calculations for the next output.
Unifying the shift registers
 Let Ay and  Au be shift 
register matrices of size 5x5 and 4x4 for the output history and input
history data, respectively. Let By 
and Bu be input coupling matrices with one nonzero 
term each, as described above, for introducing new output observation and
input terms into the front of the respective shift registers. Then stack up 
copies of these two matrices, and fill the rest of the matrix terms with 
zeroes, to build up a 9x9 matrix in the following manner. 
 Now, matrix operations can be used at each time step to update the history
information. The most current history information is stored in the state vector
v. 
v = A v + B u
 The next adjustment introduces the model fit parameter values obtained in
the previous installment. The new value of the output variable does not come
in from some kind of external input, as represented currently. Instead, this
prediction must be calculated using the existing elements of the history
vector v. Last time, this was the best fit formula we derived: 
 
  yN+1 = [yN+0   yN-1   yN-2   yN-3   yN-4   uN+0   uN-1   uN-2   uN-3   uN-4   ] · 
        [ b0    b-1    b-2    b-3    b-4    a0    a-1    a-2    a-3    a-4 ]T
0.9000531 Multipliers 'b' for output history terms 0.0698574 0.0249757 -0.0734740 0.0148904 -0.3757921 Multipliers 'a' for input history terms -0.1939514 -0.0527318 0.0051917 0.0230597
To apply the model parameters, make the following adjustments to the shift register equations.
-  In the first row of the composite shift register matrix,
replace the first five terms (all zero) with the 'b' parameter values from the model, associated with prior output history terms.
-  In the first term of the second column of the Bcoupling matrix, store the 'a' parameter term corresponding to the most recent input value.
-  The output variable is no longer used as a direct input.
Remove the first term from the uinput vector, and remove the first column of the composite B matrix, leaving only one column in that matrix.
- In the first row of the composite shift register matrix, replace the four final zero terms with the remaining 4 model parameter values associated with past input history.
- Set all terms in the matrix column 6 except for the first term to zero values, since the first input terms is no longer present.
Listing 1 shows Octave code to set up the dynamic model.
Listing 1
% Construct the state transition matrix model AMat = zeros(9,9); BMat = zeros(9,1); CMat = zeros(1,9); % Output history shift AMat(2,1) = 1.0; AMat(3,2) = 1.0; AMat(4,3) = 1.0; AMat(5,4) = 1.0; % Input history shift AMat(7,6) = 1.0; AMat(8,7) = 1.0; AMat(9,8) = 1.0; % Model terms for next output prediction AMat(1,1) = 0.9000531; AMat(1,2) = 0.0698574; AMat(1,3) = 0.0249757; AMat(1,4) = -0.0734740; AMat(1,5) = 0.014890; AMat(1,6) = -0.1939514; AMat(1,7) = -0.0527318; AMat(1,8) = 0.0051917; AMat(1,9) = 0.0230597 % Current input feeds next output prediction and input shift BMat = [ -0.3757921, 0, 0, 0, 0, 1.0, 0, 0, 0 ]' % Next output prediction is the first state variable CMat(1)=1.0
In listing 2, verify that Octave has received the correct matrix configuration.
Listing 2
AMat = Columns 1 through 5: 0.90005 0.06986 0.02498 -0.07347 0.01489 1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 Columns 6 through 9: -0.19395 -0.05273 0.00519 0.02306 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 BMat = -0.37579 0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 CMat = 1 0 0 0 0 0 0 0 0
Now it is left as an exercise for you to verify that the following things will occur when a new input value is provided and the matrix multiply operation generates the next state.
- The previous output history values are shifted down one position, dropping the last value out.
- The new value calculated for the first shift register state variable is the predicted output value for the next time step.
- The previous input history values are shifted down one position, all except for the last, which is dropped.
- The newest input value, in addition to being used for the next output state calculations, is copied into the empty location at the front of the list of input history terms.
- The new value calculated for the first shift register state variable is the predicted output value for the next time step.
The shift register formulation is a bit artificial, but at least it has the right kind of state transition form to be used for Kalman Filter purposes. The only thing missing is testing whether it does anything useful. That's where we will continue next time.