Introduction

The quality of flood forecasts depends, in general, on the quality of the simulation model, the accuracy of the precipitation and boundary forecasts, and the efficiency of the data assimilation procedure (Madsen et al., 2000).
This document describes the AR error module that can be used for output correction.

Role in FEWS

The error modelling module is a generic forecasting module. It is used to improve the reliability of forecasts by identifying the structure of the error that a forecasting module makes during the modelling phase, when both simulated and observed values are available, and then applying this structure to the forecast values. This assumes that the structure of the error remains unchanged between the two periods.

Because the forecasting system first runs models over a historic period and then over a forecast period, the error modelling module runs in two phases: (i) the historic period, in which the structure of the error model is determined, and (ii) the forecast period, in which the error model is applied to correct the forecast time series.

The module applies an AR model of the error. The order of the statistical model may either be selected by the user (through configuration) or derived automatically. In the automatic mode the user must indicate the maximum order to be considered.

To stabilise the identification of the error model, a transformation may be applied to the model residuals before the model is identified. The options are:

no transformation
transforming the series by subtracting the mean
Box-Cox transformation. In this case the user must also specify the lambda parameter to be used; a lambda of zero indicates a natural logarithm transformation
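
The three options can be sketched in a few lines of Python. This is only an illustration of the formulas, not the module's code: the names `forward_transform` and `inverse_boxcox` and the returned offset are assumptions made for this example. How the module treats non-positive values under Box-Cox is not stated here, so the sketch simply rejects them.

```python
import math


def forward_transform(values, mode, lam=None):
    """Apply one of the three options; return (transformed values, subtracted mean)."""
    values = [float(v) for v in values]
    if mode == "none":
        return values, 0.0
    if mode == "mean":
        mean = sum(values) / len(values)
        return [v - mean for v in values], mean
    if mode == "boxcox":
        if lam is None:
            raise ValueError("lambda is required for the Box-Cox transformation")
        if min(values) <= 0.0:
            # Assumption of this sketch: non-positive input is rejected.
            raise ValueError("Box-Cox needs strictly positive values")
        if lam == 0:
            # A lambda of zero means the natural logarithm.
            return [math.log(v) for v in values], 0.0
        return [(v ** lam - 1.0) / lam for v in values], 0.0
    raise ValueError(f"unknown transformation: {mode}")


def inverse_boxcox(values, lam):
    """Map Box-Cox transformed values back to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]
```

Any value predicted in the transformed domain has to be mapped back with the inverse before it is compared with the original series.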

Functionality described

This utility improves model time series predictions by combining modelled and observed series. Its input is an output series from a forecasting module (typically discharge from a routing or rainfall-runoff module) and the observed series at the same location. It returns an updated series for the module output. Updating is done by applying an error model to the residuals between the module output and the observed series. The same error model is also applied to the forecast data from the module, so that errors in the forecast can be corrected.

Data Requirements

Input time series data

To apply the error modelling module, time series data are required for both the simulated and the observed values over the historic period at a given location, as well as the forecast time series at this location. Under normal configuration, these time series will be of the same parameter.

Time series	Parameter (example)	View period
Simulated values	Q.simulated.historic	Historic period (e.g. -2000 hours to start of forecast)
Observed values	Q.obs	Historic period (e.g. -2000 hours to start of forecast)
Forecast values	Q.simulated.forecast	Forecast period (e.g. start of forecast to +48-240 hours)*

* Note: The length of the forecast period may be zero. In that case the error modelling module considers only the historic period.

Output time series data

The error modelling module returns two time series: an updated time series for the historic period and an updated time series for the forecast period. In principle, the updated time series over the historic period is almost identical to the observed time series.

Time series	Parameter (example)	View period
Updated values (historic)	Q.updated.historic	Historic period (e.g. -2000 hours to start of forecast)
Updated values (forecast)	Q.updated.forecast	Forecast period (e.g. start of forecast to +48-240 hours)*

Configuration data

The configuration of the error modelling module determines how the statistical model of the error is established and how this model is applied to derive the updated series.

Configuration items

Order_AR: (maximum) order of the AR component;
Order_MA: 0;
Order_Sel: Option to determine if the orders are to be derived automatically (with the maxima as defined above) or as given;
Transform: Option to apply a transformation to the residuals. This may be "none", "mean" or "boxcox";
Lambda: A required parameter for the "boxcox" transformation option.
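
The constraints between these items can be made explicit with a small validation sketch. The field names mirror the items listed above, but the class `ErrorModelConfig`, the Python types and the boolean encoding of Order_Sel are assumptions made for this example. It is not the module's configuration file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorModelConfig:
    """Hypothetical in-memory view of the configuration items."""

    order_ar: int                # Order_AR: (maximum) order of the AR component
    order_sel: bool              # Order_Sel: True = derive order automatically (assumed encoding)
    transform: str               # Transform: "none", "mean" or "boxcox"
    lam: Optional[float] = None  # Lambda: only needed for "boxcox"
    order_ma: int = 0            # Order_MA: listed as 0

    def __post_init__(self):
        if self.order_ar < 0:
            raise ValueError("Order_AR must be non-negative")
        if self.order_ma != 0:
            raise ValueError("Order_MA must be 0")
        if self.transform not in ("none", "mean", "boxcox"):
            raise ValueError("Transform must be 'none', 'mean' or 'boxcox'")
        if self.transform == "boxcox" and self.lam is None:
            raise ValueError("Lambda is required when Transform is 'boxcox'")
```

For example, `ErrorModelConfig(order_ar=5, order_sel=True, transform="boxcox", lam=0.3)` passes the checks, while omitting `lam` with the same transformation raises a `ValueError`.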

Conditions and Assumptions

A short summary of the paper by Broersen (2002) is given below. The algorithms were taken from ARMASA, a Matlab toolbox (Broersen, online), and are implemented in the Delft-FEWS AR module.

Time series definitions

Three types of time series models can be distinguished: autoregressive (AR), moving average (MA) and the combined ARMA type. An ARMA(p,q) process can be written as (Priestley, 1981)

$$x_n + a_1 x_{n-1} + \dots + a_p x_{n-p} = \varepsilon_n + b_1 \varepsilon_{n-1} + \dots + b_q \varepsilon_{n-q}$$

where $\varepsilon_n$ is a purely random process, that is, a sequence of independent identically distributed stochastic variables with zero mean and variance $\sigma_\varepsilon^2$. This process is purely AR for q=0 and MA for p=0. Any stationary stochastic process can be written as a unique AR(∞) or MA(∞) process. The roots of

$$A(z) = 1 + a_1 z^{-1} + \dots + a_p z^{-p}$$

are denoted as the poles of the ARMA(p,q) process, and the roots of

$$B(z) = 1 + b_1 z^{-1} + \dots + b_q z^{-q}$$

are the zeros. Processes and models are called stationary if all poles are strictly within the unit circle, and they are invertible if all zeros are within the unit circle.
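
The stationarity and invertibility conditions can be checked numerically. The sketch below assumes the polynomials are written in powers of z^-1 with leading coefficient 1, so that A(z) = 1 + a_1 z^-1 + ... + a_p z^-p is passed as the list [1, a_1, ..., a_p]. The function names are chosen for this example only.

```python
import numpy as np


def roots_in_z(coeffs):
    """Roots in z of 1 + c_1 z^-1 + ... + c_k z^-k, given [1, c_1, ..., c_k].

    Multiplying by z^k gives z^k + c_1 z^(k-1) + ... + c_k, which is the
    coefficient order that numpy.roots expects.
    """
    return np.roots(np.asarray(coeffs, dtype=float))


def is_stationary(a):
    """True if all poles (roots of A) lie strictly inside the unit circle."""
    return bool(np.all(np.abs(roots_in_z(a)) < 1.0))


def is_invertible(b):
    """True if all zeros (roots of B) lie inside the unit circle."""
    return bool(np.all(np.abs(roots_in_z(b)) < 1.0))
```

With this convention the AR(1) process x_n - 0.5 x_{n-1} = e_n has a single pole at z = 0.5 and is stationary, whereas x_n - 1.2 x_{n-1} = e_n has its pole at z = 1.2 and is not.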

AR estimation

This model type is the backbone of time series analysis in practice. Burg's method, also denoted as maximum entropy, estimates the reflection coefficients (Burg, 1967; Kay and Marple, 1981), thus making sure that the model will be stationary, with all roots of A(z) within the unit circle. Asymptotic AR order selection criteria can give wrong orders if candidate orders are higher than 0.1N (N is the signal length). The finite sample criterion CIC(p) is therefore used for model selection (see Broersen, 2000): the model with the smallest value of CIC(p) is selected. CIC uses a compromise between the finite sample estimator for the Kullback-Leibler information (Broersen and Wensink, 1998) and the optimal asymptotic penalty factor 3 (Broersen, 2000; Broersen and Wensink, 1996).
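
A minimal sketch of Burg estimation followed by order selection is given below. It is not the ARMASA or Delft-FEWS code. The exact form of CIC is not given in this text, so the sketch assumes CIC(p) = ln RES(p) + max(prod_{i=0..p} (1+v_i)/(1-v_i) - 1, 3 sum_{i=0..p} v_i) with v_i = 1/(N+1-i) for Burg estimates; this form should be checked against Broersen (2000) before it is relied upon. The sign convention x_n + a_1 x_{n-1} + ... + a_p x_{n-p} = e_n and the function names are also assumptions of the example. The input is expected to be zero-mean and max_order must be smaller than N-1.

```python
import numpy as np


def burg(x, max_order):
    """Burg's method: return AR coefficient vectors and residual variances per order.

    models[p] is [1, a_1, ..., a_p]; res[p] is the residual variance at order p.
    """
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()    # forward prediction errors
    b = x[:-1].copy()   # backward prediction errors, delayed by one sample
    a = np.array([1.0])
    models = [a]
    res = [float(np.mean(x ** 2))]
    for _ in range(max_order):
        # Reflection coefficient; |k| <= 1 by construction.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # Levinson-Durbin update of the coefficient vector.
        a = np.concatenate((a, [0.0])) + k * np.concatenate(([0.0], a[::-1]))
        # Update the error sequences and drop the sample that falls off the edge.
        f, b = (f + k * b)[1:], (b + k * f)[:-1]
        models.append(a)
        res.append(res[-1] * (1.0 - k * k))
    return models, res


def select_order_cic(x, max_order):
    """Pick the order with the smallest CIC (assumed form, see the text above)."""
    n = len(x)
    models, res = burg(x, max_order)
    v = 1.0 / (n + 1.0 - np.arange(max_order + 1))   # assumed Burg coefficients v_i
    finite_sample = np.cumprod((1.0 + v) / (1.0 - v)) - 1.0
    asymptotic = 3.0 * np.cumsum(v)                  # penalty factor 3
    cic = np.log(res) + np.maximum(finite_sample, asymptotic)
    p = int(np.argmin(cic))
    return p, models[p], cic
```

Because each reflection coefficient has magnitude below one for non-degenerate data, every candidate model returned by `burg` has its poles inside the unit circle, which is the property the paragraph above refers to.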

Box-Cox transformations

The Box-Cox transformation (Box and Cox, 1964) can be applied in the order selection and estimation of the coefficients. The aim is usually to make the residuals more homoskedastic and closer to a normal distribution:

$$T(y) = \frac{y^{\lambda} - 1}{\lambda}$$

for λ not equal to zero; when λ = 0, T(y) = log(y).

Application of the Module

The implemented algorithm computes AR(p) models with p = 0, 1, ..., N/2 and selects a single best AR model with CIC. Alternatively, the user can provide the order to be used. Usually the mean of the signal is subtracted before the model and its coefficients are estimated, but this option can be switched off. It is recommended to subtract the mean. Figure 1 shows an example of applying the implemented error module to the Moesel river basin at Cochem, Germany.

Figure 1. Application of the AR module with subtraction of the mean to the Moesel basin at Cochem, Germany. Blue is the measured discharge (Q), red is the updated model series (update and forecast), green is the model simulation. The forecast starts at t=401 hours.

Optionally, the Box-Cox transformation can be used. During the update (historic) period the algorithm provides an updated model series. During the forecast, the selected model and coefficients are used to predict the model error, and the predicted error is added to the model forecast to obtain an updated model forecast.
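
The forecast step can be sketched as a recursive AR prediction of the error. The sketch assumes the error is defined as observed minus simulated, that the AR model is written as e_n + a_1 e_{n-1} + ... + a_p e_{n-p} = eps_n with coefficients passed as [1, a_1, ..., a_p], and that future noise terms are replaced by their zero expectation. The function names are hypothetical, and transformations and missing observations are left out.

```python
import numpy as np


def predict_errors(a, errors, steps, subtract_mean=True):
    """Predict the next `steps` model errors from the historic errors."""
    a = np.asarray(a, dtype=float)
    errors = np.asarray(errors, dtype=float)
    p = len(a) - 1
    if len(errors) < max(p, 1):
        raise ValueError("need at least p historic errors")
    mean = errors.mean() if subtract_mean else 0.0
    hist = list(errors - mean)
    predicted = []
    for _ in range(steps):
        # Expected value of the next error given the past (future noise = 0).
        nxt = -sum(a[i] * hist[-i] for i in range(1, p + 1))
        hist.append(nxt)                # predictions feed later steps
        predicted.append(nxt + mean)    # add the mean back
    return np.array(predicted)


def updated_forecast(sim_forecast, observed_hist, simulated_hist, a, subtract_mean=True):
    """Add the predicted error to the simulated forecast."""
    errors = np.asarray(observed_hist, dtype=float) - np.asarray(simulated_hist, dtype=float)
    sim_forecast = np.asarray(sim_forecast, dtype=float)
    return sim_forecast + predict_errors(a, errors, len(sim_forecast), subtract_mean)
```

For a stationary model the predicted error decays towards the mean error as the lead time grows, so the correction is strongest at the start of the forecast.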

Figure 2. Application of the AR module to the Moesel basin using the Box-Cox transformation and subtraction of the mean. Blue is the measured discharge (Q), red is the updated model series (update and forecast), green is the model simulation. The forecast starts at t=401 hours.

Figure 2 shows the effect of additionally applying a Box-Cox transformation (λ=0.3). It gives slightly better predictions than the run without the transformation (Figure 1).

Figure 3. Application of the AR module to the Moesel basin using subtraction of the mean. Blue is the measured discharge (Q), red is the updated model series (update and forecast), green is the model simulation. The forecast starts at t=250 hours.

Figure 4. Application of the AR module to the Moesel basin using subtraction of the mean. Blue is the measured discharge (Q), red is the updated model series (update and forecast), green is the model simulation. The forecast starts at t=500 hours.

Figures 3 and 4 show two applications of the algorithm (forecast starting at t=250 hours and at t=500 hours) with subtraction of the mean but without the Box-Cox transformation.

References

Box, G.E.P. and D.R. Cox, 1964. An analysis of transformations. J. Royal Statistical Soc. (series B), vol 26, pp 211-252.

Broersen, P.M.T. and A.H. Weerts, 2005. Automatic error correction of rainfall-runoff models in flood forecasting systems. Proc. IEEE Instrumentation and Measurement Technology Conference (IMTC 2005), 16-19 May 2005, vol 2, pp 963-968.

Broersen, P.M.T., 2000. Finite sample criteria for Autoregressive order selection. IEEE Trans. Signal Processing, vol 48, pp 3550-3558.

Broersen, P.M.T., 2002. Automatic spectral analysis with time series models. IEEE Trans. Instrum. Meas., vol 51, pp 211-216.

Broersen, P.M.T. Matlab toolbox ARMASA (online) Available: http://www.tn.tudelft.nl/mmr.

Broersen, P.M.T. and H.E. Wensink, 1996. On the penalty factor for autoregressive order selection in finite samples. IEEE Trans. Signal Processing, vol 44, pp 748-752.

Broersen, P.M.T. and H.E. Wensink, 1998. Autoregressive model order selection by a finite sample estimator for the Kullback-Leibler discrepancy. IEEE Trans. Signal Processing, vol 46, pp 2058-2061.

Burg, J.P., 1967. Maximum entropy spectral analysis. Proc. 37th Meeting Soc. Exploration Geophys., Oklahoma City, OK, pp 1-6.

Kay, S.M. and S.L. Marple, 1981. Spectrum analysis-A modern perspective. Proc IEEE, vol 69, pp 1380-1419.

Madsen, H., M.B. Butts, S.T. Khu and S.Y. Liong, 2000. Data assimilation in rainfall-runoff forecasting. Hydroinformatics 2000, 4th Inter. Conference on Hydroinformatics, Cedar Rapids, Iowa, USA, 23-27 July 2000, 9p.

Priestley, M.B., 1981. Spectral analysis and time series. New York: Academic Press.
