06 Time series analysis

Time series analysis

General

Time series analysis comprises the following types of analysis:

correlation analysis
spectral analysis
range analysis
run analysis
storage analysis
Confidence Interval of Mean

The analyses are described in the following sections.

Important:
The time series under investigation must not contain missing values.

Correlation analysis

Correlation analysis covers the computation of:

auto-covariance function
auto-correlation function
cross-covariance function
cross-correlation function

HYMOS produces tabular and graphical presentations of these functions.

Auto-covariance and auto-correlation functions

For the series x_i, i = 1, ..., N, the auto-covariance function c_xx(k), k = 0, ..., L_max, is computed as follows:

c_xx(k) = (1/N) · Σ_{i=1}^{N-k} (x_i - m_x)(x_{i+k} - m_x)

where:
m_x = average of x_i, i = 1, ..., N
k = time lag, in time units equal to the time interval
L_max = maximum lag

The auto-correlation function r_xx(k) is determined from:

r_xx (k) = c_xx (k)/c_xx (0)
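
The two estimators can be sketched in a few lines of Python. This is a minimal illustration, not HYMOS code: the function names are hypothetical, and the auto-covariance is assumed to be the usual lagged product sum with divisor N, as stated in the notes of this section.

```python
def autocovariance(x, max_lag):
    """Auto-covariance c_xx(k) for k = 0..max_lag, using divisor N (not N - k)."""
    n = len(x)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < N")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def autocorrelation(x, max_lag):
    """Auto-correlation r_xx(k) = c_xx(k) / c_xx(0)."""
    c = autocovariance(x, max_lag)
    if c[0] == 0:
        raise ValueError("series has zero variance")
    return [ck / c[0] for ck in c]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(autocovariance(series, 2))
print(autocorrelation(series, 2))
```

Because the divisor stays N while the number of products shrinks with the lag, the estimates are pulled towards zero at larger lags.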

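The 95% tolerance or confidence limits for zero correlation according to Siddiqui (see Yevjevich, 1972) are computed from:

CL_p(k) = -1/n_k + 1.96 · sqrt(var_k)
CL_n(k) = -1/n_k - 1.96 · sqrt(var_k)
var_k = (n_k^3 - 3·n_k^2 + 4) / (n_k^2 · (n_k^2 - 1)), with n_k = N - k + 1

where:
CL_p(k) = upper confidence limit for zero correlation at lag k
CL_n(k) = lower confidence limit for zero correlation at lag k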

notes:

The largest time lag in the correlogram should be less than N/2; in HYMOS the maximum lag is limited to 100.
In the estimator for c_xx(k), the divisor is N rather than N-k, since the former has a smaller mean square error; when interpreting the correlogram, keep in mind that this choice increasingly damps the estimate as the time lag grows.
The user can select the tolerance or confidence limits by entering a percentage.
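
The limits for zero correlation can be sketched as follows. The formula used here is an assumption: it takes Siddiqui's mean and variance of the lag-one correlation coefficient of an independent series and applies them with N - k + 1 in place of N. It is included because it reproduces the CLP and CLN columns of the example output in this section for N = 240; the function name and the `z` parameter are hypothetical.

```python
import math


def zero_correlation_limits(n, lag, z=1.96):
    """Upper and lower limits for zero correlation at a given lag.

    Assumed form: mean -1/m and variance (m^3 - 3m^2 + 4) / (m^2 (m^2 - 1)),
    with m = n - lag + 1; z = 1.96 corresponds to 95 % limits.
    """
    m = n - lag + 1
    if m < 2:
        raise ValueError("lag too large for the series length")
    expected = -1.0 / m
    variance = (m**3 - 3 * m**2 + 4) / (m**2 * (m**2 - 1))
    half_width = z * math.sqrt(variance)
    return expected + half_width, expected - half_width


# 20 years of monthly values, as in the example output (N = 240)
for lag in range(3):
    upper, lower = zero_correlation_limits(240, lag)
    print(lag, round(upper, 4), round(lower, 4))
```

The limits are not symmetric about zero because the expected value of the estimated correlation of an independent series is slightly negative.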

Example

The monthly rainfall series shown in the figure below serves as an example; its auto-correlogram is presented in the table and figure that follow.

Figure: Time series of monthly rainfall

Time series analysis
====================
Autocovariance and autocorrelation analysis
============================================
Series = 170203         PH
Date of first element                                    = 1967 1 0 0 1
Date of last element                                    = 1986 12 0 0 1

     COV      = autocovariance function
     COR      = autocorrelation function
     CLP       = upper conf. limit zero correlation (95 %)
     CLN       = lower conf. limit zero correlation (95 %)

     LAG           COV        COR        CLP        CLN
       0     .2086E+05     1.0000      .1213     -.1296
       1     .1251E+05      .5995      .1216     -.1299
       2     .6325E+04      .3032      .1218     -.1302
       3    -.5033E+03     -.0241      .1220     -.1304
       4    -.6600E+04     -.3163      .1223     -.1307
       5    -.1204E+05     -.5771      .1225     -.1310
     ...

Table: Example of output: auto-correlogram of monthly rainfall

Figure: Auto-correlogram of monthly rainfall

Cross-covariance and cross-correlation functions

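The cross-covariance functions c_xy(k) and c_yx(k), k = 0, ..., L_max, are computed as follows:

c_xy(k) = (1/N) · Σ_{i=1}^{N-k} (x_i - m_x)(y_{i+k} - m_y)
c_yx(k) = (1/N) · Σ_{i=1}^{N-k} (y_i - m_y)(x_{i+k} - m_x)

where:
m_x = average of x_i, i = 1, ..., N
m_y = average of y_i, i = 1, ..., N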

The cross-correlation functions r_xy(k) and r_yx(k) are estimated from:

r_xy(k) = c_xy(k)/(s_x · s_y)
r_yx(k) = c_yx(k)/(s_x · s_y)

where:
s_x = standard deviation of x_i, i = 1, N
s_y = standard deviation of y_i, i = 1, N.

notes:

The largest time lag in the correlogram should be less than N/2; in HYMOS the maximum lag is limited to 100.
In the estimators for c_xy(k) and c_yx(k), the divisor is N rather than N-k, since the former has a smaller mean square error; when interpreting the correlogram, keep in mind that this choice increasingly damps the estimate as the time lag grows.
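
A minimal Python sketch of the cross-correlation estimate is given below. It is illustrative only: the function names are hypothetical, the lag is assumed to be applied to the second series, and the standard deviations are assumed to use divisor N so that a series correlated with itself gives exactly 1 at lag 0.

```python
import math


def cross_covariance(x, y, max_lag):
    """c_xy(k) = (1/N) * sum over i of (x_i - m_x)(y_{i+k} - m_y), k = 0..max_lag."""
    n = len(x)
    if len(y) != n:
        raise ValueError("both series must have the same length")
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < N")
    m_x = sum(x) / n
    m_y = sum(y) / n
    return [
        sum((x[i] - m_x) * (y[i + k] - m_y) for i in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def cross_correlation(x, y, max_lag):
    """Return (r_xy, r_yx), each a list over lags 0..max_lag."""
    s_x = math.sqrt(cross_covariance(x, x, 0)[0])  # divisor N (assumption)
    s_y = math.sqrt(cross_covariance(y, y, 0)[0])
    if s_x == 0 or s_y == 0:
        raise ValueError("a series has zero variance")
    r_xy = [c / (s_x * s_y) for c in cross_covariance(x, y, max_lag)]
    r_yx = [c / (s_x * s_y) for c in cross_covariance(y, x, max_lag)]
    return r_xy, r_yx


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
r_xy, r_yx = cross_correlation(x, y, 1)
print(r_xy, r_yx)
```

Computing both r_xy and r_yx covers positive and negative lags between the two series.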

Spectral analysis

The smoothed auto-spectral estimate C_xx(f), for f = 0, ..., 1/2, is calculated from:

C_xx(f) = 2 · [ c_xx(0) + 2 · Σ_{k=1}^{M-1} c_xx(k) · w(k) · cos(2πfk) ]

where:
f = frequency in cycles per time interval, computed at spacings 1/(2N_f), where N_f is 2 to 3 times M
N_f = number of frequency points
c_xx(k) = auto-covariance function at lag k
M = truncation point or maximum lag of the auto-covariance function used to estimate the auto-spectrum; M is conditioned by M ≤ L_max (see Section XII.2.2)
w(k) = window function

The following window w(k), k = 1, ..., M-1, according to Tukey is used to smooth the spectral estimate:

w(k) = (1/2) · [1 + cos(πk/M)]

The bandwidth B and number of degrees of freedom N_df are given by:

B = 4/(3M)
N_df = 8N/(3M)

The logarithm of the auto-spectrum is computed by:

C_log(f) = log_10 C_xx(f)

In the results C_log(f) will be set to -100 if C_xx(f) ≤ 0.
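
The steps above can be sketched in Python as follows. This is a hedged illustration rather than the HYMOS implementation: the function names are hypothetical, and the estimate is assumed to take the Jenkins and Watts form C_xx(f) = 2[c_xx(0) + 2 Σ c_xx(k) w(k) cos(2πfk)] with the Tukey window w(k) = 0.5[1 + cos(πk/M)], which is consistent with the bandwidth and degrees of freedom quoted in this section.

```python
import math


def tukey_window(k, m):
    """Tukey lag window w(k) = 0.5 * (1 + cos(pi * k / M))."""
    return 0.5 * (1.0 + math.cos(math.pi * k / m))


def smoothed_spectrum(c, m, n_f):
    """Smoothed auto-spectrum at f = j / (2 * n_f), j = 0..n_f.

    c   : auto-covariances c[0], c[1], ... (at least m values)
    m   : truncation point M
    n_f : number of frequency intervals between 0 and 1/2
    """
    if m < 1 or len(c) < m or n_f < 1:
        raise ValueError("need m >= 1, n_f >= 1 and at least m auto-covariances")
    result = []
    for j in range(n_f + 1):
        f = j / (2.0 * n_f)
        total = c[0]
        for k in range(1, m):
            total += 2.0 * c[k] * tukey_window(k, m) * math.cos(2.0 * math.pi * f * k)
        result.append((f, 2.0 * total))
    return result


def log_spectrum(value):
    """log10 of the spectrum; -100 flags non-positive estimates."""
    return math.log10(value) if value > 0 else -100.0


def bandwidth(m):
    return 4.0 / (3.0 * m)


def degrees_of_freedom(n, m):
    return 8.0 * n / (3.0 * m)


# An uncorrelated series (c(k) = 0 for k > 0) gives a flat spectrum.
flat = smoothed_spectrum([1.0, 0.0, 0.0, 0.0], m=4, n_f=8)
print(flat[0], flat[-1], bandwidth(12), degrees_of_freedom(240, 12))
```

A larger M narrows the bandwidth and resolves finer detail, at the cost of fewer degrees of freedom and therefore a noisier estimate.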

The spectral density function R_xx(f) follows from:

R_xx(f) = C_xx(f)/c_xx(0)

The tolerance or confidence limits for white noise are computed from:

where:
CL_p = upper confidence limit for white noise
CL_n = lower confidence limit for white noise
a = tolerance or confidence interval, default 95%

notes:

Truncation point M: An important aspect of the estimation of the spectrum is the choice of the bandwidth B = 1.33/M, which implies the choice of the truncation point of the covariance function used to compute the spectrum. Suppose it is required to detect details of width w in the spectrum. Then the truncation point M should be chosen so that the bandwidth B is less than w, i.e. B < w, or M > 1.33/w.
Frequency points N_f: It has been suggested that C_xx(f) and R_xx(f) should only be computed at values of f corresponding to f = 0, 1/N_f, 2/N_f, ..., 1/2. However, Jenkins and Watts (1968) indicate that this spacing is too wide, and recommend that C_xx(f) and R_xx(f) be evaluated at some fraction of this spacing so that a more detailed plot is obtained: N_f = 2 to 3 times M.
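
As a small worked example of the two notes above, the sketch below picks the smallest truncation point for a required detail width and a matching number of frequency points. The function name, the detail width of 0.05 cycles per time interval, and the factor 2 for N_f are illustrative assumptions.

```python
import math


def min_truncation_point(detail_width):
    """Smallest integer M whose bandwidth 4/(3M) is strictly below detail_width."""
    if detail_width <= 0:
        raise ValueError("detail_width must be positive")
    return math.floor(4.0 / (3.0 * detail_width)) + 1


w = 0.05                      # detail width to be resolved (assumed value)
m = min_truncation_point(w)   # M > 1.33 / w
n_f = 2 * m                   # N_f = 2 to 3 times M; factor 2 chosen here
print(m, 4.0 / (3.0 * m), n_f, 1.0 / (2.0 * n_f))
```

For w = 0.05 this gives M = 27, a bandwidth of about 0.049, and a frequency spacing of 1/108.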

Example

The figure below shows, as an example, the spectral density function of monthly rainfall.

Figure: Example of output: spectral density function of monthly rainfall

Range analysis

The figure below gives a definition sketch of the following range-related quantities:

adjusted surplus _aS_N^+
adjusted deficit _aS_N^-
adjusted range _aR_N
rescaled adjusted range _aR_N^*

Figure: Definition sketch of range quantities

The quantities are computed from the accumulated departures from the mean, S_i, for i = 0, ..., N, with S_0 = 0:

S_i = S_{i-1} + (x_i - m_x) · c_f

where:
m_x = average of x_i, i = 1, ..., N
c_f = conversion factor (time units per time interval) to convert intensities into volumes

It follows for:

• Surplus: _aS_N^+ = max(S_0, S_1, ..., S_N)
• Deficit: _aS_N^- = min(S_0, S_1, ..., S_N)
• Adjusted range: _aR_N = _aS_N^+ - _aS_N^-
• Rescaled adjusted range: _aR_N^* = _aR_N/(s_x · c_f)

where:
s_x = standard deviation of x_i, i = 1, ..., N
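
The range quantities can be sketched in Python as shown below. This is an illustration under stated assumptions: the function name and the dictionary keys are hypothetical, the accumulated departures are taken as S_i = S_{i-1} + (x_i - m_x) c_f with S_0 = 0, and the standard deviation is computed with divisor N, which the text does not specify.

```python
import math


def range_analysis(x, c_f=1.0):
    """Adjusted surplus, deficit, range and rescaled range of a series."""
    n = len(x)
    mean = sum(x) / n
    s = [0.0]                                  # S_0 = 0
    for v in x:
        s.append(s[-1] + (v - mean) * c_f)     # accumulated departures from the mean
    surplus = max(s)
    deficit = min(s)
    adjusted_range = surplus - deficit
    s_x = math.sqrt(sum((v - mean) ** 2 for v in x) / n)   # divisor N (assumption)
    if s_x == 0:
        raise ValueError("series has zero variance")
    return {
        "surplus": surplus,
        "deficit": deficit,
        "range": adjusted_range,
        "rescaled_range": adjusted_range / (s_x * c_f),
    }


print(range_analysis([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Dividing by s_x · c_f makes the rescaled range dimensionless, so it can be compared between series with different units or variability.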

Run analysis

The figure below presents a definition sketch for run analysis.

Up- and downcrossing and runs

Let x_c be a crossing level; then an upcrossing is defined by:

x_{i+1} ≥ x_c and x_i < x_c

and a downcrossing by:

x_{i+1} < x_c and x_i ≥ x_c

A run is an excursion above or below the level x_c, i.e. bounded by an upcrossing and a downcrossing or by a downcrossing and an upcrossing. Note: HYMOS also interprets as runs the first and last excursions above or below level x_c, which are bounded by only one crossing (an upcrossing or a downcrossing); these runs are incomplete.

Figure: Definition sketch for run analysis

Runlength

With respect to runlength, the following distinction has to be made:

positive runlength RL^+
negative runlength RL^-
total runlength, i.e. the sum RL^+ + RL^- of a successive pair

RL^+ = the time span between an upcrossing and the next downcrossing, given as a number of time intervals
RL^- = the time span between a downcrossing and the next upcrossing, given as a number of time intervals

Runsum

The positive and negative runsums RS^+ and RS^-, respectively, are computed from:

where, for RS^+:
j = location of an upcrossing
k = location of the next downcrossing
c_f = conversion factor (= time units per time interval) to convert intensities into volumes

where, for RS^-:
k = location of the downcrossing
m = location of the next upcrossing
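
A minimal Python sketch of run analysis is shown below. It rests on assumptions that go beyond the text: values equal to x_c count as being above the level (following the ≥ in the crossing definitions), the run sum is taken as the sum of departures from x_c over the run multiplied by c_f, and negative run sums keep their negative sign. The function name and dictionary keys are hypothetical.

```python
from itertools import groupby


def run_analysis(x, x_c, c_f=1.0):
    """Split a series into runs above (+) and below (-) the crossing level x_c."""
    runs = []
    start = 0
    # groupby collects consecutive values that lie on the same side of x_c
    for above, group in groupby(x, key=lambda v: v >= x_c):
        values = list(group)
        runs.append({
            "sign": "+" if above else "-",
            "start": start,                      # index of the first value in the run
            "length": len(values),               # run length in time intervals
            "sum": sum(v - x_c for v in values) * c_f,
            "complete": True,
        })
        start += len(values)
    if runs:
        # the first and last runs are bounded by one crossing only
        runs[0]["complete"] = False
        runs[-1]["complete"] = False
    return runs


for run in run_analysis([1, 3, 4, 2, 1, 5, 6, 0], x_c=2.5):
    print(run)
```

Filtering on the `complete` flag allows the incomplete first and last runs to be excluded from run length and run sum statistics.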

Storage Analysis

Water shortages or, equivalently, the storage required to avoid running dry are computed for various draft levels from the reservoir. The procedure is a computerised variant of the well-known graphical Rippl technique. The algorithm considers the following sequence of storages:

S_i = S_{i-1} + (x_i - D_x) · c_f,   i = 1, ..., 2N;   S_0 = 0

where:
x_i = inflow
D_x = D_L · m_x
m_x = average of x_i, i = 1, ..., N
D_L = draft level as a fraction of m_x
c_f = multiplier to convert intensities into volumes (time units per time interval)

A local maximum of S_i that is larger than the preceding maximum is sought. Let the locations of these two maxima be k2 and k1 respectively, with k2 > k1. The largest non-negative difference between S_k1 and S_i, i = k1, ..., k2, is then determined; this is the local range. The procedure is executed over twice the actual series, i.e. the series x_i is used twice in sequence: x_{N+i} = x_i. In this way initial effects are eliminated.
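
The procedure can be sketched in Python as follows. This is a simplified illustration, not the HYMOS algorithm: the function name is hypothetical, the required storage is assumed to be the largest local range, and a drawdown that is still open at the end of the doubled series is also counted as a local range.

```python
def storage_requirement(x, draft_level, c_f=1.0):
    """Return (required storage, list of local ranges) for a draft level D_L."""
    n = len(x)
    mean = sum(x) / n
    draft = draft_level * mean          # D_x = D_L * m_x
    s = 0.0                             # S_0 = 0
    peak = 0.0                          # current maximum of S_i
    trough = 0.0                        # lowest S_i since that maximum
    local_ranges = []
    for v in list(x) + list(x):         # series used twice: x_{N+i} = x_i
        s += (v - draft) * c_f
        if s > peak:
            # a higher maximum closes the drawdown that followed the old one
            if peak - trough > 0:
                local_ranges.append(peak - trough)
            peak = trough = s
        elif s < trough:
            trough = s
    if peak - trough > 0:               # drawdown not closed by a higher maximum
        local_ranges.append(peak - trough)
    return max(local_ranges, default=0.0), local_ranges


inflow = [4.0, 1.0, 1.0, 4.0, 5.0, 3.0]
for level in (0.5, 1.0):
    print(level, storage_requirement(inflow, level))
```

With a draft of half the mean inflow the two low-flow intervals draw down 1.0 volume unit; with a draft equal to the mean the same dry spell requires 4.0.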

References

Yevjevich, V.: Stochastic Processes in Hydrology, Water Resources Publications, Fort Collins, 1972
Jenkins, G.M. and D.G. Watts: Spectral Analysis and Its Applications, Holden-Day, San Francisco, 1968

Confidence Interval of Mean

After calculating the mean it is useful to estimate the confidence range of the mean. The bounds of this confidence interval are the confidence limits of the mean. The width of the interval increases when:

more confidence is requested, e.g. 99% (α = 0.01) instead of 95% (α = 0.05)
the standard deviation is higher
the sample size is smaller
serial correlation is present

Calculation

x_ucl = x̄ + t_{α, n-1} · s/√n
x_lcl = x̄ - t_{α, n-1} · s/√n

where:
x_ucl = upper confidence limit of the average for a 1-α confidence level
x_lcl = lower confidence limit of the average for a 1-α confidence level
x̄ = average of sample (calculated)
α = error level
t_{α, n-1} = critical value of Student's t-distribution for a confidence level of 1-α
s = standard deviation of sample (calculated)
√n = square root of the number of measurements in the sample
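
A minimal sketch of the confidence limits of the mean is given below. It assumes the usual form mean ± t · s/√n, with s computed with divisor n - 1. The critical value is passed in by the caller (for example from a t-table), because the Python standard library has no Student's t-distribution; the function name and the parameter `t_crit` are hypothetical.

```python
import math


def mean_confidence_limits(x, t_crit):
    """Return (lower, upper) confidence limits of the sample mean.

    t_crit: critical value of Student's t for n - 1 degrees of freedom
            at the chosen confidence level, supplied by the caller.
    """
    n = len(x)
    if n < 2:
        raise ValueError("at least two measurements are required")
    mean = sum(x) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))  # divisor n - 1 (assumption)
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Two-sided 95 % level with n - 1 = 4 degrees of freedom: t = 2.776 (from a t-table)
print(mean_confidence_limits([1.0, 2.0, 3.0, 4.0, 5.0], t_crit=2.776))
```

The half-width grows with t_crit and s and shrinks with √n, which matches the list of factors that widen the interval.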

Correction for effective measurement

HYMOS offers the option of using the effective number of measurements (n*) instead of the total number of measurements in the sample (n). The effective number of measurements decreases as the serial correlation in the data series increases, according to:

where:
n = number of measurements in sample
n* = number of effective (or independent) measurements in sample
r_k = auto-correlation coefficient at lag k

Auto-correlation

r_k = c_k/c_0
c_k = (1/n) · Σ_{i=1}^{n-k} (x_i - m_x)(x_{i+k} - m_x)

where:
c_k = auto-covariance at lag k
c_0 = auto-covariance at lag 0, i.e. the variance
r_k = auto-correlation coefficient at lag k
n = number of measurements in sample
x_i, x_{i+k} = measurements at times i and i+k
m_x = average of the measurements
