
...

For a data vector X_i (i = 1, ..., N), the basic statistics are determined as follows:

minimum: X_min = min(X₁, X₂, X₃, ..., X_N )
maximum: X_max = max(X₁, X₂, X₃, ..., X_N )
arithmetic mean: X_mean = (X₁ + X₂ + X₃ + ... + X_N) / N
median: middle value of the ranked series X_i
mode: the value of X which occurs with the greatest frequency, i.e. the middle value of the class with the greatest frequency. If several classes share the greatest frequency, the middle value of the class with the lowest class levels is reported as the mode.
standard deviation:
!05 - Time Series Analysis^image005.gif!
variance: s_x², the square of the standard deviation s_x
skewness:
!05 - Time Series Analysis^image007.gif!
kurtosis:
!05 - Time Series Analysis^image009.gif!
quantiles and deciles: the quantiles and deciles are computed with the function:
!05 - Time Series Analysis^image011.gif!
where k is the rank in the sorted data array.
If k is not an integer, X_k is interpolated between the two closest values.
empirical frequency distribution: graphical presentation of the number of data per class; the number of classes and the minimum and maximum class levels are entered by the user
empirical cumulative frequency distribution: cumulative representation of frequencies per class. The relative cumulative frequencies are computed by:
!05 - Time Series Analysis^image013.gif!
As can be seen, the relative cumulative frequency is plotted with the Chegodayev function
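The definitions above can be illustrated with a minimal Python sketch. It is not the HYMOS implementation. The function names are hypothetical, the variance is assumed to use N-1 in the denominator, the median of an even-length series is assumed to be the mean of the two middle values, and the interpolation for a non-integer rank k is assumed to be linear. The formula that yields k itself is given in the figure and is not reproduced here.

```python
import math


def basic_statistics(x):
    """Minimum, maximum, mean, median, standard deviation and variance."""
    n = len(x)
    ranked = sorted(x)
    mean = sum(x) / n
    # Median: middle value of the ranked series. For even N the mean of
    # the two middle values is used (an assumption of this sketch).
    mid = n // 2
    median = ranked[mid] if n % 2 else 0.5 * (ranked[mid - 1] + ranked[mid])
    # Assumption: unbiased variance with N - 1 in the denominator.
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)
    return {
        "min": ranked[0],
        "max": ranked[-1],
        "mean": mean,
        "median": median,
        "std": math.sqrt(variance),
        "variance": variance,
    }


def class_frequencies(x, n_classes, lower, upper):
    """Number of data per class for equal-width classes on [lower, upper]."""
    width = (upper - lower) / n_classes
    counts = [0] * n_classes
    for v in x:
        if lower <= v <= upper:
            # A value equal to the maximum class level goes in the last class.
            idx = min(int((v - lower) / width), n_classes - 1)
            counts[idx] += 1
    return counts


def class_mode(x, n_classes, lower, upper):
    """Middle value of the class with the greatest frequency."""
    counts = class_frequencies(x, n_classes, lower, upper)
    width = (upper - lower) / n_classes
    # list.index returns the first match, i.e. the class with the lowest
    # class levels when several classes share the greatest frequency.
    best = counts.index(max(counts))
    return lower + (best + 0.5) * width


def value_at_rank(ranked, k):
    """Value at 1-based rank k; linear interpolation if k is fractional."""
    lo = math.floor(k)
    hi = math.ceil(k)
    if lo == hi:
        return ranked[lo - 1]
    return ranked[lo - 1] + (k - lo) * (ranked[hi - 1] - ranked[lo - 1])
```

For example, `class_mode(x, 4, 0.0, 10.0)` divides the range 0 to 10 into four classes of width 2.5 and returns the middle value of the fullest class, and `value_at_rank(sorted(x), 6.5)` returns the value halfway between the 6th and 7th ranked values.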
For the mean and the variance, the 95% confidence intervals are also computed. The Student-t distribution is applied, and the percentage points t_{n, α/2} and t_{n, 1-α/2} are computed, where n = N-1 is the number of degrees of freedom. The confidence limits for the mean then read:
!05 - Time Series Analysis^image015.gif!
Given an estimate of the sample variance, the true variance σ_Y² will be contained within the following confidence interval with a probability of 100(1-α) %:
!05 - Time Series Analysis^image017.gif!
The values for χ²_{n, α/2} and χ²_{n, 1-α/2} are read from the tables of the Chi-square distribution for given α and n.
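As an illustration, the confidence limits for the mean and the variance can be sketched in Python as follows. The sketch assumes the standard textbook forms (mean ± t·s/√N for the mean, and n·s²/χ² for the variance, with n = N-1), which may differ in detail from the expressions in the figures. The function names are hypothetical, and the percentage points are passed in as arguments because they are read from tables.

```python
import math


def mean_confidence_limits(mean, std, n_obs, t_upper):
    """Two-sided limits for the mean.

    t_upper is the upper percentage point t_{n, 1-alpha/2} with
    n = n_obs - 1; the lower point is -t_upper because the Student-t
    distribution is symmetric.
    """
    half_width = t_upper * std / math.sqrt(n_obs)
    return mean - half_width, mean + half_width


def variance_confidence_limits(variance, n_obs, chi2_lower, chi2_upper):
    """Two-sided limits for the true variance.

    chi2_lower = chi2_{n, alpha/2} and chi2_upper = chi2_{n, 1-alpha/2},
    both for n = n_obs - 1 degrees of freedom.
    """
    n = n_obs - 1
    # Dividing by the larger percentage point gives the lower limit.
    return n * variance / chi2_upper, n * variance / chi2_lower


# Invented example: N = 10 observations, so n = 9 and alpha = 0.05.
# Table values: t_{9, 0.975} = 2.262, chi2_{9, 0.025} = 2.700,
# chi2_{9, 0.975} = 19.023.
mean_limits = mean_confidence_limits(50.0, 4.0, 10, 2.262)
variance_limits = variance_confidence_limits(16.0, 10, 2.700, 19.023)
```

Note that the interval for the variance is not symmetric around the sample variance, because the Chi-square distribution is skewed.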
The 95% confidence limits for the median, 25% quantile and 75% quantile are computed with:
!05 - Time Series Analysis^image019.gif!
Selection of data

Data for statistical analysis is read from the HYMOS database.

...

For use of the Peaks Over Threshold (POT) and Peaks Under Threshold (PUT) methods, actual values have to be selected. For these methods a threshold value and a horizon are entered. For the POT option all values below the threshold are excluded from the computation; for the PUT option all values above the threshold are excluded. The data used for further computation are all peaks between successive up-crossings and down-crossings, taking into account the given horizon. The default value of the horizon is 1 (no horizon).
!05 - Time Series Analysis^image021.jpg!
The picture shows four peaks at time steps t1 to t4, a given horizon and a threshold. First, the highest peak between each up-crossing and the following down-crossing is determined. Because the peak at t4 is higher than the peak at t3 within the same crossing period, the peak at t3 is not seen as a real peak. When the horizon is set to 1, the POT method returns three peaks for the selected period, namely t1, t2 and t4. When a horizon larger than 1 is entered, HYMOS skips all the lower peaks within the horizon period. In this example, peak t2 is skipped and the POT method returns peaks t1 and t4.
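The selection described above can be sketched in Python as follows. This is a minimal illustration, not the HYMOS algorithm: the function name is hypothetical, and the horizon rule is an assumption, namely that a peak is dropped when a higher peak lies fewer than `horizon` time steps away. The example series is invented to mirror the situation in the picture.

```python
def peaks_over_threshold(values, threshold, horizon=1):
    """Return (index, value) pairs of the selected peaks above the threshold."""
    peaks = []
    best = None  # highest value seen so far in the current crossing period
    for i, v in enumerate(values):
        if v > threshold:
            if best is None or v > best[1]:
                best = (i, v)
        elif best is not None:
            # Down-crossing: the crossing period ends, keep its highest peak.
            peaks.append(best)
            best = None
    if best is not None:
        # The series ends while still above the threshold.
        peaks.append(best)
    if horizon <= 1:
        return peaks
    # Assumed horizon rule: drop a peak when a higher peak lies fewer
    # than `horizon` time steps away from it.
    return [
        p for p in peaks
        if not any(q[1] > p[1] and abs(q[0] - p[0]) < horizon for q in peaks)
    ]


# Invented series: t1 at index 1, t2 at index 3, t3 at index 8 and
# t4 at index 10; t3 and t4 lie in the same crossing period.
series = [0.0, 5.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 4.0, 2.5, 6.0, 0.0]
all_peaks = peaks_over_threshold(series, threshold=2.0)
spaced_peaks = peaks_over_threshold(series, threshold=2.0, horizon=3)
```

With a horizon of 1 the sketch keeps t1, t2 and t4; with a horizon of 3 it also drops t2, which lies two time steps from the higher peak t1. Under the same assumptions, a PUT selection could be obtained by negating both the values and the threshold.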
