...
For a data vector Xi (i = 1, ..., N), the basic statistics are determined as follows:
- minimum: Xmin = min(X1, X2, X3, ..., XN )
- maximum: Xmax = max(X1, X2, X3, ..., XN )
- arithmetic mean: Xmean = (X1 + X2 + X3 + ... + XN) / N
- median: middle value of the ranked series Xi
- mode: value of X which occurs with the greatest frequency, i.e. the middle value of the class with the greatest frequency. If several classes share the greatest frequency, the middle value of the class with the lowest class levels is reported as the mode.
- standard deviation:
!05 - Time Series Analysis^image005.gif!
- variance = sx*sx
- skewness:
!05 - Time Series Analysis^image007.gif!
- kurtosis:
!05 - Time Series Analysis^image009.gif!
- quantiles and deciles: the quantiles and deciles are computed with the function:
!05 - Time Series Analysis^image011.gif!
where k = the rank in the sorted data array. If k is not an integer, Xk is interpolated between the two closest values.
- empirical frequency distribution: graphical presentation of the number of data per class; the number of classes and the minimum and maximum class levels are input
- empirical cumulative frequency distribution: cumulative representation of frequencies per class. The relative cumulative frequencies are computed by:
!05 - Time Series Analysis^image013.gif!
As can be seen, the relative cumulative frequencies are plotted using the Chegodayev plotting position.
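The class-based definitions above (frequency per class, mode as the middle of the fullest class, lowest class winning a tie) can be illustrated with a minimal Python sketch. This is not the HYMOS implementation: the function names are hypothetical, classes are assumed to have equal width, and a value on the upper class limit is assumed to belong to the last class. The Chegodayev position is written in its commonly quoted form (m - 0.3) / (N + 0.4); applying it to the cumulative class count is an assumption, since the formula above is only available as an image.

```python
def class_frequencies(data, class_min, class_max, n_classes):
    """Count the number of values per class (equal class widths assumed)."""
    width = (class_max - class_min) / n_classes
    counts = [0] * n_classes
    for x in data:
        if class_min <= x <= class_max:
            # Assumption: a value equal to class_max goes into the last class.
            k = min(int((x - class_min) / width), n_classes - 1)
            counts[k] += 1
    return counts


def class_mode(counts, class_min, class_max):
    """Middle value of the class with the greatest frequency."""
    width = (class_max - class_min) / len(counts)
    # list.index returns the first match, i.e. the lowest class on a tie.
    k = counts.index(max(counts))
    return class_min + (k + 0.5) * width


def chegodayev(cumulative_count, n):
    """Chegodayev plotting position in its commonly quoted form (assumed)."""
    return (cumulative_count - 0.3) / (n + 0.4)


data = [1.2, 2.5, 2.7, 3.1, 3.3, 4.8, 5.0, 7.9, 8.2, 8.4]
counts = class_frequencies(data, 0.0, 10.0, 5)   # [1, 4, 2, 1, 2]
mode = class_mode(counts, 0.0, 10.0)             # 3.0

# Relative cumulative frequency per class from the running class totals.
running, relative_cumulative = 0, []
for c in counts:
    running += c
    relative_cumulative.append(chegodayev(running, len(data)))
```

With five classes between 0 and 10, the second class (2 to 4) holds four values, so the mode is its middle value, 3.0.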
For the mean and the variance, the 95% confidence intervals are also computed. The Student-t distribution is applied and the percentage points t(ν, α/2) and t(ν, 1-α/2) are computed, where ν = N-1 is the number of degrees of freedom. The confidence limits for the mean then read:
!05 - Time Series Analysis^image015.gif!
Given an estimate of the sample variance, the true variance σY² will be contained within the following confidence interval with a probability of 100(1-α)%:
!05 - Time Series Analysis^image017.gif!
The values for χ²(ν, α/2) and χ²(ν, 1-α/2) are read from the tables of the Chi-square distribution for given α and ν.
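As an illustration of how the two intervals can be evaluated, the sketch below uses the textbook forms: mean ± t·s/√N for the mean, and (N-1)·s²/χ² for the variance limits. It is assumed, not confirmed, that the formulas shown above follow these forms and that the standard deviation uses the N-1 denominator. The function names are hypothetical, and the percentage points are passed in as arguments, as if read from tables.

```python
import math
import statistics


def mean_confidence_limits(data, t_upper):
    """Two-sided limits for the mean; t_upper is the upper Student-t
    percentage point for N-1 degrees of freedom."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)          # sample std. dev. (N-1 denominator)
    half_width = t_upper * s / math.sqrt(n)
    return mean - half_width, mean + half_width


def variance_confidence_limits(data, chi2_lower, chi2_upper):
    """Two-sided limits for the variance; chi2_lower and chi2_upper are the
    lower and upper Chi-square percentage points for N-1 degrees of freedom."""
    dof = len(data) - 1
    s2 = statistics.variance(data)      # sample variance (N-1 denominator)
    # The larger Chi-square value gives the lower limit, and vice versa.
    return dof * s2 / chi2_upper, dof * s2 / chi2_lower


data = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0, 15.0, 14.0, 13.0]

# Table values for 9 degrees of freedom and a 95% interval:
# t = 2.262, Chi-square = 2.700 (lower) and 19.023 (upper).
mean_lo, mean_hi = mean_confidence_limits(data, t_upper=2.262)
var_lo, var_hi = variance_confidence_limits(data, chi2_lower=2.700,
                                            chi2_upper=19.023)
```

For this sample the mean is 13.5 and the sample variance is 2.5, giving limits of about 12.37 to 14.63 for the mean and about 1.18 to 8.33 for the variance. The variance interval is not symmetric around the estimate because the Chi-square distribution is skewed.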
The 95% confidence limits for the median, 25% quantile and 75% quantile are computed with:
!05 - Time Series Analysis^image019.gif!
Selection of data
Data for statistical analysis are read from the HYMOS database.
...
To use the Peaks Over Threshold (POT) and Peaks Under Threshold (PUT) methods, actual values have to be selected. For these methods a threshold value and a horizon are entered. For the POT option all values below the threshold are excluded from the computation; for the PUT option all values above the threshold are excluded. The data used for further computation are all peaks between successive up-crossings and down-crossings, taking into account the given horizon. The default value of the horizon is 1 (no horizon).
!05 - Time Series Analysis^image021.jpg!
The picture shows four peaks at time steps t1 to t4, a given horizon and a threshold. First, the highest peak between each up-crossing and the following down-crossing is determined. Because the peak at t4 is higher than the peak at t3 within the same crossing period, t3 is not regarded as a real peak. When the horizon is set to 1, the POT method returns three peaks for the selected period, namely t1, t2 and t4. When a horizon larger than 1 is entered, HYMOS skips all the lower peaks within the horizon period. In this example, peak t2 is skipped and the POT method returns peaks t1 and t4.
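The selection described above can be sketched in Python as a two-step procedure. This is an illustration under stated assumptions, not the HYMOS code: the function name is hypothetical, the horizon is interpreted as a minimum spacing in time steps between retained peaks (so a horizon of 1 never removes anything), higher peaks take precedence over lower ones, and an excursion that is still above the threshold at the end of the series is assumed to count as a peak.

```python
def peaks_over_threshold(values, threshold, horizon=1):
    """Return the selected peaks as a list of (time_step, value) tuples."""
    # Step 1: keep the highest value between each up-crossing and the
    # following down-crossing of the threshold.
    candidates = []
    best = None
    for t, v in enumerate(values):
        if v > threshold:
            if best is None or v > best[1]:
                best = (t, v)
        elif best is not None:
            candidates.append(best)     # down-crossing closes the excursion
            best = None
    if best is not None:
        candidates.append(best)         # assumption: open excursion counts

    # Step 2: apply the horizon, highest peaks first. A peak is skipped if
    # a higher, already selected peak lies fewer than `horizon` steps away.
    selected = []
    for t, v in sorted(candidates, key=lambda p: p[1], reverse=True):
        if all(abs(t - ts) >= horizon for ts, _ in selected):
            selected.append((t, v))
    return sorted(selected)             # back in chronological order


# Series mimicking the picture: t1 = step 2, t2 = step 4,
# t3 = step 10 and t4 = step 12 (same excursion, t4 higher).
series = [1, 3, 8, 4, 6, 2, 3, 2, 1, 2, 7, 6, 9, 3, 1]

no_horizon = peaks_over_threshold(series, threshold=5)               # t1, t2, t4
with_horizon = peaks_over_threshold(series, threshold=5, horizon=3)  # t1, t4
```

Under the same assumptions, the PUT selection can be obtained by negating both the series and the threshold, calling the same function, and negating the returned values.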