...

For a data vector Xi (i = 1, ..., N), the basic statistics are determined as follows:

  • minimum: Xmin = min(X1, X2, X3, ..., XN )
  • maximum: Xmax = max(X1, X2, X3, ..., XN )
  • arithmetic mean:
  • median: middle value of the ranked series Xi
  • mode: value of X which occurs with the greatest frequency, i.e. the middle value of the class with the greatest frequency. If several classes share the greatest frequency, the middle value of the class with the lowest class levels is reported as the mode.
  • standard deviation:
    !05 - Time Series Analysis^image005.gif!
  • variance: sx * sx, the square of the standard deviation sx
  • skewness:
    !05 - Time Series Analysis^image007.gif!
  • kurtosis:
    !05 - Time Series Analysis^image009.gif!
  • quantiles and deciles: the quantiles and deciles are computed with the function:
    !05 - Time Series Analysis^image011.gif!
    where k = the rank in the sorted data array.
    If k is not an integer, Xk is interpolated between the two closest values.
  • empirical frequency distribution: graphical presentation of the number of data per class; the number of classes and the minimum and maximum class levels are entered as input
  • empirical cumulative frequency distribution: cumulative representation of the frequencies per class. The relative cumulative frequencies are computed by:
    !05 - Time Series Analysis^image013.gif!
    As can be seen, the relative cumulative frequency is plotted with the Chegodayev plotting-position function.
    For the mean and the variance, the 95% confidence intervals are also computed. The Student-t distribution is applied and the percentage points t(n, α/2) and t(n, 1-α/2) are computed, where n = N-1 is the number of degrees of freedom. The confidence limits for the mean then read:
    !05 - Time Series Analysis^image015.gif!
    Given an estimate of the sample variance, the true variance σY² will be contained within the following confidence interval with a probability of 100(1-α)%:
    !05 - Time Series Analysis^image017.gif!
    The values of χ²(n, α/2) and χ²(n, 1-α/2) are read from the tables of the chi-square distribution for the given α and n.
    The 95% confidence limits for the median, 25% quantile and 75% quantile are computed with:
    !05 - Time Series Analysis^image019.gif!
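The formulas in this section are only available as images, so the following Python sketch illustrates a subset of the statistics under stated assumptions rather than the exact HYMOS computation. It assumes the sample standard deviation uses N-1 in the denominator, that the Chegodayev function has its commonly quoted form (i - 0.3) / (N + 0.4), and that the confidence limits take the textbook form based on Student-t and chi-square percentage points read from tables. All function names are hypothetical. Skewness, kurtosis and quantiles are left out because their exact variants cannot be recovered from the text.

```python
import math
import statistics


def basic_statistics(x):
    """Minimum, maximum, mean, median, standard deviation and variance of x."""
    s = statistics.stdev(x)  # assumption: N-1 in the denominator
    return {
        "min": min(x),
        "max": max(x),
        "mean": statistics.mean(x),
        "median": statistics.median(x),
        "std": s,
        "variance": s * s,  # variance = sx * sx
    }


def chegodayev_positions(n_values):
    """Relative cumulative frequencies for ranks i = 1..N.

    Assumed form of the Chegodayev function: (i - 0.3) / (N + 0.4).
    """
    return [(i - 0.3) / (n_values + 0.4) for i in range(1, n_values + 1)]


def mean_confidence_limits(x, t_value):
    """Assumed limits: mean -/+ t * s / sqrt(N).

    t_value is the upper Student-t percentage point for N-1 degrees of
    freedom, taken from a table.
    """
    m = statistics.mean(x)
    half_width = t_value * statistics.stdev(x) / math.sqrt(len(x))
    return m - half_width, m + half_width


def variance_confidence_limits(x, chi2_lower, chi2_upper):
    """Assumed limits: (N-1) s^2 / chi2_upper and (N-1) s^2 / chi2_lower.

    chi2_lower and chi2_upper are the lower and upper chi-square
    percentage points for N-1 degrees of freedom, taken from a table.
    """
    dof = len(x) - 1
    var = statistics.variance(x)
    return dof * var / chi2_upper, dof * var / chi2_lower


# Illustrative data only; the table values are for 4 degrees of freedom
# and a 95% confidence level.
data = [2.0, 4.0, 4.0, 5.0, 10.0]
print(basic_statistics(data))
print(chegodayev_positions(len(data)))
print(mean_confidence_limits(data, t_value=2.776))
print(variance_confidence_limits(data, chi2_lower=0.484, chi2_upper=11.143))
```

Passing the percentage points in as arguments mirrors the table lookup described above and keeps the sketch free of third-party dependencies.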

    Selection of data

Data for statistical analysis are read from the HYMOS database.

...

For use of the Peaks Over Threshold (POT) and Peaks Under Threshold (PUT) methods, actual values have to be selected. For these methods a threshold value and a horizon are entered. For the POT option all values below the threshold are excluded from the computation; for the PUT option all values above the threshold are excluded. The data used for further computation are all peaks between successive up-crossings and down-crossings, taking into account the given horizon. The default value of the horizon is 1 (no horizon).
!05 - Time Series Analysis^image021.jpg!
The picture shows four peaks at time steps t1 to t4, a given horizon and a threshold. First, the highest peak between each up-crossing and the following down-crossing is determined. Because the peak at t4 is higher than the peak at t3 within the same crossing period, t3 is not treated as a real peak. When the horizon is set to 1, the POT method returns three peaks for the selected period, namely t1, t2 and t4. When a horizon larger than 1 is entered, HYMOS skips all lower peaks within the horizon period. In this example, peak t2 is skipped and the POT method returns peaks t1 and t4.
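The selection described above can be sketched in Python as follows. This is a minimal illustration, not the HYMOS implementation: the function names are hypothetical, values equal to the threshold are assumed to be excluded, and the horizon rule is assumed to mean that a peak is dropped when it lies fewer than `horizon` time steps from a higher peak that has already been kept. The sample series is invented to mimic the four-peak example.

```python
def peaks_over_threshold(values, threshold, horizon=1):
    """Return (index, value) pairs of the selected peaks, ordered in time."""
    # Step 1: keep the highest value of each run between an up-crossing
    # and the following down-crossing of the threshold.
    peaks = []
    best = None
    for i, v in enumerate(values):
        if v > threshold:  # assumption: values equal to the threshold are excluded
            if best is None or v > best[1]:
                best = (i, v)
        elif best is not None:
            peaks.append(best)
            best = None
    if best is not None:
        peaks.append(best)  # the series ends while still above the threshold

    # Step 2 (assumed horizon rule): visit peaks from highest to lowest and
    # drop any peak closer than `horizon` steps to a peak already kept.
    selected = []
    for idx, val in sorted(peaks, key=lambda p: p[1], reverse=True):
        if all(abs(idx - kept_idx) >= horizon for kept_idx, _ in selected):
            selected.append((idx, val))
    return sorted(selected)


def peaks_under_threshold(values, threshold, horizon=1):
    """PUT as the mirror image of POT: negate the series and the threshold."""
    mirrored = peaks_over_threshold([-v for v in values], -threshold, horizon)
    return [(i, -v) for i, v in mirrored]


# Peaks at index 1 (t1), 4 (t2), 10 (t3) and 12 (t4); t3 and t4 lie in the
# same crossing period because the series stays above the threshold between them.
series = [0, 5, 0, 0, 3, 0, 0, 0, 0, 0, 4, 2.5, 6, 0]
print(peaks_over_threshold(series, threshold=2))             # t1, t2 and t4
print(peaks_over_threshold(series, threshold=2, horizon=5))  # t1 and t4
```

With `horizon=1` two peaks from different crossing periods are always at least two time steps apart, so nothing is dropped, which matches the "no horizon" default.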