Series homogeneity tests


General

Depending on the type of analysis, a series must fulfil one or more of the following requirements:

  • stationarity: i.e. the properties or characteristics of the series do not vary with time;
  • homogeneity: i.e. all elements of a series belong to the same population;
  • randomness: i.e. series elements are independent.
    HYMOS includes the following statistical tests to investigate a series' stationarity, homogeneity or randomness:
    1. Median run test : a test for randomness by calculating the number of runs above and below the median;
    2. Turning point test : a test for randomness by calculating the number of turning points;
    3. Difference sign test : a test for randomness by calculating the number of positive and negative differences;
    4. Spearman rank correlation test : the Spearman rank correlation coefficient is computed to test:
  • the existence of correlation between two series,
  • the significance of serial rank correlation, and
  • the significance of a trend;
    5. Spearman rank trend test
    6. Arithmetic serial correlation coefficient: a test for serial correlation;
    7. Wilcoxon-Mann-Whitney U-test: a test to investigate whether two series are from the same population;
    8. Student t-test: a test on difference in the mean between two series;
    9. Wilcoxon W-test: a test on difference in the mean between two series;
    10. Linear trend test: a test on significance of linear trend by statistical inference on slope of trend line;
    11. Rescaled adjusted range test: a test for series homogeneity by the rescaled adjusted range.
    12. Shapiro-Wilk W test
    13. Rosner's Test for outliers
    14. Sign Test: a test on the positive difference in pairs between two series;
    15. Wilcoxon Rank Sum Test: a non-parametric test for independent data sets;
    16. Wilcoxon Signed Rank Test
    17. Seasonal Kendall Slope Estimator and Test
    18. Dixon's Test for outliers
    19. Lettenmaier Trend Test

    Notes
    1. The Spearman rank correlation test may be used as a single or two series test; in the single series mode it tests the significance of correlation with time (Spearman rank trend test).
    2. Tests 7, 8 and 9 (Wilcoxon-Mann-Whitney U-test, Student t-test and Wilcoxon W-test) are basically two-series tests; however, they can also be applied to a single series by means of the split-sample approach, in which the series is divided into two parts that are compared with each other.

    Series codes

    Series are selected by ticking their checkboxes. For a one-series test only one series can be selected; for a two-series test two series can be selected.

    Test Option

    Choose a statistical test from the list box. After pressing <Execute>, the result of the test is presented next to the test option.

    Hypothesis testing

    A statistical hypothesis is an assumption about the distribution of a statistical parameter. The assumption is stated in the null-hypothesis H0 and is tested against an alternative formulated in the hypothesis H1. For easy reference the parameter under investigation is usually presented in standardised form, called the test statistic. Under the null-hypothesis the test statistic has some standard sampling distribution, e.g. a standard normal or a Student t-distribution. The null-hypothesis is not rejected when the value of the test statistic lies within the acceptance region of that sampling distribution. If the test statistic does not lie in the acceptance region, the null-hypothesis is rejected and the alternative is assumed to be true. There is, however, some risk of making the wrong decision:
  • Type I error, i.e. rejecting H0 when it is true, (producer's risk), and
  • Type II error, i.e. accepting H0 when it is false, (consumer's risk).
    The probability of making a Type I error is equal to the significance level α of the test. When a test is performed at a 0.05 (5%) level of significance, there is a 5% chance that the null-hypothesis is rejected when it is in fact true. This probability corresponds to the critical region at the extreme end(s) of the sampling distribution under H0. Note, however, that the smaller the significance level is taken, the larger the risk of a Type II error becomes and the lower the discriminative power of the test.
    Depending on the type of alternative hypothesis H1, one- or two-sided tests are considered. This is explained by the following example. Let the correlation coefficient r be the test statistic in a test for randomness. The null-hypothesis reads H0: r = 0 against one of the following alternatives:
    1. H1: r > 0, i.e. a right-sided test
    2. H1: r < 0, i.e. a left-sided test
    3. H1: r ≠ 0, i.e. a two-sided test
    Let the tests be performed at a significance level α and let the theoretical value of r under the null-hypothesis be denoted by ρ, where ρ(p) is the value with non-exceedance probability p. Then H0 will not be rejected in:
    1. a right-sided test, if: r ≤ ρ(1-α)
    2. a left-sided test, if: r ≥ ρ(α)
    3. a two-sided test, if: ρ(α/2) ≤ r ≤ ρ(1-α/2)
    For a symmetrical distribution the last expression may be replaced by
    |r| ≤ ρ(1-α/2)
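    The three acceptance regions can be illustrated with a short Python sketch. It assumes a test statistic with a standard normal sampling distribution; the function names acceptance_region and is_rejected are hypothetical and are not part of HYMOS.

```python
from statistics import NormalDist

def acceptance_region(alpha, kind="two-sided"):
    """Bounds (lower, upper) of the acceptance region for a standard normal
    test statistic; None marks an unbounded side (assumed distribution)."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if kind == "right":
        return (None, z.inv_cdf(1 - alpha))
    if kind == "left":
        return (z.inv_cdf(alpha), None)
    if kind == "two-sided":
        return (z.inv_cdf(alpha / 2), z.inv_cdf(1 - alpha / 2))
    raise ValueError(f"unknown test type: {kind}")

def is_rejected(stat, alpha, kind="two-sided"):
    """True if the statistic falls outside the acceptance region."""
    lo, hi = acceptance_region(alpha, kind)
    return (lo is not None and stat < lo) or (hi is not None and stat > hi)
```

    For example, a statistic of 1.8 is rejected in a right-sided test at the 5% level but not in a two-sided test at the same level, because the two-sided critical region is split over both tails.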

    Median run test

    In the median run test the number of runs Nr of series Ai , (i=1,N) above and below the median is counted. A run is defined as an excursion above or below the median, denoted by Am :
  • a positive run is bounded by an upcrossing and downcrossing,
  • a negative run is bounded by a downcrossing and an upcrossing.
    Note that values equal to Am are deleted from the series. If the numbers of values above and below the median are denoted by m and n respectively (m = n does not necessarily hold, because of the possible deletions), the quantity Nr for a random series is asymptotically normally distributed, N(μr, σr), with:
    μr = 2mn/(m+n) + 1
    σr² = 2mn{2mn - (m+n)}/{(m+n)² (m+n-1)}
    The normal approximation holds for m and n > 20. For smaller values of m and n, Table VII-5.1 may be used to obtain critical values of Nr at a 5% significance level. The differences with the normal approximation are, however, small.
    HYMOS considers the following hypotheses:
  • H0: series Ai is random, and
  • H1: series Ai is not random, with no direction specified for the deviation from randomness; hence, a two-tailed test is performed
    and the absolute value of the following standardised test statistic is computed:
    |u| = |Nr + c - μr|/σr
    where: c = a continuity correction: c = 0.5 if Nr < μr, and c = -0.5 if Nr > μr
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test conditions:
    N ≥ 10
    N - (m+n) ≤ N/2
    Reference:
  • Kreyszig, E.: Introductory Mathematical Statistics, Principles and Methods. John Wiley & Sons, Inc, New York, 1970
  • Siegel, S. and N.J. Castellan: Non-parametric statistics for the behavioural sciences. McGraw-Hill Book Company, 2nd ed., 1988, pp. 58-64
    Note
    In Table VII-5.1, any observed value of Nr that is less than or equal to the smaller tabulated value, or greater than or equal to the larger tabulated value, is significant at the 0.05 significance level.
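    The computation of the median run test can be sketched in Python as follows. This is a minimal illustration of the normal approximation described above, not the HYMOS implementation; the function name median_run_test is hypothetical, and a continuity correction of zero is assumed when the number of runs equals its expected value.

```python
import math
from statistics import median

def median_run_test(series):
    """Return (runs, expected runs, variance, |u|) for the median run test."""
    med = median(series)
    # Values equal to the median are deleted from the series.
    above = [x > med for x in series if x != med]
    m = sum(above)               # number of values above the median
    n = len(above) - m           # number of values below the median
    # A new run starts whenever the series crosses the median.
    runs = 1 + sum(s != t for s, t in zip(above, above[1:]))
    mean_runs = 2 * m * n / (m + n) + 1
    var_runs = 2 * m * n * (2 * m * n - (m + n)) / ((m + n) ** 2 * (m + n - 1))
    # Continuity correction; zero when runs == mean_runs is an assumption.
    c = 0.5 if runs < mean_runs else -0.5 if runs > mean_runs else 0.0
    u = abs(runs + c - mean_runs) / math.sqrt(var_runs)
    return runs, mean_runs, var_runs, u
```

    A strictly alternating series produces the maximum number of runs, so |u| becomes large and randomness is rejected in the two-tailed test.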

    Turning point test

    In a series Ai, (i=1,N) a turning point is defined whenever
    A(i-1) < A(i) > A(i+1) or A(i-1) > A(i) < A(i+1),
    hence whenever a peak or a trough occurs.
    In an independent stationary series of length N the number of turning points Nt is asymptotically normally distributed, N(μt, σt), with:
    μt = 2(N-2)/3
    σt² = (16N-29)/90
    HYMOS considers the following hypotheses:
    H0: series Ai is random, and
    H1: series Ai is not random, with no direction specified for the deviation from randomness; hence, a two-tailed test is performed
    and the absolute value of the following standardised test statistic is computed:
    |u| = |Nt - μt|/σt
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test condition:
    N ≥ 10
    Reference:
  • Yevjevich, V.: Stochastic Processes in Hydrology, Water Resources Publications, Fort Collins, Colorado, 1972, p. 215
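    A minimal Python sketch of the turning point count and its standardised statistic is given below. The function name turning_point_test is hypothetical and the code only illustrates the formulas of this section.

```python
import math

def turning_point_test(series):
    """Return (number of turning points, expected number, |u|)."""
    n = len(series)
    # A turning point is a peak or a trough at an interior element.
    turning_points = sum(
        1
        for i in range(1, n - 1)
        if series[i - 1] < series[i] > series[i + 1]
        or series[i - 1] > series[i] < series[i + 1]
    )
    mean_tp = 2 * (n - 2) / 3
    sd_tp = math.sqrt((16 * n - 29) / 90)
    return turning_points, mean_tp, abs(turning_points - mean_tp) / sd_tp
```

    A monotonic series has no turning points at all, which lies far below the expected number for a random series, so the two-tailed test rejects randomness.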

    Difference sign test

    The difference-sign test counts the number of positive differences Np and of negative differences Nn between successive values of series Ai, (i=1,N): A(i+1) - A(i). Let the maximum of the two be given by Nds:
    Nds = max(Np, Nn)
    For an independent stationary series of length Neff (Neff = N - number of zero differences) the number of negative or positive differences is asymptotically normally distributed, N(μds, σds), with:
    μds = (Neff - 1)/2
    σds² = (Neff + 1)/12
    HYMOS considers the following hypotheses:
    H0: series Ai is random, and
    H1: series Ai is not random, with no direction specified for the deviation from randomness; hence, a two-tailed test is performed
    and the absolute value of the following standardised test statistic is computed:
    |u| = |Nds - μds|/σds
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test condition:
    N ≥ 10
    Reference:
  • Yevjevich, V.: Stochastic Processes in Hydrology, Water Resources Publications, Fort Collins, Colorado, 1972, p. 215
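    The difference-sign statistic can be sketched in Python as follows. The function name difference_sign_test is hypothetical; the code simply follows the definitions of this section.

```python
import math

def difference_sign_test(series):
    """Return (Nds, expected value, |u|) for the difference-sign test."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n_ds = max(n_pos, n_neg)
    # Effective length: series length minus the number of zero differences.
    n_eff = len(series) - sum(d == 0 for d in diffs)
    mean_ds = (n_eff - 1) / 2
    sd_ds = math.sqrt((n_eff + 1) / 12)
    return n_ds, mean_ds, abs(n_ds - mean_ds) / sd_ds
```

    For a steadily rising series all differences are positive, so Nds equals N - 1, about twice its expected value, and the hypothesis of randomness is rejected.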

    Spearman rank correlation test

    The association between two series of length N is measured by means of the Spearman rank correlation coefficient rs. The observations in each series are ranked from 1 to N. Tied observations are assigned the average of the tied ranks. The sum of squares of the rank differences is calculated:
    D = Σ(Ai - Bi)², summed over i = 1, ..., N
    where:
    Ai = first ranked vector
    Bi = second ranked vector
    N = length of series
    A correction factor for ties is obtained:
    TA = Σ(t³ - t)/12 over series A
    TB = Σ(t³ - t)/12 over series B
    where: t = number of observations tied for a given rank.
    The Spearman rank correlation coefficient rs is then computed for the following two cases:
    (a) if TA and TB are zero,
    rs = 1 - 6D/(N³ - N)
    (b) if TA and/or TB are not zero,
    rs = (X + Y - D)/(2√(XY))
    where:
    X = (N³ - N)/12 - TA
    Y = (N³ - N)/12 - TB
    The statistic used to measure the significance of rs is:
    |t| = |rs| √{(N - 2)/(1 - rs²)}
    Under the null-hypothesis of zero correlation the theoretical value of the test statistic has a Student t-distribution with Ndf = N-2 degrees of freedom for N > 10.
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    The Spearman rank correlation test is basically a two-sample test. However, it can also be used as a one-sample test in two ways:
    1. to measure the significance of serial correlation between series A1i, (i=1,N-1) and series A2i, (i=2,N). The second series in the test is equal to the first series, only shifted one time step.
    2. to test the significance of a trend, where series Ai is compared with time: Bi, (i=1,N) = 1,2,3,...,N. In the Series homogeneity test list box this function is shown as the Spearman rank trend test.
    Test condition:
    N ≥ 10
    Reference:
  • Siegel, S. and N.J. Castellan: Non-parametric statistics for the behavioural sciences. McGraw-Hill Book Company, 2nd ed., 1988, pp. 235-244
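    The ranking with averaged ties, the tie corrections and the two cases for rs can be sketched in Python as follows. The function names are hypothetical, D is taken as the sum of squared rank differences, and the sketch does not guard against |rs| = 1, for which the t statistic is undefined.

```python
import math
from collections import Counter

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks

def tie_correction(values):
    """Sum of (t^3 - t)/12 over all groups of tied observations."""
    return sum(t**3 - t for t in Counter(values).values()) / 12

def spearman_rs(a, b):
    """Spearman rank correlation coefficient, with tie correction if needed."""
    n = len(a)
    d = sum((p - q) ** 2 for p, q in zip(average_ranks(a), average_ranks(b)))
    ta, tb = tie_correction(a), tie_correction(b)
    if ta == 0 and tb == 0:
        return 1 - 6 * d / (n**3 - n)
    x = (n**3 - n) / 12 - ta
    y = (n**3 - n) / 12 - tb
    return (x + y - d) / (2 * math.sqrt(x * y))

def spearman_t(rs, n):
    """|t| statistic with n - 2 degrees of freedom (requires |rs| < 1)."""
    return abs(rs) * math.sqrt((n - 2) / (1 - rs**2))
```

    For the trend variant, the second argument of spearman_rs would simply be the sequence 1, 2, ..., N.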

    Spearman rank trend test

    The calculations in this function are the same as those described for the Spearman rank correlation test, except that the series is compared with time in order to test the significance of a trend (see the one-sample use of the Spearman rank correlation test).

    Arithmetical serial correlation test

    To test for absence of serial dependence in series Ai ,(i=1,N) the arithmetical serial correlation coefficient r1 is computed:

    with: mA = mean of Ai
    The statistic used to measure the significance of r1 is:
    |t| = |r1| √{(N - 3)/(1 - r1²)}
    Under the null-hypothesis of zero correlation the theoretical value of the test statistic has a Student t-distribution with Ndf = N-3 degrees of freedom for N > 10.
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test condition:
    N ≥ 10

    Wilcoxon-Mann-Whitney U-test

    The Wilcoxon-Mann-Whitney U-test investigates whether two series of length m and n respectively (which may be split-samples of one series) are from the same population. The data of both series are ranked together in ascending order. Tied observations are assigned the average of the tied ranks. The sum of ranks in each of the series, Sm and Sn , is then calculated.
    The U statistic is then computed as follows:
    Um = mn + m(m+1)/2 - Sm
    Un = mn - Um
    U = min(Um ,Un )
    Corrections for ties are incorporated as follows:
    1. tied observations are given the average rank, and
    2. the variance of U is corrected by a factor F1:
    F1 = 1 - Σ(t³ - t)/(N³ - N)
    where:
    t = number of observations tied for a given rank
    If the elements in the two series belong to the same population then U is approximately normally distributed, N(μU, σU), with:
    μU = mn/2
    σU² = F1·mn(N+1)/12
    where:
    N = m+n
    HYMOS considers the following hypotheses:
    H0: the series are from the same population, and
    H1: the series are not from the same population; a two-tailed test is performed
    and the absolute value of the following standardised test statistic is computed:
    |u| = |U + c - μU|/σU
    where: c = a continuity correction; c = 0.5 if U < μU, and c = -0.5 if U > μU
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test conditions:
    N ≥ 20, m ≥ 5 and n ≥ 5
    Reference:
  • Siegel, S. and N.J. Castellan: Non-parametric statistics for the behavioural sciences. McGraw-Hill Book Company, 2nd ed., 1988, pp. 235-244
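    The U statistic, the tie factor and the standardised statistic can be sketched in Python as follows. The function names are hypothetical, and a continuity correction of zero is assumed when U equals its expected value.

```python
import math
from collections import Counter

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U, |u|) for two series a and b ranked together."""
    m, n = len(a), len(b)
    total = m + n
    pooled = list(a) + list(b)
    ranks = average_ranks(pooled)
    s_m = sum(ranks[:m])                      # rank sum of the first series
    u_m = m * n + m * (m + 1) / 2 - s_m
    u_n = m * n - u_m
    u = min(u_m, u_n)
    # Variance correction factor for tied observations.
    f1 = 1 - sum(t**3 - t for t in Counter(pooled).values()) / (total**3 - total)
    mean_u = m * n / 2
    sd_u = math.sqrt(f1 * m * n * (total + 1) / 12)
    # Continuity correction; zero when u == mean_u is an assumption.
    c = 0.5 if u < mean_u else -0.5 if u > mean_u else 0.0
    return u, abs(u + c - mean_u) / sd_u
```

    For a split-sample test of a single series, a and b would be the first and second part of that series.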

    Student t-test

    With the Student t-test differences in the mean values of two series Ai, (i=1,m) and Bi, (i=1,n) are investigated. Let mA and mB denote the sample estimates of the population means μA and μB of A and B.
    HYMOS considers the following hypotheses:
    H0: μA = μB, and
    H1: μA ≠ μB.
    Hence a two-tailed test is performed and the absolute value of the following standardised test statistic is computed:
    |t| = |mA - mB|/sAB
    Under the null-hypothesis of equal population means the theoretical value of the test statistic has a Student t-distribution with Ndf = m+n-2 degrees of freedom for N = m + n > 10.
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    The way the standard deviation sAB is computed depends on whether the series A and B have the same population variance. For this a Fisher F-test is performed on the ratio of the variances.
    HYMOS considers the following hypotheses:
    H0: σA² = σB², and
    H1: σA² ≠ σB²; hence a two-tailed test is performed.
    The following test statistic is considered:
    Q = sA²/sB²
    For the interpretation of the test results reference is made to Section VII.5.2.
    The standard deviation sAB is computed from:
    1. in case of equal variances:
    sAB = √{[(m-1)sA² + (n-1)sB²]/(m+n-2) · (1/m + 1/n)}
    2. in case of unequal variances:
    sAB = √(sA²/m + sB²/n)
    and the number of degrees of freedom Ndf is given by:
    Ndf = 1/{ψ²/(m-1) + (1-ψ)²/(n-1)}
    with ψ:
    ψ = (sA²/m)/(sA²/m + sB²/n)
    In practice this implies that Ndf becomes smaller than in the equal variance case, so the discriminative power of the test diminishes somewhat.
    Test conditions:
    N ≥ 10, m ≥ 5 and n ≥ 5
    Reference:
  • Hald, A.: Statistical theory with engineering applications, John Wiley, New York, 1952

    Wilcoxon W-test

    The Wilcoxon test considers two series Ai ,(i=1,m) and Bi ,(i=1,n) and tests the significance of differences in the mean.
    All values Ai are compared with all Bj , where wi,j is defined by:
    Ai < Bj : wi,j = 2
    Ai = Bj : wi,j = 1
    Ai > Bj : wi,j = 0
    The Wilcoxon statistic W is formed by summing wi,j over all pairs:
    W = Σi Σj wi,j, with i = 1, ..., m and j = 1, ..., n
    In case μA = μB the W-statistic is asymptotically normally distributed, N(μW, σW), with:
    μW = mn
    σW² = mn(N+1)/3
    where: N = m+n
    HYMOS considers the following hypotheses:
    H0: μA = μB, and
    H1: μA ≠ μB; hence, a two-tailed test is performed
    and the absolute value of the following standardised test statistic is computed:
    |u| = |W - μW|/σW
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test conditions:
    m ≥ 10 and n ≥ 10
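    The pairwise scoring can be sketched in Python as follows. The sketch assumes that W is the sum of the scores over all m·n pairs; the function name wilcoxon_w is hypothetical.

```python
import math

def wilcoxon_w(a, b):
    """Return (W, |u|) where each pair scores 2, 1 or 0."""
    m, n = len(a), len(b)
    # Score 2 if Ai < Bj, 1 if Ai == Bj, 0 if Ai > Bj, summed over all pairs.
    w = sum(2 if x < y else 1 if x == y else 0 for x in a for y in b)
    mean_w = m * n
    sd_w = math.sqrt(m * n * (m + n + 1) / 3)
    return w, abs(w - mean_w) / sd_w
```

    When every value of the first series lies below every value of the second, W reaches its maximum of 2mn; for two identical series W equals mn, its expected value under the null-hypothesis.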

    Linear trend test

    The slope of the trend line of series Ai ,(i=1,N) with time or sequence is investigated. The linear trend equation reads:
    Ai = b1 + b2·i + νi with νi ~ N(0, σν²), i = 1, N
    where:
    b1 = mA - b2·mi
    b2 = cov(A,i)/si²


    with: mA = sample mean of series Ai
    mi = mean of series i = 1,2,...,N = (N+1)/2
    HYMOS considers the following hypotheses:
    H0: no trend, i.e. μb2 = 0, and
    H1: significant trend, i.e. μb2 ≠ 0; hence a two-tailed test is performed
    and the absolute value of the following standardised test statistic is computed:
    |t| = |b2|/sb2


    Under the null-hypothesis of no trend, the theoretical value of the test statistic has a Student t -distribution with Ndf = N-2 degrees of freedom for N > 10.
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test condition:
    N ≥ 10
    Reference:
  • Yevjevich, V.: Stochastic Processes in Hydrology, Water Resources Publication, Fort Collins, 1972
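    The trend test can be sketched in Python as follows. The standard error of the slope is not spelled out in the text here, so the sketch assumes the usual least-squares expression, with the residual variance estimated on N - 2 degrees of freedom; the function name linear_trend_test is hypothetical.

```python
import math

def linear_trend_test(series):
    """Return (b1, b2, |t|) for a least-squares trend against i = 1..N."""
    n = len(series)
    index = range(1, n + 1)
    mean_a = sum(series) / n
    mean_i = (n + 1) / 2
    s_ii = sum((i - mean_i) ** 2 for i in index)
    s_ai = sum((i - mean_i) * (a - mean_a) for i, a in zip(index, series))
    b2 = s_ai / s_ii                  # slope: cov(A, i) / var(i)
    b1 = mean_a - b2 * mean_i         # intercept
    # Assumed standard error of the slope (ordinary least squares).
    sse = sum((a - b1 - b2 * i) ** 2 for i, a in zip(index, series))
    s_b2 = math.sqrt(sse / (n - 2) / s_ii)
    return b1, b2, abs(b2) / s_b2
```

    The resulting |t| would be compared with the Student t-distribution with N - 2 degrees of freedom; a perfectly linear series gives zero residuals and is not handled by this sketch.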

    Range test

    The range test investigates the homogeneity of a series.
    The test statistic a RN/√N is computed, where N is the number of elements in series A and a RN is the rescaled adjusted range:
    a RN = RN/sA
    where:
    RN = S+ - S-
    sA² = Σ(Ai - mA)²/N
    S+ = max(S0, S1, ..., SN) with S0 = 0
    S- = min(S0, S1, ..., SN) with S0 = 0
    Sk = Σ(Ai - mA), summed over i = 1, ..., k
    and: mA = sample mean of Ai
    The null-hypothesis that series Ai is homogeneous is not rejected at a significance level α, using a one-tailed test, if:
    a RN/√N < Rα
    where Rα is the critical value derived from the asymptotic distribution of a RN given by Feller (1951), available for 0.01 ≤ α ≤ 0.1 and N ≥ 10.
    Test condition:
    N ≥ 10
    References:
  • Feller, W.: The asymptotic distribution of the range of sums of independent random variables. Ann. Math. Stat., 22, 1951, pp. 427-432
  • Wallis, J.R. and P.E. O'Connell: Firm reservoir yield. How reliable are historic hydrological records?, Hydrological Science Bulletin, XVIII, 347-365, 1973.
  • Buishand, T.A.: Some methods for testing the homogeneity of rainfall records. KNMI, Memorandum FM-81-15, June, 1981.
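    The rescaled adjusted range can be sketched in Python as follows. The sketch assumes that the partial sums Sk are the cumulative deviations from the sample mean, as in Buishand (1981); the function name range_statistic is hypothetical, and the critical values are not included.

```python
import math

def range_statistic(series):
    """Return the rescaled adjusted range divided by sqrt(N)."""
    n = len(series)
    mean_a = sum(series) / n
    # Partial sums of deviations from the mean, starting with S0 = 0
    # (assumed definition of Sk).
    partial = [0.0]
    for a in series:
        partial.append(partial[-1] + (a - mean_a))
    adjusted_range = max(partial) - min(partial)
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in series) / n)
    return adjusted_range / sd_a / math.sqrt(n)
```

    A series with a jump in the mean halfway, such as five values of 1 followed by five values of 3, gives partial sums that drift far from zero and hence a large statistic.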

    Shapiro and Wilk W-test

    The Shapiro and Wilk W-test is a powerful test for normality, specifically designed for small samples (n ≤ 50). An aspect of the power of this test is its suitability for various data types and its freedom from restrictive requirements. Its main disadvantage is its limitation to samples of 50 values or fewer.
    HYMOS first checks the selected data set for missing values; any that are present are removed from the set. After this filtering the data are arranged in ascending order and the sum of the squared deviations of the data values from the mean is computed:

    Compute k, where:
    k = n/2 if n is even
    k = (n-1)/2 if n is odd
    The Shapiro and Wilk W-test makes use of W-test coefficients from statistical tables. These tables are included in HYMOS and used in the calculation.
    The W statistic is computed by applying these coefficients to the ranges between the two ends of the order statistics:

    The critical value Wcrit for significance level α and number of data n is taken from a statistical table and compared with the calculated value W.
    HYMOS considers the following hypotheses:
    H0: the series could come from a normal population (W ≥ Wcrit), and
    H1: the series does not come from a normal population (W < Wcrit).
    If H0 is rejected, the raw data are better approximated by an alternative, non-normal distribution and are not suitable for parametric statistical analysis.
    Reference:
  • Water Quality Assessments. A guide to the use of biota/sediments and water in environmental monitoring. D. Chapman(ed.)/1992. Chapman & Hall.

    Rosner's Test

    Rosner's test is used to identify up to 10 outliers. The test assumes that the population has a normal distribution. If a lognormal distribution is more plausible, all computations should be performed on the logarithms of the data. In that case a new series must first be created and its logarithmic values computed with the Series Transformation function. Rosner's approach is designed to avoid masking of one outlier by another. Masking occurs when an outlier goes undetected because it is very close in value to another outlier.
    The maximum number of outliers detected is 10. The procedure repeatedly deletes the value farthest from the mean and recomputes the test statistic after each deletion. A table is used to evaluate the test statistic when n ≥ 25 and n ≤ 5000.
    Rosner's test is two-tailed since the procedure identifies either suspiciously large or suspiciously small data.
    HYMOS considers the following hypotheses:
    H0: the entire data set is from a normal distribution; there are no outliers
    H1: the data set contains outliers.
    The following procedure is followed:
    1. Compute the mean m and standard deviation s from the data set.
    2. Compute Rosner's test statistic R = |xi - m|/s, where xi is the most outlying value.
    3. Retrieve the tabled critical value λ.
    4. Compare the two values and reject H0 when R > λ.
    5. Remove the outlier from the data set and repeat the procedure from step 1, up to a total of 10 most outlying values.
    Reference:
  • Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. John Wiley & Sons, Inc., 1987

    Sign Test

    The sign test is a two-series, two-sided test. The sign test statistic B is the number of pairs (x1i, x2i) for which x1i < x2i, that is, the number of positive differences Di. The magnitudes of the Di are not considered; only their signs are. If any Di is zero, so that a + or - sign cannot be assigned, the data pair is dropped from the data set and n is reduced by 1. Pairs for which one of the series has a missing value are also dropped from the data set. The statistic B is used to test the null hypothesis:
    H0: the median of the population of all possible differences is zero, that is, x1i is as likely to be larger than x2i as x2i is likely to be larger than x1i.
    If the numbers of + and - signs are about equal, there is little reason to reject H0.
    The alternative hypothesis H1 is:
    H1: the median difference does not equal zero, that is, x1i is more (or less) likely to exceed x2i than x2i is likely to exceed x1i.
    The number of valid pairs must be larger than 20. If so, the following test procedure is used:
    1. Compute
    ZB = (2B - n)/√n
    2. Reject H0 and accept H1 if ZB ≤ -Z(1-α/2) or ZB ≥ Z(1-α/2).
    Reference:
  • Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. John Wiley & Sons, Inc., 1987
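    The counting step of the sign test can be sketched in Python as follows. The sketch assumes the large-sample statistic ZB = (2B - n)/√n given by Gilbert (1987) and represents missing values by None; the function name sign_test is hypothetical.

```python
import math

def sign_test(x1, x2):
    """Return (B, n, ZB) for paired series; None marks a missing value."""
    # Drop pairs with a missing value and pairs with a zero difference.
    pairs = [
        (p, q)
        for p, q in zip(x1, x2)
        if p is not None and q is not None and p != q
    ]
    n = len(pairs)
    b = sum(p < q for p, q in pairs)   # number of positive differences
    # Assumed large-sample statistic, following Gilbert (1987).
    z_b = (2 * b - n) / math.sqrt(n)
    return b, n, z_b
```

    The value of ZB would then be compared with the standard normal quantile for the chosen significance level, provided that more than 20 valid pairs remain.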

    Wilcoxon Rank Sum Test

    The Wilcoxon Rank Sum Test is a non-parametric test for independent data sets. It may be used to test for a shift in location between independent series, that is, whether the measurements from one series tend to be consistently larger (or smaller) than those from the other series. The rank sum test has the advantage that the two data sets need not be drawn from normal distributions, and the test can handle a moderate number of ND (non-detect) values by treating them as ties. The test assumes, however, that the distributions of the two series are identical in shape (variance), but the distributions need not be symmetric.
    HYMOS considers the following hypotheses:
    H0: the populations from which the two series have been drawn have the same mean,
    H1: the populations have different means.
    The following procedure is followed:
    1. Combine the two series and rank the m (= n1 + n2) data: assign rank 1 to the smallest value, rank 2 to the next, ..., and rank m to the largest value. If several data have the same value, assign them the midrank, that is, the average of the ranks that would otherwise be assigned to those data. When missing values are encountered the function is stopped. The number of values per series must be larger than 10.
    2. Sum the ranks assigned to the first series, Wrs.
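    3. If no ties (equal values) are present, the large-sample statistic is computed from:
    Zrs = {Wrs - n1(m+1)/2}/√{n1·n2(m+1)/12}
    4. If ties are present, the following function is used:
    Zrs = {Wrs - n1(m+1)/2}/√{(n1·n2/12)[m + 1 - Σ tj(tj² - 1)/(m(m-1))]}, summed over j = 1, ..., g
    where g is the number of tied groups and tj is the number of tied data in the j-th group.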
    5. For a two-tailed test at level α, H0 is rejected and H1 is accepted if Zrs ≤ -Z(1-α/2) or if Zrs ≥ Z(1-α/2).
    6. For a one-tailed test at level α of H0 versus the H1 that the values from series 1 tend to exceed those from series 2, H0 is rejected and H1 is accepted if Zrs ≥ Z(1-α).
    7. For a one-tailed test at level α of H0 versus the H1 that the values from series 2 tend to exceed those from series 1, H0 is rejected and H1 is accepted if Zrs ≤ -Z(1-α).
    This test is closely related to the Wilcoxon-Mann-Whitney U-test.
    Reference:
  • Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. John Wiley & Sons, Inc., 1987

    Wilcoxon Signed Rank Test

    The Wilcoxon signed rank test can be used instead of the sign test if the underlying distribution is symmetric, though it need not be a normal distribution. To perform the Wilcoxon Signed Rank Test the series must not contain missing values. HYMOS considers the following hypotheses:
    H0: the populations from which the two series have been drawn have the same mean,
    H1: the populations have different means.
    The following procedure is followed:
    1. Compute the difference for each pair of values.
    2. Rank the absolute differences: assign rank 1 to the smallest value, rank 2 to the next, ..., and rank m to the largest value. If several data have the same value, assign them the midrank, that is, the average of the ranks that would otherwise be assigned to those data.
    3. Sum the ranks of the positive differences, T+, and the absolute values of the ranks of the negative differences, T-.
    4. Select the smaller of T+ and T-.
    5. Compute

    where:


    Reference:
  • Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. John Wiley & Sons, Inc., 1987
  • Methodiek voor de evaluatie en optimalisatie van routine waterkwaliteitsmeetnetten, Deel II: Overzicht van technieken en methoden, Stowa, 1998 (16)

    Seasonal Kendall Slope Estimator and Test

    The seasonal Kendall test is a trend test to be used when seasonal cycles are present in the data. The test may even be used when missing data or tied data are present in the series. The validity of the test does not depend on the data being normally distributed. The test consists of computing the Mann-Kendall test statistic S and its variance VAR(S), separately for each season. These seasonal statistics are then summed, and a Z statistic is computed. The normal distribution is used to test for a statistically significant trend.
    The Mann-Kendall statistic Si is computed for each season i with:
    Si = Σ(k=1..n-1) Σ(l=k+1..n) sgn(xil - xik)
    where:
    n = number of years
    l, k = years (l > k)
    x = data values
    i = season number
    and sgn(xil - xik) = 1 if xil > xik, -1 if xil < xik, and 0 otherwise.
    The Variance of S is computed for each season as follows:

    where:
    gi = the number of groups of tied data in season i
    tip = the number of tied data in the p-th group of season i
    hi = the number of sampling times in season i that contain multiple data
    uiq = the number of multiple data in the q-th time period of season i
    Once Si and VAR(Si) have been computed, they are summed across the K seasons:
    S' = S1 + S2 + ... + SK and VAR(S') = VAR(S1) + VAR(S2) + ... + VAR(SK)
    Finally the Z statistic is computed from:

    HYMOS considers the following hypotheses:
    H0: the population from which the series has been drawn has no trend,
    H1: the population has a trend.
    Together with the seasonal Kendall test, the seasonal Kendall slope estimator is computed. The Ni individual slope estimates for the i-th season are computed from:
    Q = (xil - xik)/(l - k), for all pairs of years l > k
    This is done for each season. The individual slopes are ranked and the median is computed for each season; this is the seasonal Kendall slope estimator. A confidence interval around the true slope is obtained by using the normal distribution, and:

    The lower and upper confidence limits are the M1-th largest and the (M1+1)-th largest values of the N' ordered slope estimates.
    Reference:
  • Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. John Wiley & Sons, Inc., 1987
  • Methodiek voor de evaluatie en optimalisatie van routine waterkwaliteitsmeetnetten, Deel II: Overzicht van technieken en methoden, Stowa, 1998 (16)
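    The seasonal Mann-Kendall statistic can be sketched in Python as follows. The sketch covers only the statistic S per season and its sum across seasons, not the variance or the slope estimator; missing values are represented by None and the function names are hypothetical.

```python
def sign(value):
    """Return 1, -1 or 0 according to the sign of value."""
    return (value > 0) - (value < 0)

def mann_kendall_s(values):
    """Mann-Kendall S for one season; values are ordered by year."""
    s = 0
    for k in range(len(values) - 1):
        for l in range(k + 1, len(values)):
            # Pairs involving a missing value are skipped.
            if values[k] is not None and values[l] is not None:
                s += sign(values[l] - values[k])
    return s

def seasonal_kendall_s(seasons):
    """Sum of the per-season statistics; seasons is a list of series."""
    return sum(mann_kendall_s(values) for values in seasons)
```

    Because each season is compared only with the same season in other years, a seasonal cycle does not contribute to S; an upward trend in one season and an equal downward trend in another cancel in the sum.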

    Dixon's test for outliers

    Dixon's test checks a time series for outliers. The test can be used for small samples of 3 up to 25 values. Missing values are first removed from the series, after which the series is ordered according to magnitude. The ratio of the difference between an extreme value and one of its neighbouring values in the ordered series is then calculated, using a formula that varies with sample size. This ratio is compared with a tabulated value and, if it is greater, the extreme value is considered an outlier.
    The significance levels that can be used are restricted to a finite set of values, i.e. 30%, 20%, 10%, 5%, 2%, 1% and 0.5%. The number of input values is likewise restricted to the range of 3 to 25.
    HYMOS uses the following formulas for computing the critical values for the smallest and largest values of the ordered series:

    Under the null-hypothesis, the extreme values of the ordered series are not considered to be outliers.
    The table for the test statistic is taken from the statistical tests book of Gopal Kanji.
    Reference:
  • Kanji, G.K.: 100 Statistical Tests. Sage Publications Ltd., 1999
  • McBean, E.A. and F.A. Rovers: Statistical procedures for analysis of environmental monitoring data and risk assessment. Prentice Hall PTR, 1998

    Lettenmaier rank correlation test

    This test is very similar to the Spearman rank trend test, with an added correction for the number of effective measurements. The computation procedure is also very similar. The observations of the series are ranked from 1 to N. Tied observations are assigned the average of the tied ranks. The sum of squares of the rank differences is calculated. Then the Spearman rank correlation coefficient rs is computed:
    rs = 1 - 6D/(N³ - N)
    where:
    D = sum of squares of rank differences
    N = number of measurements
    The statistic used to measure the significance of rs is:
    |t| = |rs| √{(N' - 2)/(1 - rs²)}
    where:
    rs = Spearman rank correlation coefficient
    N' = number of effective measurements
    N' is computed from:

    where:
    r = lag-1 auto-correlation coefficient
    Under the null-hypothesis of zero correlation the theoretical value of the test statistic has a Student t-distribution with Ndf = N'-2 degrees of freedom for N' > 10.
    For the interpretation of the test results reference is made to the section about Hypothesis testing.
    Test condition:
    N ≥ 10
    Reference:
  • Methodiek voor de evaluatie en optimalisatie van routine waterkwaliteitsmeetnetten, Deel II: Overzicht van technieken en methoden, Stowa, 1998 (16)