Comment: Migrated to Confluence 4.0

h3. Mann-Kendall Check

The purpose of this check is to test for trends in each of the input time series. Once a trend is detected, the check sets the flags of the time series concerned to the specified output flag and generates the specified log message. One of the strengths of the Mann-Kendall check is that it can also be used when there are many missing values. If there are fewer than 10 non-missing values, the test is skipped.

During the check, the threshold criteria are first sorted so that the most severe log message is processed first. Once a log message has been generated, the less severe log messages are not generated.
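The ordering described above can be sketched in Python. This is only an illustration under stated assumptions: the function name {{first_triggered}}, the {{SEVERITY}} ranking and the {{is_triggered}} callback are hypothetical and are not part of the actual module; only the log level names and the minimum of 10 non-missing values come from this page.

```python
# Hypothetical severity ranking; the level names are those of the logLevel option.
SEVERITY = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "FATAL": 4}
MIN_NON_MISSING = 10  # below this count the test is skipped


def first_triggered(values, thresholds, is_triggered):
    """Return the most severe threshold that fires, or None.

    values       -- list of floats, None marks a missing value
    thresholds   -- list of dicts with at least a "logLevel" key
    is_triggered -- callable(non_missing_values, threshold) -> bool (assumed helper)
    """
    present = [v for v in values if v is not None]
    if len(present) < MIN_NON_MISSING:
        return None  # too few non-missing values: test skipped
    # Most severe log level first.
    ordered = sorted(thresholds, key=lambda t: SEVERITY[t["logLevel"]], reverse=True)
    for threshold in ordered:
        if is_triggered(present, threshold):
            return threshold  # less severe thresholds are not reported
    return None
```

With one INFO and one WARN threshold that both fire, only the WARN threshold is returned, so only its log message would be generated.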

h3. General information on Mann-Kendall trend test

The MannKendall algorithm first calculates several statistics on the time series. Missing values are ignored.

*Mann-Kendall statistic S*

{latex}
\(S=\sum\limits_{k=1}^{N-1}\sum\limits_{l=k+1}^{N}sign(x_l-x_k)\)
{latex}
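As an illustration, the double sum for S can be written as a short Python function. This is a minimal sketch, not the module's implementation; the function names are hypothetical and missing values are assumed to be represented by {{None}}.

```python
def sign(d):
    """Return -1, 0 or 1 depending on the sign of d."""
    return (d > 0) - (d < 0)


def mann_kendall_s(values):
    """Sum sign(x_l - x_k) over all pairs k < l of non-missing values."""
    x = [v for v in values if v is not None]  # missing values are ignored
    n = len(x)
    return sum(sign(x[l] - x[k]) for k in range(n - 1) for l in range(k + 1, n))
```

A strictly increasing series of N values gives S = N(N-1)/2, a strictly decreasing one gives the negative of that, and tied pairs contribute 0.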

*Variance S*

{latex}
\(VAR(S)=\frac{1}{18}(n(n-1)(2n+5)-\sum\limits_{p=1}^{g}t_p(t_p-1)(2t_p+5)-\sum\limits_{q=1}^{h}u_q(u_q-1)(2u_q+5)+
\frac{\sum\limits_{p=1}^{g}t_p(t_p-1)(2t_p-2)-\sum\limits_{q=1}^{h}u_q(u_q-1)(2u_q-2)}{9n(n-1)(n-2)}+\frac{\sum\limits_{p=1}^{g}t_p(t_p-1)\sum\limits_{q=1}^{h}u_q(u_q-1)}{2n(n-1)})\)
{latex}

where
* {latex}g{latex} is the number of groups of tied data
* {latex}t\_p{latex} is the number of tied data in the p-th group
* {latex}h{latex} is the number of sampling times that contain multiple data
* {latex}u\_q{latex} is the number of multiple data in the q-th time period

*Z statistic*

{latex}\(Z=\frac{S-1}{\sqrt{VAR(S)}}\){latex}, if S > 0
{latex}\(Z=0\){latex}, if S = 0
{latex}\(Z=\frac{S+1}{\sqrt{VAR(S)}}\){latex}, if S < 0
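The following Python sketch shows the three cases of the Z statistic. To keep it short it assumes the simplest situation: no tied values and a single value per sampling time, so that all sums over t_p and u_q are empty and VAR(S) reduces to n(n-1)(2n+5)/18. The function names are hypothetical.

```python
import math


def variance_s_no_ties(n):
    """VAR(S) for n non-missing values, assuming no ties and one value per time."""
    return n * (n - 1) * (2 * n + 5) / 18.0


def z_statistic(s, var_s):
    """Continuity-corrected Z statistic for a given S and VAR(S)."""
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0
```

For example, 10 strictly increasing values give S = 45 and VAR(S) = 125, so Z = 44 / sqrt(125), which is approximately 3.94.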

*Slope*

In the classical MannKendall test, the slope is estimated as the median of the slopes (Sen's slope), which is a very reliable estimator even when there are many missing values. Calculating this value is expensive in terms of memory: it requires N * (N-1) / 2 memory, where N is the number of non-missing input values. To prevent memory problems, it is recommended to use a limited number of input values. If the option logSensSlope is set to false, the average slope is used to estimate the slope instead, which is less accurate than the median but requires only N memory.
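The difference between the two estimators can be sketched in Python. This sketch assumes that "the slopes" are the slopes between all pairs of non-missing values and that the average slope is the mean of those same pairwise slopes; both are assumptions, and the function names are hypothetical. The median needs all N * (N-1) / 2 slopes in memory at once, whereas the mean can be accumulated with a running sum.

```python
from statistics import median


def pairwise_slopes(times, values):
    """Yield the slope between every pair of non-missing points."""
    pts = [(t, v) for t, v in zip(times, values) if v is not None]
    for i in range(len(pts) - 1):
        for j in range(i + 1, len(pts)):
            (t0, v0), (t1, v1) = pts[i], pts[j]
            if t1 != t0:
                yield (v1 - v0) / (t1 - t0)


def sens_slope(times, values):
    """Median of the pairwise slopes; stores all N * (N-1) / 2 slopes."""
    return median(list(pairwise_slopes(times, values)))


def average_slope(times, values):
    """Mean of the pairwise slopes, accumulated without storing them."""
    total, count = 0.0, 0
    for slope in pairwise_slopes(times, values):
        total += slope
        count += 1
    return total / count
```

Both functions need at least two non-missing points with different times; the check itself already requires at least 10 non-missing values.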

*Drift*

Drift is the duration of the checkRelativePeriod times the estimated slope.
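As a small worked illustration in Python: the sketch below assumes that the slope is expressed per unit of the same time unit as the checkRelativePeriod, and that a maximum drift counts as exceeded when the absolute drift is strictly larger than it. Neither assumption is confirmed by this page, and the function names are hypothetical.

```python
def drift(slope_per_unit, period_start, period_end):
    """Drift = duration of the period times the estimated slope."""
    return slope_per_unit * (period_end - period_start)


def exceeds_maximum_drift(drift_value, maximum_drift):
    """Compare the absolute drift with the configured maximum (strict, by assumption)."""
    return abs(drift_value) > maximum_drift
```

For a period from minute 1705491 to minute 1705501 (a duration of 10 minutes) and a slope of 0.02 per minute, the drift is 0.2.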

*Conditions for rejecting H0 that there is no trend*

After the statistics have been calculated and it has been established that there are at least 10 non-missing values, the following conditions are used to trigger log messages for the configured trend tests:

* there is a two-tailed trend if zStatistic <= -delta or zStatistic >= delta
* there is a downward trend if zStatistic <= -delta
* there is an upward trend if zStatistic >= delta

where delta is defined as

* {latex}delta = inverseCDF(1 - confidenceCoefficient) if two-tailed{latex}
* {latex}delta = inverseCDF(1 - confidenceCoefficient / 2) if upward or downward{latex}
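The conditions above can be sketched in Python as follows. The sketch copies the definitions of delta exactly as they are written above and assumes that inverseCDF is the inverse cumulative distribution function of the standard normal distribution; that interpretation and the function names are assumptions, not a description of the actual module.

```python
from statistics import NormalDist


def delta_for(test_trend, confidence_coefficient):
    """Delta as defined in the text; inverseCDF assumed to be standard normal."""
    inverse_cdf = NormalDist().inv_cdf
    if test_trend == "two-tailed":
        return inverse_cdf(1 - confidence_coefficient)
    return inverse_cdf(1 - confidence_coefficient / 2)


def trend_detected(test_trend, z_statistic, delta):
    """Apply the rejection condition for the configured testTrend."""
    if test_trend == "upward":
        return z_statistic >= delta
    if test_trend == "downward":
        return z_statistic <= -delta
    if test_trend == "two-tailed":
        return z_statistic <= -delta or z_statistic >= delta
    raise ValueError("unknown testTrend: " + test_trend)
```

With these definitions, a two-tailed test with a confidenceCoefficient of 0.025 gives a delta of about 1.96.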

h3. Configuration

* *id*: Identifier of the check.
* *checkRelativePeriod*: The period to run the trend test for.
* *variableDefinition*: Definition of a time series. Each time series is processed independently.
* *inputVariable*: Identifier of a variable whose time series flags are used as input (neighbouring locations). Refers to a time series set defined in the variableDefinitions.
* *outputVariable*: Identifier of a variable whose time series get the outputFlag when the thresholds are exceeded (observed values). Refers to a time series set defined in the variableDefinitions.
* *logSensSlope*: Includes Sen's slope in the logging (default true). For a large number of time steps, the algorithm for determining Sen's slope requires a lot of memory. To resolve out-of-memory problems, set this option to false. The slope is then estimated by the average of the slopes instead of the median of the slopes (classical Mann-Kendall).

For each *threshold*,
* *testTrend*: Either two-tailed, upward or downward (two-tailed is default).
* *logLevel*: Log level for the log message that is logged if a trend is detected. Can be DEBUG, INFO, WARN, ERROR or FATAL. If the level is ERROR or FATAL, the module stops running after logging the first log message. FATAL should never be used.
* *outputFlag*: Either unreliable or doubtful.
* *outputMode*: By default, the flags that need updating are updated and log events are generated for the updated flags. When this option is set to 'logs_only', the log events are generated but the flags are not updated.
* *logEventCode*: Event code for the log message that is logged if a trend is detected. This event code has to contain a dot, e.g. "TimeSeries.Check", because the log message is only visible to the master controller if the event code contains a dot.
* *logMessage*: Log message that is logged if a trend is detected.

A threshold can be specified by either a maximum drift or a confidenceCoefficient:
* *confidenceCoefficient*: The confidence coefficient as used in the classical MannKendall check, also known as alpha, which is typically between 0 and 0.5; e.g. 0.05 (one-tailed) and 0.025 (two-tailed) correspond to a confidence level of 95%.
* *maximumDrift*: Instead of using the classical MannKendall test with the inverseCDF, this option alters the flags and generates the logs when the maximum absolute drift is exceeded. Drift is the length of the checkRelativePeriod times the slope.


The following tags in the logMessage are replaced when the message is logged:

|| Tag || Replacement ||
| %AMOUNT_CHANGED_FLAGS% | The number of output flags that were changed. |
| %CHECK_ID% | The id of the check that caused the flags to be altered. |
| %HEADER% | Header name of the time series where the alterations took place. |
| %LOCATION_ID% | The locationId of the time series where the alterations took place. |
| %LOCATION_NAME% | The name of the location where the alterations took place. |
| %NONE% | Hides the default tags that are automatically added. |
| %OUTPUT_FLAG% | The output flag. |
| %PARAMETER_ID% | The parameterId of the time series where the alterations took place. |
| %PARAMETER_NAME% | The name of the parameter where the alterations took place. |
| %PERIOD% | The period boundaries in which the output flags were changed. |
| %SLOPE% | Sen's slope estimator, which is the median of all slopes of the non-missing values. |
| %DRIFT% | Sen's slope estimator times the duration of checkRelativePeriod. |
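To illustrate how the tags in the table are substituted, the Python sketch below expands a message template with a plain string replacement. It does not describe how the module itself performs the substitution; the function name {{expand_tags}} and the example values are hypothetical, and special tags such as %NONE% are not handled.

```python
def expand_tags(message, replacements):
    """Replace each %TAG% in the message with its value."""
    for tag, value in replacements.items():
        message = message.replace("%" + tag + "%", str(value))
    return message


# Hypothetical values for a single detected trend.
example = expand_tags(
    "Trend in %HEADER% by %CHECK_ID%, %AMOUNT_CHANGED_FLAGS% flags set to %OUTPUT_FLAG%.",
    {
        "HEADER": "H.meting",
        "CHECK_ID": "MannKendallCheck_nue_015_01",
        "AMOUNT_CHANGED_FLAGS": 11,
        "OUTPUT_FLAG": "unreliable",
    },
)
```

Tags that are not present in the replacements are left unchanged in the message.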


h3. Configuration examples for MannKendallCheck

A configuration example for the _MannKendallCheck_ is given below:
{code:xml}
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<mannKendallCheck id="MannKendallCheck_nue_015_01">
		<variableDefinition>
			<variableId>input1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>MannKendallCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.meting</parameterId>
				<locationId>Nue_0015_01_01</locationId>
				<timeSeriesType>simulated forecasting</timeSeriesType>
				<timeStep unit="minute" multiplier="1"/>
				<relativeViewPeriod unit="minute" start="1705491" end="1705501"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<input>
			<variableId>input1</variableId>
		</input>
		<!-- test storage is set to 2009-1-1, data starts at 11-Dec-2011 -->
		<checkRelativePeriod unit="minute" start="1705491" end="1705501"/>
		<threshold>
			<testTrend>two-tailed</testTrend>
			<confidenceCoefficient>0.01</confidenceCoefficient>
			<outputFlag>unreliable</outputFlag>
			<logLevel>WARN</logLevel>
			<logEventCode>SecondaryValidation.MannKendallCheck</logEventCode>
			<logMessage>Two-tailed Mann-Kendall check has detected a trend in %HEADER% by %CHECK_ID%, %AMOUNT_CHANGED_FLAGS% flags set to %OUTPUT_FLAG%.</logMessage>
		</threshold>
	</mannKendallCheck>
</secondaryValidation>
{code}

h3. Further reading

The algorithms of the Mann-Kendall check stem from pages 208 onwards of Statistical Methods for Environmental Pollution Monitoring by Richard O. Gilbert (PDF).