h3. Contents of check for Mann-Kendall Check

The purpose of this check is to test for trends in each of the input time series. When a trend is detected, the check sets the flags of the time series concerned to the specified output flag and generates the specified log message. One of the strengths of the Mann-Kendall check is that it can also be used when there are many missing values. If there are fewer than 10 non-missing values, the test is skipped. Before the check is run, its threshold criteria are sorted so that the most severe log message is processed first. Once a log message has been generated, the less severe log messages are not generated.
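The ordering of the thresholds can be illustrated with a minimal Python sketch. It assumes that severity follows the logLevel order DEBUG < INFO < WARN < ERROR < FATAL and that each threshold is a plain dictionary. The names {{SEVERITY}}, {{first_triggered}} and {{is_triggered}} are hypothetical and are not part of the FEWS configuration or code.

```python
# Assumed severity ranking of the log levels (not taken from the FEWS source).
SEVERITY = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "FATAL": 4}

def first_triggered(thresholds, is_triggered):
    """Return the most severe threshold whose trend test fires, or None.

    thresholds   -- list of dicts, each with at least a "logLevel" key
    is_triggered -- callable that runs the trend test for one threshold
    """
    ordered = sorted(thresholds, key=lambda t: SEVERITY[t["logLevel"]], reverse=True)
    for threshold in ordered:
        if is_triggered(threshold):
            # Less severe thresholds are not evaluated any further.
            return threshold
    return None

thresholds = [
    {"logLevel": "INFO", "confidenceCoefficient": 0.05},
    {"logLevel": "WARN", "confidenceCoefficient": 0.01},
]
# Pretend that every threshold detects a trend: only the WARN one is reported.
hit = first_triggered(thresholds, lambda t: True)
```

In this sketch a threshold that fires suppresses all thresholds with a lower severity, which mirrors the behaviour described above.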

h3. General information on Mann-Kendall trend test

The Mann-Kendall algorithm first calculates several statistics on the time series. Missing values are ignored.

*Mann-Kendall statistic S*
{latex}
\(S=\sum\limits_{k=1}^{n-1}\sum\limits_{l=k+1}^{n}sign(x_l-x_k)\)
{latex}
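As an illustration, the double sum can be written as a short Python sketch. It assumes that missing values are represented as NaN, which is a choice made for this example only. The function name {{mann_kendall_s}} is hypothetical.

```python
import math

def mann_kendall_s(values):
    """Mann-Kendall statistic S; NaN entries are treated as missing and skipped."""
    x = [v for v in values if not math.isnan(v)]
    s = 0
    for k in range(len(x) - 1):
        for l in range(k + 1, len(x)):
            diff = x[l] - x[k]
            # Sign of the difference: +1, 0 or -1.
            s += (diff > 0) - (diff < 0)
    return s

# Five increasing pairs and one decreasing pair among the non-missing values.
s = mann_kendall_s([1.0, 2.0, float("nan"), 3.0, 2.5])
print(s)  # 4
```

A positive S indicates that later values tend to be larger than earlier ones, a negative S the opposite.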

*Variance S*
{latex}
\(VAR(S)=\frac{1}{18}\left[n(n-1)(2n+5)-\sum\limits_{p=1}^{g}t_p(t_p-1)(2t_p+5)-\sum\limits_{q=1}^{h}u_q(u_q-1)(2u_q+5)\right]+
\frac{\sum\limits_{p=1}^{g}t_p(t_p-1)(t_p-2)\sum\limits_{q=1}^{h}u_q(u_q-1)(u_q-2)}{9n(n-1)(n-2)}+\frac{\sum\limits_{p=1}^{g}t_p(t_p-1)\sum\limits_{q=1}^{h}u_q(u_q-1)}{2n(n-1)}\)
{latex}

where
{latex}n{latex} is the number of non-missing values
{latex}g{latex} is the number of groups of tied data
{latex}t\_p{latex} is the number of tied data in the p-th group
{latex}h{latex} is the number of sampling times that contain multiple data
{latex}u\_q{latex} is the number of multiple data in the q-th time period


*Z statistic*
{latex}\(Z=\frac{S-1}{\sqrt{VAR(S)}}\){latex}, if S > 0
{latex}\(Z=0\){latex}, if S = 0
{latex}\(Z=\frac{S+1}{\sqrt{VAR(S)}}\){latex}, if S < 0
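The three cases can be sketched in Python as follows. To keep the example short it assumes a series without tied values and with one value per time step, so that all tie sums are zero and VAR(S) reduces to n(n-1)(2n+5)/18. The function names are hypothetical.

```python
import math

def variance_s_no_ties(n):
    """VAR(S) for n non-missing values when no values are tied (all tie sums are zero)."""
    return n * (n - 1) * (2 * n + 5) / 18.0

def z_statistic(s, var_s):
    """Z statistic: S is moved one unit towards zero before it is standardised."""
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0

# Strictly increasing series of 10 values: each of the 45 pairs contributes +1.
var_s = variance_s_no_ties(10)  # 125.0
z = z_statistic(45, var_s)      # 44 / sqrt(125), about 3.94
```

A series with ties or with several values per time step needs the full variance expression instead of the simplified one used here.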



h3. Conditions for rejecting H0 that there is no trend
After the statistics have been calculated and it has been established that there are at least 10 non-missing values, the following conditions are used to trigger log messages for the configured trend tests:
* there is a two-tailed trend if zStatistic <= -delta or zStatistic >= delta
* there is a downward trend if zStatistic <= -delta
* there is an upward trend if zStatistic >= delta 

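where zStatistic is the Z statistic defined above, inverseCDF is the inverse of the standard normal cumulative distribution function, and delta is defined as

* delta = inverseCDF(1 - confidenceCoefficient) if the test is two-tailed
* delta = inverseCDF(1 - confidenceCoefficient / 2) if the test is upward or downward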
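The rejection conditions can be tried out with the Python standard library. The sketch below assumes that inverseCDF is the inverse cumulative distribution function of the standard normal distribution. The function names {{delta_for}} and {{trend_detected}} are hypothetical.

```python
from statistics import NormalDist

def delta_for(test_trend, confidence_coefficient):
    """Critical value delta for the given testTrend, following the two rules above."""
    if test_trend == "two-tailed":
        p = 1 - confidence_coefficient
    else:  # "upward" or "downward"
        p = 1 - confidence_coefficient / 2
    return NormalDist().inv_cdf(p)

def trend_detected(test_trend, z, delta):
    """Apply the rejection condition for H0 (no trend)."""
    if test_trend == "upward":
        return z >= delta
    if test_trend == "downward":
        return z <= -delta
    return z <= -delta or z >= delta

delta = delta_for("two-tailed", 0.025)       # about 1.96
found = trend_detected("two-tailed", 3.94, delta)
```

With these definitions a two-tailed confidenceCoefficient of 0.025 and a one-tailed confidenceCoefficient of 0.05 give the same delta.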


h3. Configuration

* *id*: identifier of the check.
* *checkRelativePeriod*: The period to run the trend test for.
* *variableDefinition*: Definition of a time series. Each time series is processed independently.
* *inputVariable*: Identifier of the variable whose time series flags are used as input (neighbouring locations). Refers to a time series set defined in the variableDefinitions.
* *outputVariable*: Identifier of the variable whose time series have their flags set to the outputFlag when the thresholds are exceeded (observed values). Refers to a time series set defined in the variableDefinitions.
* *logSensSlope*: Includes Sen's slope in the logging (default true). For a large number of time steps, the algorithm for determining Sen's slope requires a lot of memory. To resolve out-of-memory problems, set this option to false.

For each *threshold*:
* *testTrend*: Either two-tailed, upward or downward (two-tailed is the default).
* *logLevel*: Log level for the log message that is logged if a trend is detected. Can be DEBUG, INFO, WARN, ERROR or FATAL. If the level is ERROR or FATAL, the module stops running after logging the first log message. FATAL should not be used.
* *outputFlag*: Either unreliable or doubtful.
* *logEventCode*: Event code for the log message that is logged if a trend is detected. This event code has to contain a dot, e.g. "TimeSeries.Check", because the log message is only visible to the master controller if the event code contains a dot.
* *logMessage*: Log message that is logged if a trend is detected.

A threshold can be specified by either a maximumDrift or a confidenceCoefficient:
* *confidenceCoefficient*: The confidence coefficient as used in the classical Mann-Kendall check, also known as alpha, which is typically between 0 and 0.5. For example, 0.05 (one-tailed) and 0.025 (two-tailed) both correspond to a confidence level of 95%.
* *maximumDrift*: Instead of using the classical Mann-Kendall test with the inverseCDF, this option alters the flags and generates the log messages when the maximum absolute drift is exceeded. Drift is the length of the checkRelativePeriod times the slope.
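The drift criterion can be sketched in Python as follows, using Sen's slope estimator (the median of the slopes between all pairs of non-missing values). The sketch assumes that times and the period length are expressed in hours, that the times are strictly increasing, and that missing values are NaN. The names {{sens_slope}}, {{period_hours}} and {{maximum_drift}} are hypothetical.

```python
import math
from statistics import median

def sens_slope(times, values):
    """Median of the slopes between all pairs of non-missing values."""
    points = [(t, v) for t, v in zip(times, values) if not math.isnan(v)]
    slopes = [
        (v2 - v1) / (t2 - t1)
        for i, (t1, v1) in enumerate(points)
        for (t2, v2) in points[i + 1:]
    ]
    return median(slopes)

times = [0.0, 1.0, 2.0, 3.0]            # hours (assumed unit)
values = [1.0, 1.5, float("nan"), 2.5]  # NaN marks a missing value

slope = sens_slope(times, values)       # 0.5 per hour
period_hours = 100.0                    # assumed length of the checkRelativePeriod
drift = slope * period_hours            # 50.0
maximum_drift = 20.0                    # hypothetical threshold value
exceeded = abs(drift) > maximum_drift   # True
```

The list of slopes has n(n-1)/2 entries for n non-missing values, which is why Sen's slope becomes memory intensive for long series.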


The following tags can be used in the logMessage and are replaced when the message is logged:

|| Tag || Replacement ||
| %AMOUNT_CHANGED_FLAGS% | The number of output flags that were changed. |
| %CHECK_ID% | The id of the check that caused the flags to be altered. |
| %HEADER% | Header name of the time series where the alterations took place. |
| %LOCATION_ID% | The locationId of the time series where the alterations took place. |
| %LOCATION_NAME% | The name of the location where the alterations took place. |
| %NONE% | Hides the default tags that are automatically added. |
| %OUTPUT_FLAG% | The output flag. |
| %PARAMETER_ID% | The parameterId of the time series where the alterations took place. |
| %PARAMETER_NAME% | The name of the parameter where the alterations took place. |
| %PERIOD% | The period boundaries in which the output flags were changed. |
| %SLOPE% | Sen's slope estimator, which is the median of all slopes of the non-missing values. |
| %DRIFT% | Sen's slope estimator times the period for which there are missings. |


h3. Configuration examples for MannKendallCheck

A configuration example for the _MannKendallCheck_ is given below:
{code:xml}<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<mannKendallCheck id="MannKendallCheck1">
		<variable>
			<timeSeriesSet>
				<moduleInstanceId>MannKendallCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.meting</parameterId>
				<locationId>Nue_0015_01_01</locationId>
				<timeSeriesType>simulated forecasting</timeSeriesType>
				<timeStep unit="hour" multiplier="1"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variable>
		<checkRelativePeriod unit="day" start="100" end="0"/>
		<threshold>
			<testTrend>two-tailed</testTrend>
			<confidenceCoefficient>0.01</confidenceCoefficient>
			<logLevel>WARN</logLevel>
			<logEventCode>SecondaryValidation.MannKendallCheck</logEventCode>
			<logMessage>trend detected in %HEADER% by %CHECK_ID%.</logMessage>
		</threshold>
	</mannKendallCheck>
</secondaryValidation>{code}

h3. Further reading

The algorithms used in the Mann-Kendall check are taken from page 208 onwards of [Statistical Methods for Environmental Pollution Monitoring by Richard O. Gilbert (PDF)|FEWSDOC:Mann-KendallCheck^205.pdf].