What: nameofinstance.xml

Description: Configuration for the Secondary Validation module

Schema location: http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd

Entry in ModuleDescriptors:

<moduleDescriptor id="SecondaryValidation">
	<description>SecondaryValidation</description>
	<className>nl.wldelft.fews.system.plugin.secondaryValidation.SecondaryValidation</className>
</moduleDescriptor>

SecondaryValidation (since 2010_01)

The SecondaryValidation module can be used to perform certain checks on time series data.

Configuration

An XML file that configures an instance of the SecondaryValidation module called, for example, CheckImportedData could be named as follows:

CheckImportedData 1.00 default.xml

  • CheckImportedData: File name for the CheckImportedData configuration.
  • 1.00: Version number.
  • default: Flag to indicate the version is the default configuration (otherwise omitted).

A SecondaryValidation configuration file is typically located in the moduleconfigfiles folder and can be used to configure one or more checks. The configured checks are processed one by one in the specified order. Some checks can generate log messages, which can trigger actions in the master controller, such as sending warning emails. Another check is available for automatically modifying flags to 'doubtful' or 'unreliable' per time step when a constraint on multiple time series fails.

Checks for generating log events

Some checks are intended for generating log events when a specific constraint is violated. The time series configured in these checks will be processed one by one. If a time series does not pass the check, then the configured log message is logged with the specified event code and level. The log event code can be used to trigger a certain action in the master controller, e.g. sending warning emails.

Four different types of these checks are available:

  • minNumberOfValuesCheck: Logs a message when there are not enough values within a configured period.
  • minNonMissingValuesCheck: Logs a message when there are not enough non-missing values within a configured period. A non-missing value is a value that is reliable, doubtful or unreliable.
  • minReliableOrDoubtfulValuesCheck: Logs a message when there are not enough values that are reliable or doubtful within a configured period.
  • minReliableValuesCheck: Logs a message when there are not enough reliable values within a configured period.

Check for setting flags per time step

The seriesComparisonCheck tests constraints between multiple time series sets or multiple parameters per time step.
When the required input data is available (reliable or doubtful) and the specified constraint fails, the check automatically modifies the flags for that time step.

Variable Definitions

The configuration contains variable definitions for one or more time series that can be used as input for checks. Each variable definition contains a variableId and a timeSeriesSet. The variableId can be used to reference the time series in a check. Alternatively, depending on which check it is, either variable definitions or variables can be embedded in the checks.

Contents of checks for generating log events

The minNumberOfValuesCheck, minNonMissingValuesCheck, minReliableOrDoubtfulValuesCheck and minReliableValuesCheck all consist of the following elements:

  • id: Identifier of the check. This is only used in log messages and exception messages.
  • variable: One or more time series that need to be checked. This can be either an embedded timeSeriesSet or a reference to a variableDefinition defined at the start of the configuration file. If this contains multiple time series (e.g. for multiple locations), then each time series is checked individually.
  • checkRelativePeriod: The check will only consider data in this time period. This time period is relative to the timeZero of the taskrun in which the module instance runs. The start and end of the period are included. This period overrules any relativeViewPeriods specified in the timeSeriesSets of the time series.
  • minNumberOfValues: The minimum required number of values in the time series to pass the check.
  • logLevel: Log level for the log message that is logged if a time series does not pass the check. Can be DEBUG, INFO, WARN, ERROR or FATAL. If the level is ERROR or FATAL, then the module will stop running after logging the first log message.
  • logEventCode: Event code for the log message that is logged if a time series does not pass the check. This event code has to contain a dot, e.g. "TimeSeries.Check", because the log message is only visible to the master controller if the event code contains a dot.
  • logMessage: Log message that is logged if a time series does not pass the check. It is possible to use the tag %HEADER% in the logMessage. The %HEADER% tag will be replaced with the header of the time series.
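
As a minimal sketch of the embedded form mentioned for the variable element, the check below places a timeSeriesSet directly inside variable instead of referencing a variableDefinition. The exact nesting, the check id EmbeddedSeriesCheck, the module instance ImportObserved and the location set gauges are assumptions made for illustration; verify the structure against the schema before using it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<minReliableValuesCheck id="EmbeddedSeriesCheck">
		<variable>
			<!-- assumption: the timeSeriesSet is nested directly inside the variable element -->
			<timeSeriesSet>
				<!-- hypothetical module instance and location set, for illustration only -->
				<moduleInstanceId>ImportObserved</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationSetId>gauges</locationSetId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variable>
		<!-- period relative to timeZero; start and end are both included -->
		<checkRelativePeriod unit="hour" start="-6" end="0"/>
		<minNumberOfValues>20</minNumberOfValues>
		<logLevel>WARN</logLevel>
		<!-- the event code contains a dot so that the master controller can see the message -->
		<logEventCode>TimeSeries.Check</logEventCode>
		<logMessage>Fewer than 20 reliable values in the last 6 hours for %HEADER%</logMessage>
	</minReliableValuesCheck>
</secondaryValidation>
```

With a 15-minute time step and both ends of the period included, a 6-hour period holds at most 25 time steps, so a minimum of 20 tolerates a few missing or flagged values. Because the variable refers to a location set, each location's time series would be checked individually.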

Contents of check for setting flags per time step

  • id: Identifier of the check.
  • variableDefinition: Embedded variable definition (see above).
  • checkRelativePeriod: The check will only consider data in this time period. This time period is relative to the timeZero of the taskrun in which the module instance runs. The start and end of the period are included. This period overrules any relativeViewPeriods specified in the timeSeriesSets of the time series.
  • expression: A comparison between one or more variableIds.
  • validatingVariableId: One or more identifiers for variables for which the flags have to be modified.
  • outputFlag: New flag value for time steps for which there is valid data and the expression fails. Either doubtful or unreliable.

It is not possible to compare two different location sets both containing more than one location id, but the following options are possible:

  • one location with a scalar
  • all the locations in a location set with a scalar
  • two different locations
  • one location with all the locations in a location set
  • two similar locationSets, containing exactly the same location ids
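
One of the combinations listed above, a single location compared with all the locations in a location set, could be sketched as follows. The variableIds reference and neighbours, the check id and the threshold of 3 are hypothetical, and the assumption that each location in the set is compared with the single location per time step is an interpretation of the list above, not confirmed behaviour.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<seriesComparisonCheck id="locationAgainstLocationSet">
		<!-- single location used as the reference series -->
		<variableDefinition>
			<variableId>reference</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheck</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationId>location1</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<!-- all locations in one location set -->
		<variableDefinition>
			<variableId>neighbours</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheck</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationSetId>locationset1</locationSetId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<!-- the flag is changed where this comparison fails, i.e. where the difference exceeds 3 -->
		<expression>abs(neighbours - reference) .le. 3</expression>
		<!-- only the location set series are flagged; the reference series is left untouched -->
		<validatingVariableId>neighbours</validatingVariableId>
		<outputFlag>doubtful</outputFlag>
	</seriesComparisonCheck>
</secondaryValidation>
```

Listing only neighbours as validatingVariableId is a design choice in this sketch: it treats the single location as trusted and flags only the series that deviate from it.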

Configuration example for checks for generating log events

<?xml version="1.0" encoding="UTF-8"?>
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<variableDefinition>
		<variableId>input1</variableId>
		<timeSeriesSet>
			<moduleInstanceId>MinReliableValuesCheckTest</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>H.obs</parameterId>
			<locationId>location1</locationId>
			<timeSeriesType>external historical</timeSeriesType>
			<timeStep unit="minute" multiplier="15"/>
			<!-- any relativeViewPeriod here will always be overruled by checkRelativePeriod in each check -->
			<readWriteMode>read only</readWriteMode>
		</timeSeriesSet>
	</variableDefinition>
	<variableDefinition>
		<variableId>input2</variableId>
		<timeSeriesSet>
			<moduleInstanceId>MinReliableValuesCheckTest</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>H.obs</parameterId>
			<locationId>location2</locationId>
			<timeSeriesType>external historical</timeSeriesType>
			<timeStep unit="minute" multiplier="15"/>
			<!-- any relativeViewPeriod here will always be overruled by checkRelativePeriod in each check -->
			<readWriteMode>read only</readWriteMode>
		</timeSeriesSet>
	</variableDefinition>

	<minNonMissingValuesCheck id="MinNonMissingValuesCheck">
		<variable>
			<variableId>input1</variableId>
		</variable>
		<variable>
			<variableId>input2</variableId>
		</variable>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<minNumberOfValues>18</minNumberOfValues>
		<logLevel>INFO</logLevel>
		<logEventCode>TimeSeries.Check</logEventCode>
		<logMessage>Not enough values available for time series %header%</logMessage>
	</minNonMissingValuesCheck>

	<minNumberOfValuesCheck id="MinNumberOfValuesCheck">
		<variable>
			<variableId>input1</variableId>
		</variable>
		<variable>
			<variableId>input2</variableId>
		</variable>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<minNumberOfValues>24</minNumberOfValues>
		<logLevel>DEBUG</logLevel>
		<logEventCode>TimeSeries.Check</logEventCode>
		<logMessage>Not enough values available for time series %header%</logMessage>
	</minNumberOfValuesCheck>

	<minReliableOrDoubtfulValuesCheck id="MinReliableOrDoubtfulValuesCheck">
		<variable>
			<variableId>input1</variableId>
		</variable>
		<variable>
			<variableId>input2</variableId>
		</variable>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<minNumberOfValues>12</minNumberOfValues>
		<logLevel>WARN</logLevel>
		<logEventCode>TimeSeries.Check</logEventCode>
		<logMessage>Not enough values available for time series %header%</logMessage>
	</minReliableOrDoubtfulValuesCheck>

	<minReliableValuesCheck id="MinReliableValuesCheck">
		<variable>
			<variableId>input1</variableId>
		</variable>
		<variable>
			<variableId>input2</variableId>
		</variable>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<minNumberOfValues>6</minNumberOfValues>
		<logLevel>WARN</logLevel>
		<logEventCode>TimeSeries.Check</logEventCode>
		<logMessage>Not enough values available for time series %header%</logMessage>
	</minReliableValuesCheck>
</secondaryValidation>

Configuration examples for checks for setting flags per time step

The expression is always a comparison. Within the XML, the comparison operator is one of .ne., .eq., .gt., .ge., .lt. or .le. (not equal, equal, greater than, greater than or equal, less than, less than or equal). Each variable has to be a single word without spaces. Mathematical symbols or functions like e, pi or cos cannot be used as variableId.

A simple configuration example for the seriesComparisonCheck is given below. It checks the values that are reliable or doubtful and marks them as unreliable if they are smaller than thirteen:

<?xml version="1.0" encoding="UTF-8"?>
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<!-- comparison between location variable and scalar, set to unreliable -->
	<seriesComparisonCheck id="checkWithScalar">
		<variableDefinition>
			<variableId>H_obs_location1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheck</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationId>location1</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<expression>H_obs_location1 .ge. 13</expression>
		<validatingVariableId>H_obs_location1</validatingVariableId>
		<outputFlag>unreliable</outputFlag>
	</seriesComparisonCheck>
</secondaryValidation>
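
Since checks are processed in the configured order, the same pattern might be extended to a two-level limit, as in the sketch below: values above a soft limit become doubtful and values above a hard limit become unreliable. The check ids and the limits 5 and 8 are hypothetical, and the sketch assumes that several seriesComparisonCheck elements may appear in one configuration file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">
	<!-- soft limit: the expression fails for values above 5, which are then set to doubtful -->
	<seriesComparisonCheck id="softUpperLimit">
		<variableDefinition>
			<variableId>H_obs_location1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheck</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationId>location1</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<expression>H_obs_location1 .le. 5</expression>
		<validatingVariableId>H_obs_location1</validatingVariableId>
		<outputFlag>doubtful</outputFlag>
	</seriesComparisonCheck>
	<!-- hard limit: the expression fails for values above 8, which are then set to unreliable -->
	<seriesComparisonCheck id="hardUpperLimit">
		<variableDefinition>
			<variableId>H_obs_location1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheck</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationId>location1</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<expression>H_obs_location1 .le. 8</expression>
		<validatingVariableId>H_obs_location1</validatingVariableId>
		<outputFlag>unreliable</outputFlag>
	</seriesComparisonCheck>
</secondaryValidation>
```

A value above 8 fails both expressions. Because the check applies to values that are reliable or doubtful, a value set to doubtful by the first check should still be picked up by the second one and end up as unreliable.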

A more complex example compares different parameters in similar location sets. It marks values that were reliable or doubtful as unreliable
when the difference between them is greater than three:

<?xml version="1.0" encoding="UTF-8"?>
<secondaryValidation xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/secondaryValidation.xsd">

<!-- comparison of variables with similar location sets, different parameters, does comparison per location  -->
	<seriesComparisonCheck id="similarLocationSetSeriesComparisonCheck">
		<!-- referred to by locationset1 and locationset2-->
		<variableDefinition>
			<variableId>H_obs1_location1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs1</parameterId>
				<locationId>location1</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>

		<!-- referred to by locationset1 and locationset2-->
		<variableDefinition>
			<variableId>H_obs1_location2</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs1</parameterId>
				<locationId>location2</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<!-- referred to by locationset1 and locationset2-->
		<variableDefinition>
			<variableId>H_obs2_location1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs2</parameterId>
				<locationId>location1</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>
		<!-- referred to by locationset1 and locationset2-->
		<variableDefinition>
			<variableId>H_obs2_location2</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs2</parameterId>
				<locationId>location2</locationId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>

		<variableDefinition>
			<variableId>locationSet1</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationSetId>locationset1</locationSetId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>

		<variableDefinition>
			<variableId>locationSet2</variableId>
			<timeSeriesSet>
				<moduleInstanceId>SeriesComparisonCheckTest</moduleInstanceId>
				<valueType>scalar</valueType>
				<parameterId>H.obs</parameterId>
				<locationSetId>locationset2</locationSetId>
				<timeSeriesType>external historical</timeSeriesType>
				<timeStep unit="minute" multiplier="15"/>
				<readWriteMode>read only</readWriteMode>
			</timeSeriesSet>
		</variableDefinition>

		<checkRelativePeriod unit="hour" start="-12" end="0"/>
		<expression>abs(locationSet1 - locationSet2) .le. 3</expression>
		<validatingVariableId>locationSet1</validatingVariableId>
		<validatingVariableId>locationSet2</validatingVariableId>
		<outputFlag>unreliable</outputFlag>
	</seriesComparisonCheck>
</secondaryValidation>

