...

The concept of validation rules was introduced to solve a common problem in the operational use of aggregation transformations. When, for example, a yearly average was computed by aggregating over an entire year, a single missing value in the input caused the yearly average to be a missing value as well.

The validation rules provide a solution for these types of situations. They make it possible to configure in which cases an output value should still be computed although the input contains missing and/or doubtful values.

The validation rules are optional in the configuration. They can be used to define the output flag and the custom flag source of the output value, based on the percentage of missing/unreliable values and/or the percentage of doubtful values in the input. The available output flags are reliable, doubtful and missing.

With these rules it is possible to define, for example, that the output of the transformation is reliable if 10% or less of the input is unreliable and/or missing, and that the output is a missing value if this percentage is above 10%.

It is important to note that input values which are missing and input values which are marked as unreliable are treated the same. Both are seen as missing values by the validation rules. Using validation rules prevents a single missing value in the input from leading to a missing value in the aggregated output.

Below is the configuration of the basic example described above.

...

The configured validation rules are applied in the order in which they are configured, so the first rule is tried first. In the example above the first rule states that if 10% or less of the input is missing (or unreliable), the output flag is set to reliable. If the input doesn't meet the criteria of the first rule, the transformation module tries to apply the second rule. In this case the second rule always applies because a percentage of 100% is configured.
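As a reading aid, the two rules described above could be written as follows. This is a sketch reconstructed from the prose, using the element names that appear in the other examples on this page; it is not necessarily identical to the original listing.

Code Block
				<!-- Rule 1: 10% or less of the input missing/unreliable gives a reliable output -->
				<validationRule>
					<inputMissingPercentage>10</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<!-- Rule 2: catch-all, applies whenever rule 1 does not -->
				<validationRule>
					<inputMissingPercentage>100</inputMissingPercentage>
					<outputValueFlag>missing</outputValueFlag>
				</validationRule>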

Configuring a rule with a percentage of 100% is a recommended way of configuring the validation rules. By default, if validation rules are configured and none of the configured rules applies, the output is set to missing. This means that in this case the second rule of 100% was not strictly necessary, because it matches the default hard-coded behaviour of the system.

For the users of the system, however, the behaviour of the aggregation is easier to understand when it is configured explicitly than when it relies on a hard-coded fallback mechanism in the software.
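For comparison, a configuration that relies on the default fallback instead of an explicit 100% rule would contain only the 10% rule. The fragment below is an illustrative sketch based on the behaviour described above, with element names taken from the other examples on this page.

Code Block
				<!-- Only rule: 10% or less of the input missing/unreliable gives a reliable output -->
				<validationRule>
					<inputMissingPercentage>10</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<!-- No further rule: if the rule above does not apply, the output is set to missing by default -->

The result is the same as with an explicit 100% rule, but the fallback is no longer visible in the configuration.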

...

Below is a configuration example in which the rules above are implemented.

Code Block
				<validationRule>
					<inputMissingPercentage>15</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>40</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>100</inputMissingPercentage>
					<outputValueFlag>missing</outputValueFlag>
				</validationRule>

The example shows that in total 3 validation rules were needed. The first rule checks if 15% or less of the input is missing/unreliable. If this is not the case, it is checked whether the second rule can be applied. The second rule states that if 40% or less of the input is missing, the output flag should be set to doubtful. The last rule takes care of all other situations. Note that it has a configured percentage of 100%, which means that this rule always applies. However, because 2 rules are defined above it, FEWS will always try to apply those rules first before applying this one.
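To make the order of evaluation concrete, the three rules are repeated below with the outcome for a hypothetical aggregation of four 15-minute values into one hourly value. The time steps are taken from the configuration example at the end of this page. Two points are assumptions made for illustration: that each output value is based on exactly four input values, and that a rule applies when the percentage is less than or equal to the configured value.

Code Block
				<!-- 0 of 4 values missing = 0%: 0 is not above 15, so rule 1 applies and the output is reliable -->
				<validationRule>
					<inputMissingPercentage>15</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<!-- 1 of 4 values missing = 25%: above 15 but not above 40, so rule 2 applies and the output is doubtful -->
				<validationRule>
					<inputMissingPercentage>40</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
				</validationRule>
				<!-- 2, 3 or 4 of 4 values missing = 50%, 75% or 100%: only rule 3 applies and the output is missing -->
				<validationRule>
					<inputMissingPercentage>100</inputMissingPercentage>
					<outputValueFlag>missing</outputValueFlag>
				</validationRule>

With only four inputs per output value, a reliable output is therefore only possible when no input value is missing.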

In some cases one would like to distinguish between situations in which the output flag is the same. In the example above, if all of the input values were reliable, the output is marked as reliable. But if, for example, 10% of the input values were unreliable, the output is also marked as reliable.

It would be useful if the user of the system could see in the GUI of FEWS why the output was marked as reliable. Were there missing values in the input or not? Is the output based on input with a few missing values?

To make this possible, the concept of the custom flag source was added to the validation rules. In addition to configuring an output flag, it is also possible to configure a custom flag source. In the table of the time series dialog the custom flag source can be made visible by pressing Ctrl + Shift + J. This adds a column to the table in which the custom flag source ids are shown. In the graph itself it is also possible to make the custom flag sources visible, by pressing Ctrl + Alt + V. To use custom flag sources, a file CustomFlagSources.xml should be added to the RegionConfig directory. In this file the custom flag sources should be defined. By configuring several rules which have the same output flag but a different custom flag source, it is possible to distinguish between situations in which the output flag is the same.

Below is an example in which the output is reliable both when no missing values are found in the input and when the total percentage of missing values is 15% or less. In the first case the output doesn't get a custom flag source assigned, while in the second case it gets a custom flag source that is visible in the GUI, indicating that an output value was calculated but that missing values were found in the input.

Code Block
				<validationRule>
					<inputMissingPercentage>0</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>15</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
					<outputCustomFlagSourceId>CA</outputCustomFlagSourceId>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>40</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>100</inputMissingPercentage>
					<outputValueFlag>missing</outputValueFlag>
				</validationRule>

Finally, it is also possible to define validation rules based on the percentage of doubtful values in the input. It is even possible to define validation rules based on a combination of an allowed percentage of unreliable/missing values and an allowed percentage of doubtful values. In this case, too, the rules are applied in the order in which they are configured. The first rule which applies to the current situation is used.

Let's say, for example, that we also want rules for the doubtful input values. When only a small number of input values are doubtful, we still want the output to be reliable. Otherwise we would like the output to be doubtful, but with a custom flag source which gives an indication of how many of the input values were doubtful.

...

Code Block
				<validationRule>
					<inputDoubtfulPercentage>10</inputDoubtfulPercentage>
					<inputMissingPercentage>0</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputDoubtfulPercentage>30</inputDoubtfulPercentage>
					<inputMissingPercentage>0</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
					<outputCustomFlagSourceId>D1</outputCustomFlagSourceId>
				</validationRule>
				<validationRule>
					<inputDoubtfulPercentage>60</inputDoubtfulPercentage>
					<inputMissingPercentage>0</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
					<outputCustomFlagSourceId>D2</outputCustomFlagSourceId>
				</validationRule>
				<validationRule>
					<inputDoubtfulPercentage>100</inputDoubtfulPercentage>
					<inputMissingPercentage>0</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
					<outputCustomFlagSourceId>D3</outputCustomFlagSourceId>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>0</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>15</inputMissingPercentage>
					<outputValueFlag>reliable</outputValueFlag>
					<outputCustomFlagSourceId>CA</outputCustomFlagSourceId>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>40</inputMissingPercentage>
					<outputValueFlag>doubtful</outputValueFlag>
				</validationRule>
				<validationRule>
					<inputMissingPercentage>100</inputMissingPercentage>
					<outputValueFlag>missing</outputValueFlag>
				</validationRule>

The explanation above gives a good idea of the possibilities offered by the validation rules.

To explain

Time | Input value | Output value | Output flag | Custom flag source
1-1-2012 00:15 | | | |
1-1-2012 00:30 | | | |
1-1-2012 00:45 | | | |
1-1-2012 01:00 | | | |
1-1-2012 01:15 | | | |
1-1-2012 01:30 | | | |
1-1-2012 01:45 | | | |
1-1-2012 02:00 | | | |
1-1-2012 02:15 | | | |
1-1-2012 02:30 | | | |
1-1-2012 02:45 | | | |
1-1-2012 03:00 | | | |
1-1-2012 03:15 | | | |
1-1-2012 03:30 | | | |
1-1-2012 03:45 | | | |
1-1-2012 04:00 | | | |

Configuration example
Code Block
	<transformation id="aggregation accumulative">
		<aggregation>
			<accumulative>
				<inputVariable>
					<timeSeriesSet>
						<moduleInstanceId>ImportTelemetry</moduleInstanceId>
						<valueType>scalar</valueType>
						<parameterId>H.obs</parameterId>
						<locationSetId>hydgauges</locationSetId>
						<timeSeriesType>external historical</timeSeriesType>
						<timeStep unit="minute" multiplier="15"/>
						<relativeViewPeriod unit="day" startOverrulable="true" start="-7" end="0"/>
						<readWriteMode>read only</readWriteMode>
						<delay unit="minute" multiplier="0"/>
					</timeSeriesSet>
				</inputVariable>
				<outputVariable>
					<timeSeriesSet>
						<moduleInstanceId>Aggregate_Historic</moduleInstanceId>
						<valueType>scalar</valueType>
						<parameterId>accumulative</parameterId>
						<locationSetId>hydgauges</locationSetId>
						<timeSeriesType>external historical</timeSeriesType>
						<timeStep unit="hour" multiplier="1"/>
						<relativeViewPeriod unit="day" startOverrulable="true" start="-7" end="0"/>
						<readWriteMode>add originals</readWriteMode>
						<synchLevel>1</synchLevel>
					</timeSeriesSet>
				</outputVariable>
			</accumulative>
		</aggregation>
	</transformation>