...

What

nameofinstance.xml

Description

Configuration for the new version of the transformation module

schema location

https://fewsdocs.deltares.nl/schemas/version1.0/transformationModule.xsd

Entry in ModuleDescriptors

<moduleDescriptor id="TransformationModule">
<description>Transformation Module</description>
<className>nl.wldelft.fews.system.plugin.transformationmodule.TransformationModule</className>
</moduleDescriptor>


...

TransformHBV_Inputs 1.00 default.xml.

TransformHBV_Inputs

File name for the TransformHBV_Inputs configuration.

1.00

Version number

default

Flag to indicate the version is the default configuration (otherwise omitted).

The configuration for the transformation module consists of two parts: transformation configuration files in the Config/ModuleConfigFiles directory and coefficient set configuration files in the Config/CoefficientSetsFiles directory.

...

The validation rules are optional in the configuration. They can be used to define the output flag and the custom flagsource of the output value, based on the number of missing/unreliable values and/or the number of doubtful values among the input values used per aggregation time step. The available output flags are reliable, doubtful and missing.

...

Below is the configuration of the basic example described above.

Code Block
languagexml
<validationRule>
  <inputMissingPercentage>10</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
  <inputMissingPercentage>100</inputMissingPercentage>
  <outputValueFlag>missing</outputValueFlag>
</validationRule>

...

The configuration example below implements the rules above.

Code Block
languagexml
<validationRule>
  <inputMissingPercentage>15</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
  <inputMissingPercentage>40</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
</validationRule>
<validationRule>
  <inputMissingPercentage>100</inputMissingPercentage>
  <outputValueFlag>missing</outputValueFlag>
</validationRule>

...

Below is an example in which the output is reliable both when there are no missing values in the input and when the percentage of missing values is less than 15%. In the first case, however, the output does not get a custom flagsource assigned, while in the second case it gets a custom flagsource that is visible in the GUI, indicating that an output value was calculated although missing values were found in the input.

Code Block
languagexml
<validationRule>
  <inputMissingPercentage>0</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
  <inputMissingPercentage>15</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
  <outputCustomFlagSourceId>CA</outputCustomFlagSourceId>
</validationRule>
<validationRule>
  <inputMissingPercentage>40</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
</validationRule>
<validationRule>
  <inputMissingPercentage>100</inputMissingPercentage>
  <outputValueFlag>missing</outputValueFlag>
</validationRule>

...

Below is a configuration example.

Code Block
languagexml
<validationRule>
  <inputDoubtfulPercentage>10</inputDoubtfulPercentage>
  <inputMissingPercentage>0</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
  <inputDoubtfulPercentage>30</inputDoubtfulPercentage>
  <inputMissingPercentage>0</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
  <outputCustomFlagSourceId>D1</outputCustomFlagSourceId>
</validationRule>
<validationRule>
  <inputDoubtfulPercentage>60</inputDoubtfulPercentage>
  <inputMissingPercentage>0</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
  <outputCustomFlagSourceId>D2</outputCustomFlagSourceId>
</validationRule>
<validationRule>
  <inputDoubtfulPercentage>100</inputDoubtfulPercentage>
  <inputMissingPercentage>0</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
  <outputCustomFlagSourceId>D3</outputCustomFlagSourceId>
</validationRule>
<validationRule>
  <inputMissingPercentage>15</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
  <outputCustomFlagSourceId>CA</outputCustomFlagSourceId>
</validationRule>
<validationRule>
  <inputMissingPercentage>40</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
</validationRule>
<validationRule>
  <inputMissingPercentage>100</inputMissingPercentage>
  <outputValueFlag>missing</outputValueFlag>
</validationRule>

...

In the examples above the inputMissingPercentage and the inputDoubtfulPercentage were hard-coded in the configuration file. However, it is also possible to refer to an attribute of a location. To reference an attribute, enclose the attribute id in @ characters.

Code Block
languagexml
<inputMissingPercentage>@MV@</inputMissingPercentage>
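
As a minimal sketch, a complete rule that takes its threshold from a location attribute could look as follows. It assumes that every location processed by the transformation has a numeric attribute with the id MV, as in the line above; the output flag is chosen only for illustration.

```xml
<!-- Sketch: the missing-value threshold is read per location from the
     attribute MV (assumed to be defined for all locations involved) -->
<validationRule>
  <inputMissingPercentage>@MV@</inputMissingPercentage>
  <outputValueFlag>reliable</outputValueFlag>
</validationRule>
```

With such a set-up the threshold can differ per location without duplicating the transformation.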

...

To explain the concept of the validation rules further, the table below shows the input and output time series of an accumulative aggregation transformation that uses the validation rules shown in the last example above.


 

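| Time | Input value | Input flag | Output value | Output flag | Custom flagsource |
|---|---|---|---|---|---|
| 1-1-2012 00:15 | | | | | |
| 1-1-2012 00:30 | 1 | | | | |
| 1-1-2012 00:45 | 1 | | | | |
| 1-1-2012 01:00 | 1 | | 3 | doubtful | - |
| 1-1-2012 01:15 | | | | | |
| 1-1-2012 01:30 | 1 | | | | |
| 1-1-2012 01:45 | | | | | |
| 1-1-2012 02:00 | 1 | | NaN | - | - |
| 1-1-2012 02:15 | 1 | | | | |
| 1-1-2012 02:30 | 1 | doubtful | | | |
| 1-1-2012 02:45 | 1 | | | | |
| 1-1-2012 03:00 | 1 | | 4 | doubtful | D1 |
| 1-1-2012 03:15 | 1 | | | | |
| 1-1-2012 03:30 | 1 | | | | |
| 1-1-2012 03:45 | 1 | | | | |
| 1-1-2012 04:00 | 1 | | 4 | reliable | |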

...



The first output value is set to doubtful because in this case the total percentage of missing values is 25%, which means that the following rule is applied.

Code Block
languagexml
<validationRule>
  <inputMissingPercentage>40</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
</validationRule>

 


The second output value is a missing value because in this case the percentage of missing values is 50%. This means that the following rule is applied.

Code Block
languagexml
<validationRule>
  <inputMissingPercentage>100</inputMissingPercentage>
  <outputValueFlag>missing</outputValueFlag>
</validationRule>

 


The third output value is set to doubtful. The input doesn't contain missing values but has a single doubtful input value. The percentage of doubtful values in the input is therefore 25% which means that the following rule will be applied.

Code Block
languagexml
<validationRule>
  <inputDoubtfulPercentage>30</inputDoubtfulPercentage>
  <inputMissingPercentage>0</inputMissingPercentage>
  <outputValueFlag>doubtful</outputValueFlag>
  <outputCustomFlagSourceId>D1</outputCustomFlagSourceId>
</validationRule>

...

Since FEWS 2017.02 it is possible to configure whether manual edits should be preserved. This setting applies to all transformations that are configured. The default is false. For an example configuration see:

Code Block
languagexml
<transformationModule xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.wldelft.nl/fews https://fewsdocs.deltares.nl/schemas/version1.0/transformationModule.xsd" version="1.0">
   <preserveManualEdits>true</preserveManualEdits>
   <!-- variables and transformations follow here -->
</transformationModule>

 


 Description

Since 2019.02 an optional field description is available. This field can be configured per transformation, and the text will be shown in the workflow tree as a mouse-over label (tooltip). It can be used with any type of transformation.

Example:

Code Block
languagexml
	<transformation id="merge">
		<merge>
			<simple>
				<inputVariable>
					<variableId>Wiski</variableId>
				</inputVariable>
				<inputVariable>
					<variableId>Server</variableId>
				</inputVariable>
				<fillGapConstant>0</fillGapConstant>
				<outputVariable>
					<variableId>merge1</variableId>
				</outputVariable>
			</simple>
		</merge>
		<description>transformation description</description>
	</transformation>

Copying comments, flags and flag sources from input to output

<TODO>


Preventing previously calculated values from being overwritten with missing values

Delft-FEWS processes data in moving windows relative to the timezero of the workflow. These workflows can be run several times per day. Under certain conditions this can cause a transformation to calculate a missing value for a date and time for which a correct value was calculated earlier. This can lead to warnings such as:

Existing value overwritten by missing
Suspicious write action. Long time series written with only changes at the start and at the end. If this happens often this will explode the database.

This mechanism is illustrated with the image below, which shows a water level for which an average per day is calculated. The box shows the moving window (relativeViewPeriod). The image shows the daily values (red dots) calculated in the first run.

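[Image: water level with the daily averages (red dots) calculated in the first run; the box marks the moving window]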

In a later run (illustrated below), there are not enough values to calculate a daily value for 24 December. This run will therefore return a missing value for 24 December (while calculating a new value for 27 December).

[Image: later run, in which no daily value can be calculated for 24 December]


If these transformations wrote directly to the same output time series (in this case with timeSeriesType “external historical”), the later run would cause the average water level for 24 December to be overwritten with a missing value. The log would include warnings about this.

To prevent this, the output time series should be of timeSeriesType “temporary”, and should then be merged (merge / simple transformation) with the final time series. This way, missing values will not overwrite previously calculated values.
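
The merge step described above could be sketched as follows. The transformation id and the variable ids DailyMean_temporary and DailyMean_final are hypothetical, and their variable definitions (with the time series types described above) are assumed to exist elsewhere in the same configuration file. The sketch also assumes that the simple merge gives priority to the first input variable and falls back on the next one where the first is missing.

```xml
<!-- Sketch with hypothetical ids. DailyMean_temporary is assumed to be of
     timeSeriesType "temporary"; DailyMean_final is the stored series. -->
<transformation id="mergeDailyMean">
  <merge>
    <simple>
      <!-- newly calculated values take priority -->
      <inputVariable>
        <variableId>DailyMean_temporary</variableId>
      </inputVariable>
      <!-- previously stored values are used where the new run returned a missing value -->
      <inputVariable>
        <variableId>DailyMean_final</variableId>
      </inputVariable>
      <outputVariable>
        <variableId>DailyMean_final</variableId>
      </outputVariable>
    </simple>
  </merge>
</transformation>
```

Under these assumptions, a missing value for 24 December in the temporary series leaves the stored value for that day unchanged, while the new value for 27 December is still written.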


Run transformations for a set of selected locations

In some cases it is useful to run a transformation only for a specific set of locations, for example when the entire workflow has already run and there is only a change at a specific location. This situation can occur, for example, when a water level has been edited or when the configuration has changed.

In this case the workflow can skip the calculations for the unchanged locations. The main benefit of this approach is that it saves a lot of processing time.

This functionality is available in FEWS. However, it is important to understand that it cannot be used in all workflows. It can be applied to transformations only; it cannot be used for running models or secondary validations. When a transformation is started for a location selection, the transformation will only run when the location of one of its input time series is selected. When a transformation has created output for a location which was not selected by the user, this location will be added to the selection.

It is possible to run a workflow for a selected set of locations from the IFD, the task dialog and the manual forecast dialog. By default workflows cannot be run for a selected set of locations. To enable this, the option allowSelection should be set to true in the workflow descriptor of the workflow. Below is an example.

Code Block
languagexml
<workflowDescriptor id="FillRelations" forecast="false" visible="true" autoApprove="false">
  <description>This task fills gaps longer than 2 hours by means of relations.</description>
  <allowSelection>true</allowSelection>
  <schedulingAllowed>true</schedulingAllowed>
</workflowDescriptor>


When a node is selected in the IFD with a workflow which has the allowSelection option set to true, the GUI will look like this:

[Image: IFD with a selected node whose workflow has allowSelection set to true]

Two checkboxes will appear in the property dialog below the tree with the nodes.

The first checkbox enables the option to run a workflow for a specific set of locations. The second checkbox enables the option to run the workflow for a specified period.

In the taskrun dialog an additional checkbox will appear.

[Image: taskrun dialog with the additional checkbox]

 Which locations should the user select?

The transformation will run for the selected locations. If the location of one of the input time series is selected in the filters or on the map, the transformation will run.

This means that the user should select the locations which are changed. This can be a change in the data or a change in the configuration.

This can be explained with a simple example. Let's say we have a system with a workflow that consists of a user simple function which estimates the water level at location B by simply copying the water level at location A to location B.

After the copying a set of statistical transformations are run to compute statistics.

The user edits the water level at location A and wants to recompute the water level at location B. However, the workflow which does this is configured to make similar estimates at another 500 locations. In this case the user should select location A.

When the run starts, the majority of the calculations are skipped. The water level for location B is recalculated, because location A, the location of one of the input time series, is selected. When this calculation is done, FEWS will remember that location B has now also changed and will add location B to the list of selected locations. When the statistical transformations are started after the water level at location B has been recomputed, the statistics for location B are also recalculated: the transformation which computes them has an input time series at location B, and location B was added to the list of selected locations.

This functionality cannot be used for spatial transformations. Before enabling this option for a workflow, the configurator should check if the workflow contains spatial transformations.

In addition to the above, this functionality can only be used for non-forecast workflows. Typically this functionality should be used for pre-processing or post-processing.

For these reasons it is by default not possible to run a workflow for a specific location selection. This is only possible when the option allowSelection is set to true in the workflow descriptor. This option should only be set to true after the configurator has checked that the workflow is suitable for running for a specific location selection.

Steps to follow when implementing selection specific calculations

The following steps should be followed when implementing this functionality.
1) Decide in which situations this functionality is needed.
2) Make a list of the workflows which need to run in these situations.
3) Ensure that the workflow only consists of transformations for which this functionality can be used.
4) Move transformations or other parts of the workflow which are suitable for this type of operation to another workflow.
5) Set the option allowSelection to true in the workflow descriptor of the workflow which can be used for selection-specific calculations.
6) When the workflows will be started from the taskrun dialog or the manual forecast dialog, no additional configuration is needed. These displays are available in almost every FEWS system. However, when the IFD will be used for this, the following additional steps should be taken.

Implement selection specific calculations for IFD

The first step is to create a topology.xml to configure the content of the tree from which the workflows should be started.

Detailed information about configuring the topology.xml can be found at 24 Topology

The following steps should be taken when using the IFD for selection specific calculations.

  1. First create the tree structure by creating nodes in the topology.xml.
  2. Add workflows to the nodes.
  3. Add dependencies to the nodes by configuring the previous nodes.
  4. By default leaf nodes will run locally and not at the server. This is not desired in this case. Therefore the option localRun should be set to false for the leaf nodes.

Below is an example (part of the topology.xml).

Code Block
languagexml
<nodes id="HDSR">
 <nodes id="oppervlaktewaterstand">
  <relativePeriod unit="week" start="-52" end="0" />
  <node id="vul gaten kleiner dan 2 uur">
   <previousNodeId>secondary validatie</previousNodeId>
   <workflowId>FillGap2H_WerkOpvlWater</workflowId>
   <filterId>Fillgap</filterId>
   <localRun>false</localRun>
  </node>
  <node id="vul gaten groter dan 2 uur">
   <previousNodeId>vul gaten kleiner dan 2 uur</previousNodeId>
   <workflowId>FillRelations</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="berekening debieten">
   <previousNodeId>vul gaten groter dan 2 uur</previousNodeId>
   <workflowId>DebietBerekening</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="sample werkfilterdata nonequi naar 15min">
   <previousNodeId>berekening debieten</previousNodeId>
   <workflowId>SampleRuwNaar15M</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="langsprofielen berekenen">
   <previousNodeId>sample werkfilterdata nonequi naar 15min</previousNodeId>
   <workflowId>Langsprofiel</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="aggregatie van kwartier naar uur">
   <previousNodeId>langsprofielen berekenen</previousNodeId>
   <workflowId>AggregeerWerkOpvlWater</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="Peilbesluit evaluatie">
   <previousNodeId>aggregatie van kwartier naar uur</previousNodeId>
   <workflowId>PeilbesluitEvaluatie</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="export LIZARD">
   <previousNodeId>Peilbesluit evaluatie</previousNodeId>
   <workflowId>ExportCIW</workflowId>
   <localRun>false</localRun>
  </node>
  <node id="export WIS-REPORTS">
   <previousNodeId>export LIZARD</previousNodeId>
   <workflowId>ExportCIW</workflowId>
   <localRun>false</localRun>
  </node>
 </nodes>
</nodes>

The second step is to add the following lines to the explorer.xml to add the IFD tool window to the system.

Code Block
languagexml
<explorerTask name="Forecasts">
  <predefinedDisplay>topology tree</predefinedDisplay>
  <toolbarTask>false</toolbarTask>
  <menubarTask>false</menubarTask>
  <toolWindow>true</toolWindow>
  <loadAtStartup>true</loadAtStartup>
</explorerTask>

Trim Output

A boolean option <trimOutput> is available within transformations. When true, missing values at the start and end of the output are removed before the data is written to the database. This can prevent existing values from being overwritten with missing values.

Forecast Loop 

For some transformations it is possible to define a forecast loop by configuring a <forecastLoopSearchPeriod>. This means that the transformation will be run for each forecast found within that period.

This will only work when the <inputVariable> and <outputVariable> are external forecasts. The output variable will get the same external forecast time as the input time series.

When this is configured in combination with a locationSet it will try to run the transformation for the maximum number of forecasts available for the locations. If for a location some forecasts are unavailable a warning will be logged and those transformation runs will be skipped for that forecast and location combination. 

List of all available transformations

For the most recent development version see the xsd schema at https://fewsdocs.deltares.nl/schemas/version1.0/transformationTypes.xsd

Available since stable build 2014.01: