What | nameofinstance.xml |
---|---|
Description | Configuration for the new version of the transformation module |
schema location | |
Entry in ModuleDescriptors
<moduleDescriptor id="Transformation">
<description>General Transformation Component</description>
<className>nl.wldelft.fews.system.plugin.transformation.TransformationController</className>
</moduleDescriptor>
The Transformation module is a general-purpose module that allows for generic transformation and manipulation of time series data. It may be configured for simple arithmetic manipulation, time interval transformation, shifting a series in time, and so on, as well as for applying specific hydro-meteorological transformations such as stage-discharge relationships. The module may be configured to provide for:
- Manipulation of one or more series using a standard library of arithmetic operators/functions:
- Addition, subtraction, division, multiplication
- Power function, exponential function
- Hydro-meteorological functions like:
- Deriving discharges from stages
- Compute potential evaporation
- Calculating weighted catchment average rainfall
- Shifting series in time
- Time interval conversion:
- Aggregation
- Dis-aggregation
- Converting non-equidistant to equidistant series
- Creating astronomical tide series from harmonic components
- Handling of typical profiles
- Data hierarchy
- Selection of (tidal) peaks
- Statistics
When available as configuration on the file system, the name of the XML file for configuring an instance of the transformation module called for example TransformHBV_Inputs may be:
TransformHBV_Inputs 1.00 default.xml
TransformHBV_Inputs | File name for the TransformHBV_Inputs configuration |
---|---|
1.00 | Version number |
default | Flag to indicate the version is the default configuration (otherwise omitted) |
...
transformationSet
Root element for the definition of a transformation (processing an input to an output). Multiple entries may exist.
Attributes:
- transformationId : Id of the transformation defined. Used for reference purposes only. This Id will be included in log messages generated.
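Putting this together, a transformationSet might look like the following sketch. The id and variable names are hypothetical and the function body is omitted; the child elements are described in the sections below.

```xml
<transformationSet transformationId="TransformHBV_Inputs">
  <inputVariable variableId="P.obs">
    <!-- a time series set, typical profile or harmonic components -->
  </inputVariable>
  <arithmeticFunction>
    <!-- arithmetic expression referencing the input variable ids -->
  </arithmeticFunction>
</transformationSet>
```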
Figure 58 Elements of the definition of an input variable.
inputVariable
Definition of the input variables to be used in transformation. This may either be a time series set, a typical profile or a set of (harmonic) components. The InputVariable is assigned an ID. This ID is used later in the transformation functions as a reference to the data.
Attributes:
- variableId : ID of the variable (group). Later used in referencing the variable.
- variableType : Optional type definition of the variable (defaults to "any").
- convertDatum : Optional Boolean flag to indicate if the datum is to be converted.
Available harmonic components are listed in the attached file.
timeSerieSet
Definition of an input variable as a time series set (see TimeSeriesSet definition).
timeStep
Time step for typical profile if variable to be defined as typical profile.
Attributes:
- unit (enumeration of: second, minute, hour, day, week, nonequidistant)
- multiplier : defines the number of units given above in a time step (not relevant for nonequidistant time steps)
- divider : same function as the multiplier, but defines a fraction of units in a time step.
relativeViewPeriod
Relative view period of the typical profile to create. If this is defined and the time span indicated is longer than the typical profile data provided, then the profile data will be repeated until the required time span is filled. If the optional element is not provided then the typical profile data will be used only once.
data
Data entered to define the typical profile. Data can be entered in different ways. The typical profile can be defined as a series of values at the requested time step, inserted at the start of the series, or it can be mapped to specific time values (e.g. setting a profile value to hold at 03:15 of every day). Which of these is used depends on the attributes defined.
Attributes:
- value : Required value for each step in the profile.
- monthDay : Attribute indicating the value entered is valid for a month/day combination. The year value is added depending on the year in which it is used. The string has the format "--[month]-[day]". For example the 23rd of August is "--08-23".
- dateTime : Attribute indicating the value entered is valid for a specific date/time combination. The string has the format "[year]-[month]-[day]T[hour]:[minute]:[second]". For example "1984-12-31T00:00:00".
- time : Attribute indicating the value entered is valid for a specific time, irrespective of the date. The date value is added at run time. The string has the format "[hour]:[minute]:[second]". For example "01:15:00".
timeZone
Optional specification of the time zone for the data entered (see timeZone specification).
timeZone:timeZoneOffset
The offset of the time zone with reference to UTC (equivalent to GMT). Entries should define the number of hours (or fraction of hours) offset. (e.g. +01:00)
timeZone:timeZoneName
Enumeration of supported time zones. See appendix B for list of supported time zones.
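As an illustration, an input variable defined as a typical profile could be sketched as follows. The variable id and data values are hypothetical; element and attribute names are taken from the descriptions above.

```xml
<inputVariable variableId="EvapProfile">
  <timeStep unit="hour" multiplier="3"/>
  <data value="0.0" time="00:00:00"/>
  <data value="0.2" time="09:00:00"/>
  <data value="0.4" time="12:00:00"/>
  <data value="0.1" time="18:00:00"/>
  <timeZone>
    <timeZoneOffset>+01:00</timeZoneOffset>
  </timeZone>
</inputVariable>
```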
arithmeticFunction
Root element for defining a transformation as an arithmetic function (see next section for details).
hydroMeteoFunction
Root element for defining one of the available hydro-meteorological transformations.
ruleBasedTransformation
Root element for defining a rule based transformation (see next section for details on rules).
Attributes:
- rule : definition of the rule to apply. Enumeration of:
- selectpeakvalues
- selectlowvalues
- selectpeakvalueswithincertaingap
- selectlowvalueswithincertaingap
- equitononequidistant
- equitononequidistantforinstantaneousseries
- equitononequidistantforaccumulativeseries
- datahierarchy
- typicalprofiletotimeseries
- zerodegreealtitudelevel
- datatotimeseries
aggregate
Root element for defining a time aggregation transformation (rules are discussed below)
Attributes:
- rule : definition of aggregation approach. Enumeration of:
- instantaneous
- accumulative
- mean
- constant
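A sketch of an aggregation from hourly to daily values, assuming the rule is given as an attribute and the input and output variables are referenced by id. The ids are hypothetical and the exact child element names may differ per schema version; check the schema before use.

```xml
<aggregate rule="mean">
  <inputVariableId>H.obs.hourly</inputVariableId>
  <outputVariableId>H.obs.daily</outputVariableId>
</aggregate>
```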
disaggregate
Root element for defining a time dis-aggregation transformation (rules are discussed below)
Attributes:
- rule : definition of disaggregation approach. Enumeration of:
- instantaneous
- accumulative
- disaggregateusingweights
- constant
nonequidistantToEquidistant
Root element for defining the transformation of a non-equidistant time series to an equidistant time series (rules are discussed below).
Attributes:
- rule : definition of approach. Enumeration of:
- zero
- missing
- linearinterpolated
- equaltolast
Statistics
Root element for defining statistical transformations.
Season: the statistics transformation can also be carried out for a specific season which is defined by a start and end date. If multiple seasons are specified, then the statistics transformation will be carried out separately for each specified season. A warning will be given when seasons overlap in time.
- startMonthDay: defines start time of season "--mm-dd"
- endMonthDay: defines end time of season "--mm-dd"
- timeZone
Function:
- Available functions:
- max
- min
- sum
- count
- mean
- median
- standardDeviation
- percentileExceedence
- percentileNonExceedence
- quartile
- skewness
- kurtosis
- variance
- rsquared
- isBlockFunction: if true, the statistical parameters are calculated for each time window defined by the time step of the output time series, e.g. time step year leads to yearly statistical parameters. If false and the output time series time step is set to nonequidistant, the statistical parameters are calculated for the relative view period (one value for the whole period) or for the individual season if applied.
- inputVariableId
- outputVariableId
- value: if the function percentileExceedence or percentileNonExceedence is chosen, the desired percentile has to be defined, e.g. the 75th percentile => value="75"
- ignoreMissing: if true, missing values in the input time series are not taken into account in the statistical calculation.
- seasonal: this option is only relevant when using seasons. If true (default), then one result value per season per year is returned. If false, then for each season only one (combined) result value is returned. For example when seasonal is false, the month January is specified as a season, the input time series contains data for a period of ten years and the function max is specified, then the result will be the maximum of all values in January in all ten years. Note: if a specific season (e.g. January 2006) is not fully contained within the input time series, then this specific season is not used in the calculations. For example if the month January is specified as a season and the input time series contains only data from 15 January 2006 to 1 March 2008, then only January 2007 and January 2008 will be used in the calculations. In this case January 2006 will not be used in the calculations.
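Putting the options above together, a statistics transformation computing the seasonal 75th exceedence percentile might be sketched as follows. The element nesting and the variable ids are illustrative only; the option names are taken from the list above, so check the schema for the exact structure.

```xml
<statistics>
  <season>
    <startMonthDay>--01-01</startMonthDay>
    <endMonthDay>--01-31</endMonthDay>
  </season>
  <function>percentileExceedence</function>
  <value>75</value>
  <isBlockFunction>false</isBlockFunction>
  <ignoreMissing>true</ignoreMissing>
  <seasonal>true</seasonal>
  <inputVariableId>H.obs</inputVariableId>
  <outputVariableId>H.stat</outputVariableId>
</statistics>
```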
...
An improved version of the FEWS Transformation Module is currently under construction. The new version is much easier to configure than the old version, uses a new schema for configuration, and adds several new transformations.
Configuration
When available as configuration on the file system, the name of an XML file for configuring an instance of the transformation module called for example TransformHBV_Inputs may be:
TransformHBV_Inputs 1.00 default.xml
TransformHBV_Inputs | File name for the TransformHBV_Inputs configuration |
---|---|
1.00 | Version number |
default | Flag to indicate the version is the default configuration (otherwise omitted) |
The configuration for the transformation module consists of two parts: transformation configuration files in the Config/ModuleConfigFiles directory and coefficient set configuration files in the Config/CoefficientSetsFiles directory.
In a transformation configuration file one or more transformations can be configured. Some transformations require coefficient sets in which given coefficients are defined. For a given transformation that requires a coefficient set there are different ways of defining the coefficient set in the configuration. One way is to specify an embedded coefficient set in the transformation configuration itself. Another way is to put a reference in the transformation configuration. This reference consists of the name of a separate coefficient set configuration file and the id of a coefficient set in that file.
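The referenced variant could be sketched as follows. The element names, the file name and the id here are hypothetical, chosen only to illustrate the idea that the reference names a coefficient set configuration file plus the id of a coefficient set within that file; consult the schema for the exact structure.

```xml
<coefficientSet>
  <!-- hypothetical reference: coefficient set file name plus coefficient set id -->
  <coefficientSetFile>StructureCoefficientSets</coefficientSetFile>
  <coefficientSetId>Pump1</coefficientSetId>
</coefficientSet>
```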
Both the transformations and coefficient sets can be configured to be time dependent. This can be used, for instance, to define a given coefficient value to be 3 from 1 January 2008 to 1 January 2009, and to be 4 from 1 January 2009 onwards. This can be done by defining multiple periodCoefficientSets, each one with a different period, as in the following XML example.
Code Block:
<periodCoefficientSet>
<period>
<startDateTime date="2008-01-01" time="00:00:00"/>
<endDateTime date="2009-01-01" time="00:00:00"/>
</period>
<structure>
<pumpFixedDischarge>
<discharge>3</discharge>
</pumpFixedDischarge>
</structure>
</periodCoefficientSet>
<periodCoefficientSet>
<period>
<validAfterDateTime date="2009-01-01"/>
</period>
<structure>
<pumpFixedDischarge>
<discharge>4</discharge>
</pumpFixedDischarge>
</structure>
</periodCoefficientSet>
If a date is specified without a time, then the time is assumed to be 00:00:00, so <validAfterDateTime date="2009-01-01"/> is the same as <validAfterDateTime date="2009-01-01" time="00:00:00"/>. To specify dates and times in a particular time zone use the optional time zone element at the beginning of a transformations or a coefficient sets configuration file, e.g. <timeZone>GMT+5:00</timeZone>. Then all dates and times in that configuration file are in the defined time zone. If no time zone is defined, then dates and times are in GMT. Note: 2008-06-20 11:33:00 in time zone GMT+5:00 is physically the same time as 2008-06-20 06:33:00 in GMT.
If for a given transformation there are different coefficientSets configured for different periods in time, then the following rule is used. The start of a period is always inclusive. The end of a period is exclusive if another period follows without a gap in between; otherwise the end of the period is inclusive. Suppose, for example, that there are three periodCoefficientSets defined (A, B and C), each with a different period, as in the following XML example. Then at 2002-01-01 00:00:00 periodCoefficientSet A is valid. At 2003-01-01 00:00:00 periodCoefficientSet B is valid, since the start of a period is inclusive. At 2004-01-01 00:00:00 periodCoefficientSet B is still valid, since there is a gap after 2004-01-01 00:00:00. At 2011-01-01 00:00:00 periodCoefficientSet C is valid, since no other periods follow (the period of C is the last period in time that is defined). This same rule applies to time-dependent transformations.
Code Block:
<periodCoefficientSet>
<!-- periodCoefficientSet A -->
<period>
<startDateTime date="2002-01-01" time="00:00:00"/>
<endDateTime date="2003-01-01" time="00:00:00"/>
</period>
...
</periodCoefficientSet>
<periodCoefficientSet>
<!-- periodCoefficientSet B -->
<period>
<startDateTime date="2003-01-01" time="00:00:00"/>
<endDateTime date="2004-01-01" time="00:00:00"/>
</period>
...
</periodCoefficientSet>
<periodCoefficientSet>
<!-- periodCoefficientSet C -->
<period>
<startDateTime date="2010-01-01" time="00:00:00"/>
<endDateTime date="2011-01-01" time="00:00:00"/>
</period>
...
</periodCoefficientSet>
Validation rules
The concept of the validation rules was introduced as a solution for a common problem in operational situations when using aggregation transformations. When, for example, an aggregation was done over an entire year, a single missing value in the input would cause the yearly average to be a missing value as well.
The validation rules provide a solution for these types of situations. They allow you to configure in which cases an output value should be computed even though the input contains missing values and/or doubtful values.
The validation rules are optional in the configuration and can be used to define the output flag and the custom flag source of the output value, based on the number of missing/unreliable values and/or the number of doubtful values in the input values used per aggregation time step. The available output flags are reliable, doubtful and missing.
With these rules it is possible to define, for example, that the output of the transformation is reliable if less than 10% of the input is unreliable and/or missing, and that if this percentage is above 10% the output should be a missing value.
It is important to note that input values which are missing and input values which are marked as unreliable are treated the same: both are seen as missing values by the validation rules.
Below is the configuration for the basic example described above.
Code Block:
<validationRule>
<inputMissingPercentage>10</inputMissingPercentage>
<outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
<inputMissingPercentage>100</inputMissingPercentage>
<outputValueFlag>missing</outputValueFlag>
</validationRule>
The configured validation rules are applied in the order in which they are configured. In the example above the first rule is that if 10% or less of the input is missing (or unreliable), the output flag will be set to reliable. If the input doesn't meet the criteria of the first rule, the transformation module will try to apply the second rule. In this case the second rule will always apply, because a percentage of 100% is configured.
Configuring a rule with a percentage of 100% is a recommended way of configuring the validation rules. By default, if validation rules are configured and none of the configured rules apply, the output will be set to missing. This means that the second rule of 100% was not strictly necessary, because it matches the default hard-coded behaviour of the system.
But for the users of the system it is more understandable if the behaviour of the aggregation is configured explicitly instead of relying on a hard-coded fallback mechanism in the software.
To explain the validation rules a bit more, a more difficult example will be explained. Let's say that we would like to configure our aggregation in such a way that the following rules are applied:
1. if the percentage of missing and/or unreliable values is less than 15%, the output should be reliable.
2. if the percentage of missing values is less than 40%, the output should be doubtful.
3. in all other cases the output should be a missing value.
Below is a configuration example in which the rules above are implemented.
Code Block:
<validationRule>
<inputMissingPercentage>15</inputMissingPercentage>
<outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
<inputMissingPercentage>40</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
</validationRule>
<validationRule>
<inputMissingPercentage>100</inputMissingPercentage>
<outputValueFlag>missing</outputValueFlag>
</validationRule>
The example shows that in total 3 validation rules were needed. The first rule checks whether less than 15% of the input is missing/unreliable. If this is not the case, it is checked whether the second rule can be applied. The second rule states that if less than 40% of the input is missing, the output flag should be set to doubtful. The last rule takes care of all the other situations. Note that it has a percentage of 100% configured, which means that this rule will always apply; however, because 2 rules are defined above it, FEWS will always try to apply those rules first before applying this one.
In some cases one would like to distinguish between situations in which the output flag is the same. In the example above, if all of the input values are reliable, the output is marked as reliable. But if, for example, 10% of the input values are unreliable, the output is also marked as reliable.
It would be useful if the user of the system could see in the GUI of FEWS why the output was marked reliable. Were there missing values in the input or not? Is the output based on a few missing values?
To make this possible, the concept of the custom flag source was added to the validation rules. In addition to configuring an output flag, it is also possible to configure a custom flag source. In the table of the TimeSeries Dialog the custom flag source can be made visible by pressing ctrl + shift + j. This will make a new column visible in the table in which the custom flag source ids are shown. In the graph itself it is also possible to make the custom flag sources visible by pressing ctrl + alt + v. To use custom flag sources, a file CustomFlagSources.xml should be added to the RegionConfig directory; in this file the custom flag sources should be defined. For details see 27 CustomFlagSources. By configuring several rules which have the same output flag but a different custom flag source, it is possible to distinguish between situations in which the output flag is the same.
Below is an example in which the output is reliable when there are no missing values in the input and also when the percentage of missing values is less than 15%. However, in the first case the output doesn't get a custom flag source assigned, while in the second case the output gets a custom flag source assigned which is visible in the GUI, to indicate that an output value was calculated but missing values were found in the input.
Code Block:
<validationRule>
<inputMissingPercentage>0</inputMissingPercentage>
<outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
<inputMissingPercentage>15</inputMissingPercentage>
<outputValueFlag>reliable</outputValueFlag>
<outputCustomFlagSourceId>CA</outputCustomFlagSourceId>
</validationRule>
<validationRule>
<inputMissingPercentage>40</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
</validationRule>
<validationRule>
<inputMissingPercentage>100</inputMissingPercentage>
<outputValueFlag>missing</outputValueFlag>
</validationRule>
Finally it is also possible to define validation rules based on the number of doubtful values in the input. It is important to note that missing values and unreliable values found in the input are also counted as doubtful values. It is even possible to define validation rules based on a combination of an allowed percentage of unreliable/missing values and doubtful values. The sequence of applying the rules is also in this case the order in which the rules are configured; the first rule which applies to the current situation is used.
Let's say, for example, that we also want rules to be defined for doubtful input values. When only a small number of input values are doubtful, we still want the output to be reliable. Otherwise we would like the output to be doubtful, but with a custom flag source which gives us an indication of how many of the input values were doubtful.
Below is a configuration example.
Code Block:
<validationRule>
<inputDoubtfulPercentage>10</inputDoubtfulPercentage>
<inputMissingPercentage>0</inputMissingPercentage>
<outputValueFlag>reliable</outputValueFlag>
</validationRule>
<validationRule>
<inputDoubtfulPercentage>30</inputDoubtfulPercentage>
<inputMissingPercentage>0</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
<outputCustomFlagSourceId>D1</outputCustomFlagSourceId>
</validationRule>
<validationRule>
<inputDoubtfulPercentage>60</inputDoubtfulPercentage>
<inputMissingPercentage>0</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
<outputCustomFlagSourceId>D2</outputCustomFlagSourceId>
</validationRule>
<validationRule>
<inputDoubtfulPercentage>100</inputDoubtfulPercentage>
<inputMissingPercentage>0</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
<outputCustomFlagSourceId>D3</outputCustomFlagSourceId>
</validationRule>
<validationRule>
<inputMissingPercentage>15</inputMissingPercentage>
<outputValueFlag>reliable</outputValueFlag>
<outputCustomFlagSourceId>CA</outputCustomFlagSourceId>
</validationRule>
<validationRule>
<inputMissingPercentage>40</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
</validationRule>
<validationRule>
<inputMissingPercentage>100</inputMissingPercentage>
<outputValueFlag>missing</outputValueFlag>
</validationRule>
The examples above give a good idea of the possibilities of the validation rules.
In the examples above the inputMissingPercentage and the inputDoubtfulPercentage were configured with fixed values in the configuration file. However, it is also possible to make a reference to an attribute of a location. To reference an attribute, the referenced attribute id should be placed between @ signs, as in the following example, which uses the attribute MV for the inputMissingPercentage.
Code Block:
<inputMissingPercentage>@MV@</inputMissingPercentage>
In this example the value of the location attribute MV is used as the inputMissingPercentage.
To explain the concept of the validation rules further, the table below shows the input and output time series of an accumulative aggregation transformation which uses the validation rules shown in the last example above.
Time | Input value | Input flag | Output value | Output flag | Custom flag source |
---|---|---|---|---|---|
1-1-2012 00:15 | | | | | |
1-1-2012 00:30 | 1 | | | | |
1-1-2012 00:45 | 1 | | | | |
1-1-2012 01:00 | 1 | | 3 | doubtful | - |
1-1-2012 01:15 | | | | | |
1-1-2012 01:30 | 1 | | | | |
1-1-2012 01:45 | | | | | |
1-1-2012 02:00 | 1 | | NaN | - | - |
1-1-2012 02:15 | 1 | | | | |
1-1-2012 02:30 | 1 | doubtful | | | |
1-1-2012 02:45 | 1 | | | | |
1-1-2012 03:00 | 1 | | 4 | doubtful | D1 |
1-1-2012 03:15 | 1 | | | | |
1-1-2012 03:30 | 1 | | | | |
1-1-2012 03:45 | 1 | | | | |
1-1-2012 04:00 | 1 | | 4 | reliable | |
The first output value is set to doubtful, because in this case the total percentage of missing values is 25%. This means that the following rule is applied.
Code Block:
<validationRule>
<inputMissingPercentage>40</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
</validationRule>
The second output value is a missing value, because in this case the percentage of missing values is equal to 50%. This means that the following rule is applied.
Code Block:
<validationRule>
<inputMissingPercentage>100</inputMissingPercentage>
<outputValueFlag>missing</outputValueFlag>
</validationRule>
The third output value is set to doubtful. The input doesn't contain missing values but has a single doubtful input value. The percentage of doubtful values in the input is therefore 25% which means that the following rule will be applied.
Code Block:
<validationRule>
<inputDoubtfulPercentage>30</inputDoubtfulPercentage>
<inputMissingPercentage>0</inputMissingPercentage>
<outputValueFlag>doubtful</outputValueFlag>
<outputCustomFlagSourceId>D1</outputCustomFlagSourceId>
</validationRule>
The fourth and last output value is set to reliable with no custom flag source. In this case all of the input values are reliable, so the first rule is applied (when all of the input values are reliable, the first rule always applies).
Manual Edits
Since FEWS 2017.02 it is possible to configure whether manual edits should be preserved. This setting applies to all transformations that are configured. The default is false. For an example configuration see:
Code Block:
<transformationModule xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.wldelft.nl/fews https://fewsdocs.deltares.nl/schemas/version1.0/transformationModule.xsd" version="1.0">
<preserveManualEdits>true</preserveManualEdits>
Description
Since 2019.02 an optional field description is available. This field can be configured per transformation, and the text will be shown in the workflow tree as a mouse-over label (tooltip). This can be used with any type of transformation.
Example:
Code Block:
<transformation id="merge">
<merge>
<simple>
<inputVariable>
<variableId>Wiski</variableId>
</inputVariable>
<inputVariable>
<variableId>Server</variableId>
</inputVariable>
<fillGapConstant>0</fillGapConstant>
<outputVariable>
<variableId>merge1</variableId>
</outputVariable>
</simple>
</merge>
<description>transformation description</description>
</transformation>
Copying comments, flags and flag sources from input to output
<TODO>
Preventing previously calculated values from being overwritten with missing values
Delft-FEWS processes data in moving windows relative to the time zero of the workflow. These workflows can be run several times per day. In certain conditions this can lead a transformation to calculate a missing value for a date/time for which a correct value was calculated earlier. This can lead to warnings such as:
Existing value overwritten by missing
Suspicious write action. Long time series written with only changes at the start and at the end. If this happens often this will explode the database.
This mechanism is illustrated with the image below, which shows a water level for which an average per day is calculated. The box shows the moving window (relativeViewPeriod). The image shows the daily values (red dots) calculated in the first run.
In a later run (illustrated below), there are not sufficient values to calculate a daily value for 24 December. So this run will return a missing value for 24 December (while calculating a new value for 27 December).
If these transformations were to write directly to the same output time series (in this case with timeSeriesType "external historical"), the later run would cause the average water level for 24 December to be overwritten with a missing value. The log would include warnings about this!
To prevent this, the output time series should be of timeSeriesType "temporary", and should then be merged (merge / simple transformation) into the final time series. This way, missing values will not overwrite previously calculated values.
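Using the merge / simple transformation shown elsewhere on this page, this pattern can be sketched as follows. The variable ids are hypothetical: WaterLevelDailyTemp is assumed to be a variable whose time series set has timeSeriesType temporary (written by the aggregation), while WaterLevelDaily is assumed to point to the final external historical series.

```xml
<transformation id="mergeDailyAverage">
  <merge>
    <simple>
      <inputVariable>
        <variableId>WaterLevelDailyTemp</variableId>
      </inputVariable>
      <outputVariable>
        <variableId>WaterLevelDaily</variableId>
      </outputVariable>
    </simple>
  </merge>
</transformation>
```

Because the merge only writes non-missing values over the existing series, the previously calculated daily value for 24 December survives the later run.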
Run transformations for a set of selected locations
In some cases it is useful to run a transformation only for a specific set of locations, for example when the entire workflow has already run and there is only a change at a specific location. This situation can occur, for example, when a water level has been edited or when the configuration is changed.
In this case the workflow can skip the calculations for the unchanged locations. The main benefit of this approach is that it saves a lot of processing time.
This functionality is now available in FEWS. However, it is important to understand that it cannot be used in all workflows. The functionality can be applied to transformations only; it cannot be used for running models or secondary validations. When a transformation is started for a location selection, the transformation will only start when the location of one of its input time series is selected. When a transformation has created output for a location which was not selected by the user, that location will be added to the selection.
It is possible to run a workflow for a selected set of locations from the IFD, the task dialog and the manual forecast dialog. By default workflows cannot be run for a selected set of locations. To enable this, the option allowSelection should be set to true in the workflowDescriptor of the workflow. Below is an example.
Code Block:
<workflowDescriptor id="FillRelations" forecast="false" visible="true" autoApprove="false">
<description>Met deze taak worden de gaten groter dan 2 uur gevuld dmv. relaties.</description>
<allowSelection>true</allowSelection>
<schedulingAllowed>true</schedulingAllowed>
</workflowDescriptor>
When a node in the IFD is selected with a workflow which has the allowSelection option set to true, the GUI will look like this:
In the property dialog below the tree with the nodes, two checkboxes will appear.
The first checkbox enables the option to run the workflow for a specific set of locations. The second checkbox enables running the workflow for a specified period.
In the task run dialog an additional checkbox will appear.
Which locations should the user select?
The transformation will run for the selected locations: if one of the input time series is selected in the filters or on the map, the transformation will run.
This means that the user should select the locations which have changed. This can be a change in the data or a change in the configuration.
This can be explained with the use of a simple example. Let's say we have a system with a workflow which consists of a simple function that estimates the water level at location B by simply copying the water level at location A to location B. After the copying, a set of statistical transformations is run to compute statistics.
The user edits the water level at location A and wants to recompute the water level at location B. However, the workflow which does this is configured to do similar estimates at another 500 locations. In this case the user should select location A.
When the run starts, the majority of the calculations are skipped, but the water level for location B is recalculated, because location A, which provides one of the input time series, is selected. When this calculation is done, FEWS remembers that location B has now also changed and adds location B to the list of selected locations. When the statistical transformations are started after the water level at location B is recomputed, the statistics for location B are also recalculated: the transformation which computes the statistics for location B has an input time series at location B, and location B was added to the list of selected locations.
This functionality cannot be used for spatial transformations. Before enabling this option for a workflow, the configurator should check whether the workflow contains spatial transformations.
In addition, this functionality can only be used for non-forecast workflows. Typically it should be used for pre-processing or post-processing.
It is therefore by default not possible to run a workflow for a specific location selection. This is only possible when, in the workflowDescriptors, the option allowSelection is set to true. This option should only be set to true when the configurator has checked that the workflow is suitable for running for a specific selection.
Steps to follow when implementing selection specific calculations
The following steps should be followed when this functionality is implemented.
1) Decide in which situations this functionality is needed
2) Make a list of the workflows which need to run in this type of situations
3) Ensure that the workflow only consists of transformations for which this functionality can be used.
4) Move transformations or other parts of the workflow which are suitable for this type of operations to another workflow
5) Set the option allowSelection to true in the workflow descriptor for the workflow which can be used for selection specific calculations
6) When the workflows are started from the taskrun dialog or the manual forecast dialog, no additional configuration is needed. These displays are available in almost every FEWS system. However, when the IFD is used for this, the following additional steps should be taken.
Implementing selection-specific calculations for the IFD
The first step is to create a topology.xml to configure the content of the tree from which the workflows should be started.
Detailed information about configuring the topology.xml can be found at 24 Topology
The following steps should be done when using the IFD for selection specific calculations.
- first create the tree structure by creating nodes in the topology.xml,
- add workflows to the nodes,
- add dependencies to the nodes by configuring the previous nodes,
- by default leaf nodes will run locally and not at the server. This is not desired in this case. Therefore the option localRun should be set to false for the leaf nodes.
Below is an example (part of the topology.xml):
```xml
<nodes id="HDSR">
	<nodes id="oppervlaktewaterstand">
		<relativePeriod unit="week" start="-52" end="0"/>
		<node id="vul gaten kleiner dan 2 uur">
			<previousNodeId>secondary validatie</previousNodeId>
			<workflowId>FillGap2H_WerkOpvlWater</workflowId>
			<filterId>Fillgap</filterId>
			<localRun>false</localRun>
		</node>
		<node id="vul gaten groter dan 2 uur">
			<previousNodeId>vul gaten kleiner dan 2 uur</previousNodeId>
			<workflowId>FillRelations</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="berekening debieten">
			<previousNodeId>vul gaten groter dan 2 uur</previousNodeId>
			<workflowId>DebietBerekening</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="sample werkfilterdata nonequi naar 15min">
			<previousNodeId>berekening debieten</previousNodeId>
			<workflowId>SampleRuwNaar15M</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="langsprofielen berekenen">
			<previousNodeId>sample werkfilterdata nonequi naar 15min</previousNodeId>
			<workflowId>Langsprofiel</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="aggregatie van kwartier naar uur">
			<previousNodeId>langsprofielen berekenen</previousNodeId>
			<workflowId>AggregeerWerkOpvlWater</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="Peilbesluit evaluatie">
			<previousNodeId>aggregatie van kwartier naar uur</previousNodeId>
			<workflowId>PeilbesluitEvaluatie</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="export LIZARD">
			<previousNodeId>Peilbesluit evaluatie</previousNodeId>
			<workflowId>ExportCIW</workflowId>
			<localRun>false</localRun>
		</node>
		<node id="export WIS-REPORTS">
			<previousNodeId>export LIZARD</previousNodeId>
			<workflowId>ExportCIW</workflowId>
			<localRun>false</localRun>
		</node>
	</nodes>
</nodes>
```
The second step is to add the following entry to the explorer.xml to add the IFD tool window to the system.
```xml
<explorerTask name="Forecasts">
	<predefinedDisplay>topology tree</predefinedDisplay>
	<toolbarTask>false</toolbarTask>
	<menubarTask>false</menubarTask>
	<toolWindow>true</toolWindow>
	<loadAtStartup>true</loadAtStartup>
</explorerTask>
```
Trim Output
Since 2024.02 the <trimOutput> element for transformations is an enumeration of options:
- true (trims start and end)
- false (trims neither start nor end)
- startOnly (trims only the start)
- endOnly (trims only the end)
Trimmed values of the output will be removed before writing the data to the database. This can prevent existing values from being overwritten with missing values. This has been backported as far back as 2023.02.
Prior to 2024.02:
A boolean option <trimOutput> is available within transformations. When true, missing values at the start and end of the output will be removed before writing the data to the database. This can prevent existing values from being overwritten with missing values.
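As a minimal sketch (since 2024.02; the surrounding transformation configuration is omitted and the exact position of the element within the transformation should be checked against the schema), a transformation that should keep leading missing values but trim trailing ones would set:

```xml
<trimOutput>endOnly</trimOutput>
```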
Forecast Loop
For some transformations it is possible to define a forecast loop by configuring a <forecastLoopSearchPeriod>. This means that the transformation will be run for each forecast found within that period.
This will only work when the <inputVariable> and <outputVariable> are external forecasts. The output variable will get the same external forecast time as the input time series.
When this is configured in combination with a locationSet, the module will try to run the transformation for the maximum number of forecasts available for the locations. If some forecasts are unavailable for a location, a warning will be logged and those transformation runs will be skipped for that forecast and location combination.
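The forecast loop search period is configured as a relative period. A hedged sketch (the attribute names follow the relativePeriod pattern used elsewhere in this documentation; the exact schema should be checked against transformationTypes.xsd):

```xml
<forecastLoopSearchPeriod unit="day" start="-5" end="0"/>
```

With this configuration the transformation would be run once for every external forecast found within the last five days.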
List of all available transformations
For the most recent development version see the xsd schema at https://fewsdocs.deltares.nl/schemas/version1.0/transformationTypes.xsd
Available since stable build 2014.01:
...
Through the definition of an arithmetic function, a user defined equation can be applied in transforming a set of input data to a set of output data. Any number of inputs may be defined and used in the user defined function. Each input variable is identified by its Id, as this is used when configuring the function. The function is written using general mathematical operators. A function parser is used to evaluate the functions (per time step) and return results. These are again assigned to variables which can be linked to output time series through the variableId.
Rather than using a userDefinedFunction, a special function can also be selected from a list of predefined hydroMeteoFunctions. When selected, this poses requirements on other settings.
Transformations may be applied in segments, with different functions or different parameters used for each segment. A segment is defined as being valid for a range of values, identified in one of the input variables (see example below).
Figure 59 Example of applying segments to a time series
Figure 60 Elements of the Arithmetic section of the transformation module configuration
segments
Root element for defining segments. When used this must include the input variable Id used to determine segments as an attribute.
Attributes;
- limitVariableId : Id of the input variable used to test against segment limits.
segment
Root element for definition of a segment. At least one segment must be included.
limitLower
Lower limit of the segment. The function defined will be applied at a given time step only if the value at that time step in the variable defined as limitVariable is above or equal to this value.
limitUpper
Upper limit of the segment. The function defined will be applied at a given time step only if the value at that time step in the variable defined as limitVariable is below this value (below or equal only for the highest segment).
functionType
Element used only when defining a predefined hydroMeteoFunction. Depending on the selected function, specific requirements will hold for defining input variables and parameters. If a special function is selected, the user defined function element is not defined. The enumeration of available options is (the most important are discussed below):
- simpleratingcurve : for applying a simple power law rating curve.
- weightedaverage : special function for calculating a weighted average of inputs. When a value in one of the inputs is missing, the remaining inputs will be used and the weights rescaled to unity.
- penman : for calculating evaporation using Penman
- penmannortheast : specific implementation of the Penman formula
- qhrelationtable : allows application of a rating curve using a table.
- degreemanipulation
userDefinedFunction
Optional specification of a user defined function to be evaluated using the function parser. Only the function itself need be defined, without the equality sign. The function is defined as a string and may contain Id's of input series, names of variables and constants defined, and mathematical operators.
Operators offered
- scalar series: +, -, /, *, ^, sin, cos, tan, asin, acos, atan, sinh, cosh, tanh, asinh, acosh, atanh, log, ln, exp, sqrt, abs, pow, min, max, minSkipMissings, maxSkipMissings, sumSkipMissings, average
- operators for conversion of grid to scalar series: spatialMin, spatialMax, spatialSum, spatialSumSkipMissings, spatialAverage
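As a hypothetical illustration (the variable Ids H1 and H2 and coefficient a are placeholders, not schema names), a user defined function blending two input series might read:

```
a * H1 + (1 - a) * H2
```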
constant
Allows definition of a constant to be used in the function.
coefficient
Optional element to allow coefficients for use in the function to be defined. These coefficients are allocated an Id for later use in the function defined. For user defined functions specific coefficients need to be defined. Multiple entries may be defined.
Attributes;
- coefficientId : Id of the coefficient defined. This can be used in the function.
- coefficientType : identification of the coefficient. Applied in rule based configurations.
- value: value of the coefficient
tableColumnData
Definition of a table to use for transforming input variable to output variables.
Attributes;
- nrOfColumns: number of columns in table (should equal 2).
- variableIdCol1 : Input variable associated with the first column
- variableIdCol2 : Output variable associated with the second column
tableColumnData:data
Element containing data for each row in the table
Attributes;
- col1: value for column 1
- col2: value for column 2
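A hedged sketch of a two-column table (the variable Ids and values are placeholders; the element and attribute names follow the description above):

```xml
<tableColumnData nrOfColumns="2" variableIdCol1="H" variableIdCol2="Q">
	<data col1="0.50" col2="1.2"/>
	<data col1="1.00" col2="4.8"/>
	<data col1="1.50" col2="10.5"/>
</tableColumnData>
```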
outputVariable
Id of the output variable from the function. This may be saved to the database by associating the Id to an outputVariable.
flag
Optional element to force saving the result data for the segment with a given flag. This may be used, for example, to force data from a segment to be marked as doubtful. The enumeration is either "unreliable" or "doubtful". If data is reliable, the element should not be included.
...
Stage discharge transformations can be defined using the simpleratingcurve option of the hydroMeteoFunctions. To apply this, certain properties must be defined in each segment.
For stage-discharge transformation the requirements are;
- Coefficient values for coefficientId's "a", "b" and "c" must be defined.
- Rating curve formula is Q = a * (H+b) ^c
- Input variable Id must be "H"
- Output variable Id must be "Q".
- limitVariableId must be "H".
Example:
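The original example is not reproduced here; the following is a hedged sketch assembled from the requirements above (the coefficient values and segment limits are placeholders, and the exact nesting should be checked against the schema):

```xml
<segments limitVariableId="H">
	<segment>
		<!-- Q = a * (H + b)^c applied for 0.0 <= H < 2.5; all values are placeholders -->
		<limitLower>0.0</limitLower>
		<limitUpper>2.5</limitUpper>
		<functionType>simpleratingcurve</functionType>
		<coefficient coefficientId="a" value="15.3"/>
		<coefficient coefficientId="b" value="-0.2"/>
		<coefficient coefficientId="c" value="1.8"/>
		<outputVariable>Q</outputVariable>
	</segment>
</segments>
```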
For discharge-stage transformation (deriving stage from discharge) the requirements are;
- Coefficient values for coefficientId's "a", "b" and "c" must be defined.
- Input variable Id must be "Q"
- Output variable Id must be "H".
- limitVariableId must be "Q".
Example:
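Again the original example is not reproduced; a hedged sketch assuming the same simpleratingcurve functionType is used with the variable Ids reversed (values are placeholders):

```xml
<segments limitVariableId="Q">
	<segment>
		<!-- segment valid for 0.0 <= Q < 50.0; all values are placeholders -->
		<limitLower>0.0</limitLower>
		<limitUpper>50.0</limitUpper>
		<functionType>simpleratingcurve</functionType>
		<coefficient coefficientId="a" value="15.3"/>
		<coefficient coefficientId="b" value="-0.2"/>
		<coefficient coefficientId="c" value="1.8"/>
		<outputVariable>H</outputVariable>
	</segment>
</segments>
```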
...
Catchment average rainfall can be determined by weighting input precipitation time series. The weightedaverage option of the hydroMeteoFunctions can be applied to include the option of recalculating weights if one of the input locations is missing. To apply this, certain properties must be defined in each segment.
For establishing catchment average precipitation the requirements are;
- functionType must be set to weightedaverage
- Weights are given as coefficient values with coefficientId's "a", "b" and "c" etc.
- Additional coefficients may be defined to allow for altitude correction.
Example:
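A hedged sketch based on the requirements above (the input precipitation variable Ids and the weights are placeholders; the nesting should be checked against the schema):

```xml
<segments limitVariableId="P1">
	<segment>
		<functionType>weightedaverage</functionType>
		<!-- weights for three input precipitation series; rescaled to unity when an input is missing -->
		<coefficient coefficientId="a" value="0.5"/>
		<coefficient coefficientId="b" value="0.3"/>
		<coefficient coefficientId="c" value="0.2"/>
		<outputVariable>Pavg</outputVariable>
	</segment>
</segments>
```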
...
This set of transformations allows temporal aggregation and disaggregation of time series. The time steps defined in the input variable and the output variable determine how the time steps are migrated. The configuration need only define the rule followed in aggregation/disaggregation. Aggregation and disaggregation can only be used to transform between equidistant time steps. A non-equidistant series can be transformed to an equidistant series using the appropriate element (see above).
Aggregation rules;
- instantaneous : apply instantaneous resampling, i.e. the value at a cardinal time step in the output series is the same as in the input time series at that time step.
- accumulative : value in the output time series is the accumulated sum of the values of the time steps in the input time series (use for example when aggregating rainfall in mm).
- mean : value in the output time series is the mean of the values of the time steps in the input time series (use for example when aggregating rain rate in mm/hr).
- constant
Disaggregation rules;
- instantaneous : apply linear interpolation, i.e. the value at a cardinal time step in the output series is the same as in the input time series at that time step. Values in between are interpolated.
- accumulative : value in the output time series is derived as an equal fraction of the value in the input series. The fraction is determined using the ratio of the time steps.
- disaggregateusingweights : value in the output time series is a weighted fraction of the input value. Weights are defined as coefficients, which are sub-elements of the disaggregation element. The number of coefficients defined should equal the disaggregation ratio (i.e. 24 when disaggregating from day to hour). The coefficient Id's should be numbered 1 to n.
- constant : value in the output time series at intermediate time steps is equal to the last available value in the input time series.
Rules for mapping non-equidistant time series to equidistant time series;
- zero : value in output time series is zero if time values do not coincide
- missing : value in output time series is missing if time values do not coincide
- linearinterpolated : value in output time series is interpolated linearly between neighbouring values in the input time series
- equaltolast : value in output time series is equal to the last available value in the input time series.
...
The set of rule based transformations is a library of specific data transformation functions. Configuration of the rule based transformation is the same as in the Arithmetic transformation. However, each rule may have specific requirements on the elements that need to be defined. Many parameters that affect the transformation will need to be defined as a coefficient, using the appropriate coefficientType definition.
The rule based transformations can be grouped into four main sections;
- Selection of peak or low values from a time series.
- Resampling an equidistant time series set using time values specified in a non-equidistant time series set.
- Data hierarchy
- Various transformations.
Selection of peak or low flow values
Selection of peaks and lows
Set of rules to allow selection of peaks and lows from an input time series.
Enumerations in the rule attribute of the ruleBasedTransformation element;
- selectpeakvalues
- selectlowvalues
- selectpeakvalueswithincertaingap
- selectlowvalueswithincertaingap
The first two enumerations will select all peaks or lows in the time series. The second two will select peaks only if there is a defined gap in time between them. If not, the peaks are considered dependent and only the highest peak of each dependent set is returned.
Requirements for definitions of peak selections using gaps to define independence are;
- A coefficientId "a" must be defined. The coefficientType must be set to "gaplengthinsec". The value attribute defines the length of the minimum gap in seconds.
Example:
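The original example is not reproduced here; a hedged sketch based on the requirements above (the 12-hour gap value and the element nesting are placeholders):

```xml
<ruleBasedTransformation rule="selectpeakvalueswithincertaingap">
	<!-- peaks closer together than 12 hours (43200 s) are treated as dependent -->
	<coefficient coefficientId="a" coefficientType="gaplengthinsec" value="43200"/>
</ruleBasedTransformation>
```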
Sampling values from equidistant time series
This section of the rule based transformation can be applied to sample items from an equidistant time series at the time values in a non-equidistant time series. This may be required when applying transformations to a non-equidistant time series. The values to add will first need to be resampled to the right time value. An example is when wind and wave information is required at the time of the tidal peaks for entry in a lookup table.
Enumerations in the rule attribute of the ruleBasedTransformation element;
- equitononequidistant
- equitononequidistantforinstantaneousseries
- equitononequidistantforaccumulativeseries
The first two elements are equivalent. The last will consider accumulations of the input variable up to the time value sampled.
Requirements for definitions of resampling equidistant time series are;
- The limitVariableId attribute of the segments element must be the non-equidistant time series which determines the time values at which the equidistant series is to be sampled.
- The userDefinedFunction must contain the equidistant time series to be sampled.
- The outputVariableId must resolve to a non-equidistant time series.
Example:
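A hedged sketch based on the requirements above (the variable Ids TidalPeaks, Wind and WindAtPeaks are placeholders):

```xml
<ruleBasedTransformation rule="equitononequidistant">
	<!-- the non-equidistant series supplies the sample times -->
	<segments limitVariableId="TidalPeaks">
		<segment>
			<!-- the equidistant series to be sampled -->
			<userDefinedFunction>Wind</userDefinedFunction>
			<outputVariable>WindAtPeaks</outputVariable>
		</segment>
	</segments>
</ruleBasedTransformation>
```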
Data Hierarchy
This is a simple method to merge overlapping equidistant time series into a single equidistant series. Gaps in the foremost (first) series are filled with data from the second series if a valid value is available at the current time step; otherwise the gap is filled with data from the third series, and so on until no more time series are available. Only missing data values and unreliable values are filled. Doubtful values remain in the result series as doubtful.
Figure 61 Schematic example of merging series using data hierarchy.
In the example above, Series 1 is the most important time series, Series 2 has a lower hierarchy and Series 3 the lowest. The resulting time series has values from all three series, as shown in the figure above.
Data hierarchy poses no specific requirements to variables defined. Only the Id of the output variable is of importance.
Creating time series from typical profiles
Typical profiles can be defined in the inputVariable as described above. To use a typical profile it must first be mapped to a dynamic time series. This can then be retrieved in a later configuration of a module for use.
Enumerations in the rule attribute of the ruleBasedTransformation element;
- typicalprofiletotimeseries
- datatotimeseries
The first type of mapping is used when the typical profile has a concept of date/time (e.g. must be mapped to specific dates or time values). The second is used when only a series of data is given. The time series is then filled with the first data element placed at the first time step of the relative view period to be created.
Typical profile mapping poses no specific requirements to variables defined. Only the Id of the output variable is of importance.
outputVariable
Definition of the output variables to be written following the transformation. See the inputVariable for the attributes and structure. The output variable can only be a TimeSeriesSet (typical profiles are only used as inputs). The outputVariable is assigned an Id, which must be defined as the result of the transformation.