Introduction

One of the most important properties of DELFT-FEWS as a forecasting system is its ability to deal efficiently with large volumes of dynamic data. Dynamic data consists mainly of time series data in various formats (scalar - 0D, vector - 1D, longitudinal profile - 1D and 2D, grid - 2D and 3D, and polygon data - 2D). Handling dynamic data also includes the management of model states produced by the system. As DELFT-FEWS is used not only as an operational forecasting system but also as a system to run climate scenarios, the length of each time series can be up to 2,000,000,000 time steps.

...

To allow handling of time series data, the concept of a "Time Series Set" is introduced. A Time Series Set is used to retrieve and submit data to the database. In this chapter the concept of the time series set is explained.

Types of time series

External and Simulated time series

Time series are considered to be available from two sources. All time series sourced from external systems are considered as "External". All time series produced by the forecasting system itself are considered as "Simulated".

Forecast and Historical time series

Time series are considered to be of two categories in relation to time. Historical time series are continuous time series that describe a parameter at a location over a period of time. Forecast time series differ from historical time series in that, for each location and parameter, one forecast is independent of another forecast. A forecast is characterised by its start time and the period it covers. Generally, when a new forecast is available for a given location and parameter, it will supersede any previous forecast for that location and parameter. Each forecast is therefore an independent entity.

...

There are significant differences in how each of these time series types is handled.

External Historical time series

In an online system DELFT-FEWS will incrementally import observed data as it becomes available from external systems. This data should be imported as an External Historical time series. When data marked as external historical is presented to the system with exactly the same values and covering the same period as data for that location/parameter already available in the database then it will be ignored. Only new data is imported and stored. If data for a given period is already available but is changed (manual edit or update), then the new values will be added to the database. For each item of data added to the database, a time stamp is included to specify when the data was made available to the system.
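The import rule described above can be pictured with a small Python model. This is only an illustrative sketch, not DELFT-FEWS code: the function name `import_external_historical` and the dictionary-based `store` are hypothetical, and the comparison is done per time step as a simplifying assumption.

```python
from datetime import datetime

def import_external_historical(store, incoming, import_time):
    """Store only new or changed values and return how many were added.

    store maps a time step to a list of (value, import_time) versions.
    This layout is an assumption made for the sketch.
    """
    added = 0
    for time_step, value in incoming.items():
        versions = store.setdefault(time_step, [])
        if versions and versions[-1][0] == value:
            continue  # identical to the value already stored: ignored
        # New or changed value: add it together with its import time stamp.
        versions.append((value, import_time))
        added += 1
    return added

store = {}
t1 = datetime(2024, 1, 1, 0)
t2 = datetime(2024, 1, 1, 1)
first = import_external_historical(store, {t1: 1.0, t2: 1.2}, datetime(2024, 1, 1, 2))
second = import_external_historical(store, {t1: 1.0, t2: 1.5}, datetime(2024, 1, 1, 3))
```

In this sketch the second import stores only the changed value for `t2`; the unchanged value for `t1` is skipped.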

...


Figure 1 Schematic representation of data imported as external historical

External Forecasting time series

External forecasts are imported by DELFT-FEWS as they are made available by the external forecasting systems. Again, each forecast is imported and stored individually. External forecasts are referenced by the start time of the forecast. When retrieving an external forecast time series from the database, the most recent available forecast, as indicated by the forecast start time, will be returned. The most recent available forecast is the latest forecast with a start time earlier than or equal to the start of the forecast to be made using DELFT-FEWS (forecast T0). It is thus not possible to retrieve a specific external forecast on request, as the latest available forecast is always returned.

With possible exceptions for modules considering multiple forecasts (e.g. performance module), only one external forecast is returned. Different external forecasts are not merged.
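The selection rule for external forecasts can be written down as a short Python sketch. The function name `select_external_forecast` is hypothetical and the forecasts are reduced to their start times; it only illustrates the rule stated above, not the actual database query.

```python
from datetime import datetime

def select_external_forecast(forecast_start_times, t0):
    """Return the latest start time that is earlier than or equal to T0, or None."""
    candidates = [start for start in forecast_start_times if start <= t0]
    return max(candidates) if candidates else None

starts = [
    datetime(2024, 1, 1, 0),
    datetime(2024, 1, 1, 6),
    datetime(2024, 1, 1, 12),
]
selected = select_external_forecast(starts, t0=datetime(2024, 1, 1, 9))
```

With a T0 of 09:00, the 06:00 forecast is selected; the 12:00 forecast lies after T0 and the 00:00 forecast is older, and the two candidates are not merged.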

Simulated Historical time series

Simulated historical time series are similar to the external historical time series in that they are continuous in time. The difference is that the time series are referenced through the forecast (model) run that produced them. As a consequence, the time series can be retrieved either by opening the run and viewing it directly or, if the run is approved, by default. If you use an extended relativeViewPeriod and the readWriteMode "read only" with a simulated historical time series, you can access the combined results of several model runs (within the specified relativeViewPeriod), similar to the default behaviour of the merged external historical time series. If you use the readWriteMode "read complete forecast" without a relativeViewPeriod, you will only obtain the current forecast.

Simulated historical time series are generally produced by model runs where a model initial state is used. Each time series has a history, i.e. the state used as its initial condition. Each state again has a history, i.e. the model run that produced the state. This history is used by the database in constructing a continuous time series.

Simulated Forecasting time series

Simulated forecast time series are similar to external forecasting time series. Again, the main difference is that they are referenced through the forecast (model) run that produced them. As a consequence, the time series can be retrieved either by opening the run and viewing it directly or, if the run is approved, by default. Simulated forecast time series are treated in the same way as the external forecast time series in that the last approved forecast (referred to as the current forecast) is shown by default. All other runs can be seen on request only. Note that the last approved forecast, which is shown by default, may not be the last available forecast.
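The distinction between the current forecast and the last available forecast can be illustrated with a minimal Python sketch. The run records and field names (`run_id`, `t0`, `approved`) are hypothetical, and ordering the runs by T0 is an assumption made for the example.

```python
from datetime import datetime

# Hypothetical run records; ordering by T0 is an assumption of this sketch.
runs = [
    {"run_id": "run-1", "t0": datetime(2024, 1, 1, 0), "approved": True},
    {"run_id": "run-2", "t0": datetime(2024, 1, 1, 6), "approved": True},
    {"run_id": "run-3", "t0": datetime(2024, 1, 1, 12), "approved": False},
]

def current_forecast(runs):
    """Return the last approved run, which is the one shown by default."""
    approved = [run for run in runs if run["approved"]]
    return max(approved, key=lambda run: run["t0"]) if approved else None

def last_available_forecast(runs):
    """Return the last run regardless of its approval status."""
    return max(runs, key=lambda run: run["t0"]) if runs else None

current = current_forecast(runs)
latest = last_available_forecast(runs)
```

Here `current` is run-2 while `latest` is run-3, which can only be seen on request because it has not been approved.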

...

Info

The time series type simulated historical should only be assigned to time series that are related to a previous time series through a model state. In all other cases the time series is independent and should be assigned the time series type simulated forecasting.

Temporary time series

Temporary time series are deleted at the end of a run. They are only visible for the running task run.

Temporary External Forecasting time series (since 2017.01)

This type replaces the earlier approach of storing an external forecast with synch level 9 under a separate parameter that was used only in combination with synch level 9 and not in the final result.
Temporary time series are deleted at the end of a run. They are only visible to the running task run. You can now use the same parameter as used in the final result.

Time Series Sets

Any module in DELFT-FEWS that requires data from the database, or produces data that must be stored in the database, does so through the use of a complex data type referred to as the Time Series Set. A time series set can be compared to a query that is run against the database. It contains all the keys to uniquely identify the set of data to be retrieved (for more information on key attributes, see Key attributes).

...

Figure 3 Schema of the Time Series Set complex type

description

This is an optional description for the time series set. It is only used as a caption in the configuration and is not stored with the time series.

moduleInstanceId/moduleInstanceSetId

The module instance Id is the ID of the module that has written the data in the time series set to the database. This ID is one of the primary keys and is required to uniquely identify the data on retrieval.

...

One or more moduleInstanceId elements may be defined, or a single moduleInstanceSetId. These cannot be mixed.

valueType

This specifies the dimension/data type of the time series. This element is an enumeration of the following types:

...

  • polygon
  • sample

parameterId

The parameterId describes the parameter of the data in the time series. This Id is a cross reference to the Parameters.xml configuration file in the regional configuration defining the parameters. The reference is not enforced through an enumeration in the XML schema. If a parameter not included in the parameter definition is referred to, an error will be generated and an appropriate message returned.

locationId/locationSetId

The locationId is a reference to the location for which the data series is valid. Each individual data series may belong to one location only. In the time series set, either a single location or multiple locations may be referenced. Multiple locations are referenced either by including a list of locationIds or by referencing a locationSetId, which resolves to a list of locationIds as defined in the LocationSets.xml configuration file.

One or more locationIds may be defined, or a single locationSetId. These cannot be mixed.
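The "one or more ids, or a single set id, but not both" rule can be sketched as a small Python check. The function `resolve_location_ids` and the dictionary standing in for LocationSets.xml are hypothetical; the real validation is performed by DELFT-FEWS against its own configuration.

```python
def resolve_location_ids(location_ids, location_set_id, location_sets):
    """Return the list of location ids a time series set refers to.

    location_sets is a plain dict standing in for LocationSets.xml
    (an assumption made for this sketch).
    """
    if location_ids and location_set_id:
        raise ValueError("locationId and locationSetId cannot be mixed")
    if location_set_id:
        return list(location_sets[location_set_id])
    if location_ids:
        return list(location_ids)
    raise ValueError("define one or more locationIds or a single locationSetId")

location_sets = {"Gauges": ["H-2001", "H-2002", "H-2003"]}
from_set = resolve_location_ids([], "Gauges", location_sets)
from_list = resolve_location_ids(["H-2001", "H-2002"], None, location_sets)
```

The same pattern applies to moduleInstanceId and moduleInstanceSetId.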

timeSeriesType

This specifies the type of time series (see discussion above). This is an enumeration of:

...

  • simulated forecasting

timeStep

This is the time step of the time series. The time step can be either equidistant or non-equidistant. The time step is defined by the attributes of the timeStep element:

...

  • timeZone defines the time zone of the timeStep; this is only relevant for units of a day or larger.

    Note

    For hourly timesteps this may also be relevant in the case of half-hourly timezones. Untested at the moment.


    For more information, have a look at the TimeStep Documentation

relativeViewPeriod / relativeForecastPeriod

The relative view period defines the span of time for which data is to be retrieved. This span of time is referenced to the start time of the forecast run (T0) the time series set is used in. If the time series set is not used in a forecast run (e.g. in the displays), then the reference is to the DELFT-FEWS system time.

...


Figure 4 Schematic representation of the relative view period with reference to the T0. The start and end time defined may be overruled if the appropriate parameters are set to true.
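How a relative view period translates into absolute times can be shown with a short Python sketch. The function `resolve_view_period` and its parameters are hypothetical names chosen for the illustration, and the overruling of start and end times mentioned in the caption is not modelled.

```python
from datetime import datetime, timedelta

def resolve_view_period(t0, start_offset, end_offset):
    """Convert offsets relative to T0 into an absolute (start, end) pair."""
    start = t0 + start_offset
    end = t0 + end_offset
    if start > end:
        raise ValueError("the start of the view period lies after its end")
    return start, end

# T0 is the forecast start time, or the system time when used in a display.
t0 = datetime(2024, 1, 1, 12)
start, end = resolve_view_period(t0, timedelta(hours=-48), timedelta(hours=24))
```

With these values the view period runs from 30 December 2023 12:00 to 2 January 2024 12:00.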

cycle

If cycle is specified, the data is repeated periodically with this cycle as the period length. Original data exists for only one cycle; after that cycle the data is repeated periodically. In other words, when cycle is defined, a missing value is filled with the value from the last available cycle before the missing value.
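The filling behaviour of cycle can be illustrated with a minimal Python sketch. It assumes an equidistant series held in a list, `None` as the missing value marker, and a cycle length expressed in time steps; the function name `fill_from_cycle` is hypothetical.

```python
def fill_from_cycle(values, cycle_length):
    """Fill missing values (None) with the value one or more cycles earlier."""
    filled = list(values)
    for i, value in enumerate(filled):
        if value is not None:
            continue
        # Step back one cycle at a time until a value is found.
        j = i - cycle_length
        while j >= 0 and filled[j] is None:
            j -= cycle_length
        if j >= 0:
            filled[i] = filled[j]
    return filled

# One cycle of original data (three steps) followed by missing values.
repeated = fill_from_cycle([1.0, 2.0, 3.0, None, None, None, None], cycle_length=3)
```

In this sketch `repeated` becomes `[1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]`. A missing value with no earlier cycle to draw from stays missing.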

externalForecastMaxAge

When the externalForecastMaxAge is not configured, there is no maximum age for a forecast series to be used, so the returned external forecast can be very old when no recent forecast is available. External forecasts with a forecast time after T0 are always ignored. The age of an external forecast is defined as the time span between the external forecast time and T0.
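The effect of a maximum age on the selection of an external forecast can be sketched in Python as follows. The function `select_with_max_age` is hypothetical, and treating a forecast whose age is exactly equal to the maximum as still valid is an assumption of this sketch.

```python
from datetime import datetime, timedelta

def select_with_max_age(forecast_times, t0, max_age=None):
    """Return the most recent usable external forecast time, or None."""
    best = None
    for forecast_time in forecast_times:
        if forecast_time > t0:
            continue  # forecasts after T0 are always ignored
        age = t0 - forecast_time
        if max_age is not None and age > max_age:
            continue  # older than the configured maximum age
        if best is None or forecast_time > best:
            best = forecast_time
    return best

t0 = datetime(2024, 1, 10, 0)
forecast_times = [datetime(2024, 1, 1, 0), datetime(2024, 1, 11, 0)]
without_limit = select_with_max_age(forecast_times, t0)
with_limit = select_with_max_age(forecast_times, t0, max_age=timedelta(days=2))
```

Without a limit the nine-day-old forecast is returned; with a maximum age of two days nothing is returned.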

...

  • divider same function as the multiplier, but defines fraction of units.

externalForecastTimeCardinalTimeStep

When no external forecast exists in the data store younger than the specified age, a new external forecast is returned with a minimum age that applies to the specified cardinal time step.

...

  • timeZone defines the time zone; this is only relevant for units of a day or larger.

readWriteMode

The readWriteMode definition is mainly used in the definition of filters to be applied in the time series display when used in edit mode. This element is an enumeration of:

...

It is a good convention to set this property to read only in all input blocks.

synchLevel

This is an integer value determining how the data is stored and synchronised through the distributed system. There is no enumeration as the synchLevel is used in the configuration of synchronisation, where optimisations can be defined for each synchLevel. The convention used is explained in the Live System configuration section.

expiryTime

This element allows the time series created to have a different expiry time to the default expiry time. This means it may be removed earlier, or later, by the rolling barrel function. For temporary series the value may be set to a very brief period. For other time series (e.g. Astronomical input series), the value should be set sufficiently high.

...

  • divider same function as the multiplier, but defines fraction of units.

delay

This element allows the time series retrieved to be lagged (positive or negative). The time stamps of the series will then be shifted by the period specified on retrieval. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.

...

  • divider same function as the multiplier, but defines fraction of units.

multiplier

This element allows the time series retrieved to be multiplied by the factor given. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.

divider

This element allows the time series retrieved to be divided by the factor given. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.

incrementer

This element allows the time series retrieved to be incremented by the value given. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.
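The four retrieval-time adjustments (delay, multiplier, divider and incrementer) can be pictured with the Python sketch below. The function names are hypothetical, the sign convention of the delay (a positive delay moves time stamps forward) is an assumption, and the order in which DELFT-FEWS combines several adjustments is not specified here, so each one is shown on its own.

```python
from datetime import datetime, timedelta

def with_delay(series, delay):
    """Shift the time stamps of the retrieved series by the given period."""
    return [(time + delay, value) for time, value in series]

def with_multiplier(series, factor):
    """Multiply the retrieved values by the given factor."""
    return [(time, value * factor) for time, value in series]

def with_divider(series, factor):
    """Divide the retrieved values by the given factor."""
    return [(time, value / factor) for time, value in series]

def with_incrementer(series, increment):
    """Add the given increment to the retrieved values."""
    return [(time, value + increment) for time, value in series]

# The stored series is never modified; every function returns a new list.
stored = [(datetime(2024, 1, 1, 0), 2.0), (datetime(2024, 1, 1, 1), 4.0)]
shifted = with_delay(stored, timedelta(hours=1))
scaled = with_multiplier(stored, 0.5)
```

Because each function returns a copy, `stored` stays unchanged, which mirrors the statement that the adjustments apply on retrieval only and are not inverted when submitting data.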

ensembleId

A time series set may be defined to retrieve all members of an ensemble at once (for example in the evaluation of ensemble statistics). This is done by defining the optional ensembleId. The ensembleId should also be defined when writing new ensemble members (e.g. on importing ensembles in the import module).

...

Info

When dealing with ensembles, the ensembleId needs only be defined if the workflow activity that is used must retrieve the complete ensemble, or if members are to be written under a new ensembleId. In other cases the ensembleId needs only be defined in the workflow definition (see Workflows chapter). For the TimeSeriesSets defined in modules there is then no difference between running in ensemble mode and running normally.

visibilityControllingFlagSourceColumnId

By defining a visibilityControllingFlagSourceColumnId element in the timeSeriesSet, the existence of a flag source column for a particular time series step can be used as a condition, for example for a transformation. When the configured flag source column does not exist for a particular time series step, the value is perceived as missing by the transformation. All values that do not pass this validation step are removed on read: they are set to missing for equidistant time series and removed for non-equidistant time series. The flagSourceColumnId must be defined in the flagSourceColumn.xml configuration file.
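The difference between equidistant and non-equidistant series in this filter can be illustrated with a short Python sketch. The tuple layout `(time, value, has_flag_source)` and the function name `apply_visibility_filter` are hypothetical, and `None` stands in for a missing value.

```python
MISSING = None  # stand-in for a missing value in this sketch

def apply_visibility_filter(steps, equidistant):
    """Hide values whose controlling flag source column does not exist.

    steps is a list of (time, value, has_flag_source) tuples.
    """
    visible = []
    for time, value, has_flag_source in steps:
        if has_flag_source:
            visible.append((time, value))
        elif equidistant:
            visible.append((time, MISSING))  # keep the step, blank the value
        # Non-equidistant series: the step is dropped entirely.
    return visible

steps = [(0, 1.5, True), (1, 1.7, False), (2, 1.9, True)]
equidistant_result = apply_visibility_filter(steps, equidistant=True)
non_equidistant_result = apply_visibility_filter(steps, equidistant=False)
```

The equidistant result keeps three steps with the middle value missing, whereas the non-equidistant result contains only the two steps that passed.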

Key attributes

Key attributes are those attributes of a timeSeriesSet element that distinguish one timeSeriesSet from another. Key attributes are:

...

Other attributes of the timeSeriesSet only define which part of the whole time series Delft-FEWS should process (e.g. relativeViewPeriod), or which temporary transformation should be applied (e.g. incrementer).

Manual data edits

Time series can be edited by the user. This can be done from the explorer (filters.xml) or from the predefined displays (displaygroups.xml). The readWriteMode of the timeSeriesSet must not be set to "read only". Manually edited time series are handled in the following ways.

Automatic overwrite of manually edited values

  • Manually corrected (not completed) imported values will not be rolled back when the value is reimported.
  • Values added in the period after T0 in an external historical time series WILL be overwritten by imported values.
  • All other modules that generate the time series (such as a transformation) WILL overwrite manually edited data.

Expiry time of manually edited values

  • A manually edited value gets the expiry time of the timeSeriesSet it belongs to in filters.xml or displaygroups.xml (depending on the display from which the edit session was started).
  • If no expiryTime is configured, the default expiry time from the global.properties file is used.
  • If this default expiry time is not set either, a default of 10 days is used.
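The fallback order listed above can be summarised in a minimal Python sketch. The function `expiry_for_manual_edit` and its parameter names are hypothetical; only the order of precedence and the 10-day default are taken from the list.

```python
from datetime import timedelta

FALLBACK_EXPIRY = timedelta(days=10)  # used when nothing else is configured

def expiry_for_manual_edit(time_series_set_expiry=None, global_default_expiry=None):
    """Return the expiry time that applies to a manually edited value."""
    if time_series_set_expiry is not None:
        return time_series_set_expiry  # expiryTime of the timeSeriesSet
    if global_default_expiry is not None:
        return global_default_expiry  # default from global.properties
    return FALLBACK_EXPIRY

from_set = expiry_for_manual_edit(timedelta(days=365), timedelta(days=30))
from_global = expiry_for_manual_edit(None, timedelta(days=30))
from_fallback = expiry_for_manual_edit()
```

The three calls return 365 days, 30 days and 10 days respectively.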

...