Introduction

One of the most important properties of Delft-FEWS as a forecasting system is its ability to efficiently deal with large volumes of dynamic data. Dynamic data covers mainly time series data in various formats (scalar - 0D, vector - 1D, longitudinal profile - 1D and 2D, grid - 2D and 3D, and polygon data - 2D). Dynamic data also includes the management of model states produced by the system. As Delft-FEWS is not only used as an operational forecasting system, but also as a system to run climate scenarios, the length of each time series can be up to 2,000,000,000 time steps. A thorough understanding of how Delft-FEWS handles dynamic data is fundamental to correctly configuring an operational system. Specific optimisations are available for each type of dynamic data. This chapter introduces the concept of a "Time Series Set". A Time Series Set is used to retrieve data from and submit data to the database.

Types of time series

External and Simulated time series

Time series are considered to be available from two sources. All time series sourced from external systems are considered as "External". All time series produced by the forecasting system itself are considered as "Simulated".

Forecast and Historical time series

Time series are considered to be of two categories in relation to time. Historical time series are continuous time series that describe a parameter at a location over a period of time. Forecast time series are different to historical time series in that for each location and parameter one forecast is independent of another forecast. A forecast is characterised by its start time and the period it covers. Generally, when a new forecast is available for a given location and parameter, it will supersede any previous forecast for that location and parameter. Each forecast is therefore an independent entity.

On the basis of this, six categories of time series are identified;

...

There are significant differences in how each of these categories of time series is handled in Delft-FEWS.

External Historical time series

In an online operational system, Delft-FEWS incrementally imports observed data as it becomes available from external systems. These data should be imported as an External Historical time series. When data marked as external historical are presented to the system with exactly the same values and covering the same period as data for that location/parameter already available in the database, they will be ignored (i.e. not imported). Only new data are imported and stored. If data for a given period are already available but have changed (manual edit or update), then the new values will be added to the database. For each item of data added to the database, a time stamp is included to specify when the data was made available to the system.

When external historical data are requested from the database, the most recently available data over that whole period is returned. If the data for that period was imported piecewise, then the individual pieces will be merged prior to the data being returned. An example is given in Figure 1, where data are imported sequentially. Each dataset imported/edited is indicated using a different line style. When the complete series (a) is requested, the most recent data available over the complete period is merged and returned. The data imported at 12:00 partially overlaps that imported at 10:00. Since the 12:00 data is the most recent, it will persist in the complete series. A manual edit (or interpolation) may be done to fill the gap in the data. This will be returned in a subsequent request for the complete series. Although a complete series is returned, the data is stored as it is imported, including a time stamp indicating when the import happened. If at a later stage the data available directly preceding the manual edit is requested, then the additional data will not be included in the complete series.

...

External forecasts are imported by Delft-FEWS as they are made available by external forecasting systems. Each forecast is imported and stored individually. External forecasts are referenced by the start time of the forecast. When retrieving an external forecast time series from the database, the most recently available forecast, as indicated by the forecast start time, will be returned. The most recently available forecast is determined as the latest forecast with a start time earlier than or equal to the start of the forecast to be made using Delft-FEWS (forecast T0). It is thus not possible to see an external forecast time series on request, as the latest available is always returned.

...

Simulated historical time series are similar to external historical time series in that they are continuous in time. The difference is that the time series are referenced through the forecast (model) run that produced them. As a consequence, the time series can be retrieved either by requesting it directly through opening the run and viewing it, or if the run is approved. If you use an extended relativeViewPeriod and the readWriteMode "read only" with a simulated historical time series, you can access the combined results of several model runs (within the specified relativeViewPeriod), similar to the default behaviour of the merged external historical time series. If you use readWriteMode 'read complete forecast' without a relativeViewPeriod, you will only obtain the current forecast.
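The combination described above (simulated historical type, extended relativeViewPeriod, readWriteMode "read only") could be configured roughly as in the sketch below. All ids are hypothetical placeholders, and the 30-day view period is an arbitrary illustration of "extended", not a recommended value.

```xml
<!-- Illustrative sketch only: moduleInstanceId, parameterId and locationId are hypothetical -->
<timeSeriesSet>
	<moduleInstanceId>ExampleModel_Historical</moduleInstanceId>
	<valueType>scalar</valueType>
	<parameterId>Q.sim</parameterId>
	<locationId>ExampleLocation</locationId>
	<timeSeriesType>simulated historical</timeSeriesType>
	<timeStep unit="hour"/>
	<!-- view period long enough to span several consecutive model runs -->
	<relativeViewPeriod unit="day" start="-30" end="0"/>
	<readWriteMode>read only</readWriteMode>
</timeSeriesSet>
```

Replacing the relativeViewPeriod and readWriteMode with only a readWriteMode of 'read complete forecast' would, per the text above, return just the current forecast.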

Simulated historical time series are generally produced by model runs where a model initial state is used. Each time series has a history, i.e. the state used as its initial condition. Each state again has a history, i.e. the model run that produced the state. This history is used by the database in constructing a continuous time series.

...

Figure 2 schematically shows how a sequence of runs producing simulated historical and simulated forecasting time series is stored. Each simulated historical run uses the module state saved at the end of the previous run. It can be seen that these simulated historical traces are treated as a continuous time series when requested later. For the forecasting time series, only the most recent (approved) time series is displayed.


Figure 2 Schematic overview of handling simulated forecasting and simulated historical time series. Three subsequent forecasts are shown, and the resulting complete time series returned when requested after 12:00. The historical time series is traced back using the state used to create the link to a previous run. For the forecast time series the most recent forecast supersedes previous forecasts.

...

Time Series Sets

Any module in Delft-FEWS that requires data from the database, or produces data that must be stored in the database, does so through the use of a complex data type referred to as the Time Series Set. A time series set can be compared to a query that is run against the database. It contains all the keys to uniquely identify the set of data to be retrieved (for more information on key attributes, see Key attributes).
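To make the query analogy concrete, the sketch below shows what a complete time series set might look like for 15-minute observed water levels at a single gauge. The ids are hypothetical placeholders, and the element order simply follows the other examples on this page.

```xml
<!-- Illustrative sketch only: all ids are hypothetical placeholders -->
<timeSeriesSet>
	<moduleInstanceId>ImportObserved</moduleInstanceId>
	<valueType>scalar</valueType>
	<parameterId>H.obs</parameterId>
	<locationId>ExampleGauge</locationId>
	<timeSeriesType>external historical</timeSeriesType>
	<timeStep unit="minute" multiplier="15"/>
	<!-- retrieve the 48 hours leading up to T0 -->
	<relativeViewPeriod unit="hour" start="-48" end="0"/>
	<readWriteMode>read only</readWriteMode>
</timeSeriesSet>
```

Read as a query, this asks the database for the scalar series of parameter H.obs at ExampleGauge, as written by the ImportObserved module instance, over the two days before T0.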

...

Optional items may also be required to fulfill the requirements of the module using the time series set. This will be indicated in this manual for those modules where appropriate.

...

This specifies the type of time series (see discussion above). This is an enumeration of;

  •         external historical
  •         external forecasting
  •         simulated historical
  •         simulated forecasting
  •         temporary (time series will not be stored in the database, but will be available for later processes in the same workflow)
  •         temporary external forecasting (same as previous, but has an external forecast time)

timeStep

This is the time step of the time series. The time step can be either equidistant or non-equidistant. The time step is defined in the parameters of the timeStep element;

...

The relative view period defines the span of time for which data is to be retrieved. This span of time is referenced to the start time of the forecast run (T0) the time series set is used in. If the time series set is not used in a forecast run (e.g. in the displays), then the reference is to the Delft-FEWS system time.

Parameters

...

Info

This start and end time are the DEFAULT time span of data. Both can be overruled through user selection. There is no restriction on shortening the start time through user selection (since 2015.01).

...



Figure 4 Schematic representation of the relative view period with reference to the T0. The start and end time defined may be overruled if the appropriate parameters are set to true.
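As a small worked illustration of the relative view period, consider the fragment below. The attributes unit, start, end and endOverrulable appear in the example elsewhere on this page; startOverrulable is assumed here by analogy and should be checked against the schema.

```xml
<!-- Sketch: 48 hours before T0 up to 24 hours after T0 -->
<!-- startOverrulable is an assumed attribute name, by analogy with endOverrulable -->
<relativeViewPeriod unit="hour" start="-48" end="24" startOverrulable="true" endOverrulable="true"/>
```

With a T0 of 2024-01-10 12:00, this default span would run from 2024-01-08 12:00 to 2024-01-11 12:00. If the time series set is used in a display rather than a forecast run, the same offsets would apply relative to the system time.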

cycle

If cycle is specified, the data is repeated periodically with this cycle as the period length. Original data exists for only one cycle. Beyond the dates specified in the cyclic time series, the data is repeated periodically. In other words, when cycle is defined, a missing value is filled with the value from the last available cycle before the missing value. Be aware that retrieving data from a cyclic time series requires a relative view period to ensure that the time stamps of the array are set properly. If you retrieve a cyclic time series with the 'read complete forecast' readWriteMode, the array read will have time stamps that correspond to the original data.

externalForecastMaxAge

When externalForecastMaxAge is not configured, there is no maximum age for a forecast series to be used, so the returned external forecast can be very old when no recent forecast is available. ALL external forecasts after T0 are ALWAYS ignored. The age of an external forecast is defined as the time span between the external forecast time and T0.
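The age rule can be illustrated with a short sketch. The unit and multiplier attributes are assumed here to follow the same pattern as expiryTime and delay; the 12-hour limit is an arbitrary example value.

```xml
<!-- Sketch: accept only external forecasts that are at most 12 hours older than T0 -->
<!-- attribute names assumed to follow the unit/multiplier pattern used by expiryTime -->
<externalForecastMaxAge unit="hour" multiplier="12"/>
```

With T0 at 12:00, an external forecast with a forecast time of 06:00 on the same day has an age of 6 hours and would be eligible. One with a forecast time of 18:00 on the previous day has an age of 18 hours and would be considered too old. One with a forecast time of 13:00 lies after T0 and is ignored regardless of this setting.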

...

  •         timeZone defines the timeZone, this is only relevant for units of a day or larger.
externalForecastSearchTimeStep

Since 2024.01. Only time series with an external forecast time that matches this time step are visible while searching, e.g. an external forecast time series that has an externalForecastTime of 18:00 in the GMT timeZone.

All timeStep attributes are available; the most relevant options are probably;

  •         id Id of the time step. You can reference time steps defined in the regionConfig/timeSteps.xml.
  •         times Defines the time step by a list of times without dates, e.g. "10:00 23:00"
  •         timeZone defines the timeZone, this is only relevant for units of a day or larger.


Example:
<timeSeriesSet>
	<moduleInstanceId>ImportACCESS-GE</moduleInstanceId>
	<valueType>grid</valueType>
	<parameterId>P.nwp.fcst</parameterId>
	<locationId>ACCESS-GE_grid</locationId>
	<timeSeriesType>external forecasting</timeSeriesType>
	<timeStep unit="hour"/>
	<relativeForecastPeriod unit="hour" start="0" end="246"/>
	<externalForecastSearchTimeStep times="18:00" timeZone="GMT"/>
	<readWriteMode>read complete forecast</readWriteMode>
</timeSeriesSet>


readWriteMode

The readWriteMode definition is mainly used in the definition of filters to be applied in the time series display when used in edit mode. This element is an enumeration of;

  •         read only implies the data cannot be edited.
  •         add originals implies the data is new and is added to the database.
  •         editing only visible to current task runs implies any changes made remain invisible to other tasks (used in What-If scenarios)
  •         editing visible to all future task runs implies any changes made will be visible to other tasks
  •         read originals only implies all edited, corrected or interpolated data should be ignored.

The only enumeration that can be used in time series sets in FEWS modules is:

  •         read complete forecast reads the complete forecast series from the database. If this enumeration element is used, no relativeViewPeriod has to be configured.

It is a good convention to set this property to read only in all input blocks.

synchLevel

This is an integer value determining how the data is stored and synchronised through the distributed system, or filtered when a database snapshot is created. There is no enumeration as the synchLevel is used in the configuration of synchronisation, where optimisations can be defined for each synchLevel. The convention used is explained in the Live System configuration section. When no synchLevel is configured, synchLevel 0, 1, 2, 5, 6 or 9 is assigned automatically, depending on the time series type and value type (see B Enumerations).

expiryTime

This element allows the time series created to have a different expiry time to the default expiry time. This means it may be removed earlier, or later, by the rolling barrel function.

Attributes;

  •         unit (enumeration of: second, minute, hour, day, week)
  •         multiplier defines the number of units given above.
  •         divider same function as the multiplier, but defines fraction of units.

delay

This element allows the time series retrieved to be lagged (positive or negative). The time stamps of the series will then be shifted by the period specified on retrieval. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.


Attributes;

  •         unit (enumeration of: second, minute, hour, day, week)
  •         multiplier defines the number of units given above.
  •         divider same function as the multiplier, but defines fraction of units.

multiplier

This element allows the time series retrieved to be multiplied by the factor given. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.

divider

This element allows the time series retrieved to be divided by the factor given. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.

incrementer

This element allows the time series retrieved to be incremented by the value given. This is used only when retrieving time series from the database, and not inversely when submitting time series to the database.
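The read-time adjustments described above (delay, multiplier, divider and incrementer) might be used as in the fragment below. It assumes that multiplier, divider and incrementer take a plain numeric value as element content and that delay accepts a negative multiplier. The direction in which a positive or negative delay shifts the time stamps, and the order in which combined adjustments are applied, are not stated on this page and should be verified before relying on them.

```xml
<!-- Fragment of a timeSeriesSet; values are illustrative only -->
<!-- lag the retrieved series by six hours (negative lag) -->
<delay unit="hour" multiplier="-6"/>
<!-- divide retrieved values by 1000, e.g. to present millimetres as metres -->
<divider>1000</divider>
<!-- alternative for another series: add a constant offset, e.g. degrees Celsius to Kelvin -->
<!-- <incrementer>273.15</incrementer> -->
```

With the divider alone, a stored value of 250 would be returned as 0.25. With the incrementer alone, a stored value of 20 would be returned as 293.15. In both cases the values in the database remain unchanged, since these elements apply only on retrieval.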

ensembleId

A time series set may be defined to retrieve all members of an ensemble at once (for example in evaluation of ensemble statistics). This is done by defining the optional ensembleId. The ensembleId should also be defined when writing new ensemble members (e.g. on importing ensembles in the import module).

Example:

<timeSeriesSet>
	<moduleInstanceId>Import_NWS_GEFS</moduleInstanceId>
	<valueType>grid</valueType>
	<parameterId>P.forecast</parameterId>
	<locationId>GEFS</locationId>
	<timeSeriesType>external forecasting</timeSeriesType>
	<timeStep unit="hour" multiplier="6"/>
	<readWriteMode>add originals</readWriteMode>
	<ensembleId>GEFS</ensembleId>
</timeSeriesSet>


Info

When dealing with ensembles, the ensembleId needs only be defined if the workflow activity that is used must retrieve the complete ensemble, or if members are to be written under a new ensembleId. In other cases the ensembleId needs only be defined in the workflow definition (see Workflows chapter). For the TimeSeriesSets defined in modules there is then no difference between running in ensemble mode and running normally.

visibilityControllingFlagSourceColumnId

By defining a 'visibilityControllingFlagSourceColumnId' element in the timeSeriesSet, the existence of a flagSourceColumn for a particular time series (step) can be used as a condition for, for example, a transformation. When the configured flagSourceColumn does not exist for a particular time series step, the value will be perceived as 'Missing' by the transformation. All values that did not pass this validation step are removed on read. They are set to missing for equidistant time series and removed for non-equidistant time series. The flagSourceColumnId needs to be defined in the flagSourceColumn.xml configuration file.

onlyReliableFlagSourceColumnId

By defining an 'onlyReliableFlagSourceColumnId' element in the timeSeriesSet, the existence of a flagSourceColumn for a particular time series (step) can be used as a condition for, for example, a transformation or explorer filter, but only for reliable data. When the configured flagSourceColumn does not exist for a particular time series step, the value will be perceived as 'Missing' by the transformation. All values that did not pass this validation step are removed on read. They are set to missing for equidistant time series and removed for non-equidistant time series. The flagSourceColumnId needs to be defined in the flagSourceColumn.xml configuration file. This works the same as the visibilityControllingFlagSourceColumnId element above, but only for reliable data. It can be used together with a visibilityControllingFlagSourceColumnId. In that case the data needs to have either the visibilityControllingFlagSourceColumnId (the quality flag does not matter), or the onlyReliableFlagSourceColumnId combined with a reliable quality flag.
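A sketch of how the two elements might be combined in one time series set is given below. The column ids "ManualCheck" and "AutoValidation" are hypothetical and would have to exist in flagSourceColumn.xml; the position of these elements within the timeSeriesSet is an assumption.

```xml
<!-- Fragment of a timeSeriesSet; column ids are hypothetical placeholders -->
<!-- values carrying the ManualCheck column are visible whatever their quality flag -->
<visibilityControllingFlagSourceColumnId>ManualCheck</visibilityControllingFlagSourceColumnId>
<!-- values carrying the AutoValidation column are visible only when reliable -->
<onlyReliableFlagSourceColumnId>AutoValidation</onlyReliableFlagSourceColumnId>
```

Under the rule described above, a value that carries neither column, or that carries only AutoValidation without being reliable, would be treated as missing on read.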

qualifierAggregation

This element can have the values sum, mean, min or max.

When specified, all time series that have ALL the qualifiers specified (in the same time series set) are aggregated to a single time series on read. It cannot be specified when the time series set is meant for writing.

...


Key attributes

Key attributes are those attributes of a timeSeriesSet element that distinguish one timeSeriesSet from another. Key attributes are:

...

  • Manually corrected (not completed) imported values will not be rolled back when the value is reimported.
  • Values added in the period after T0 in an external historical time series WILL be overwritten by imported values.
  • All other modules that generate the time series (such as a transformation) WILL overwrite manually edited data.

...

  • A manually edited value will get the expiry time of the time series set it is in, in the filters.xml or displaygroups.xml (depending on which display the edit session was started from).
  • If no expiryTime is configured, then the default expiry time from the global.properties file will be taken.
  • If this expiry time is not set, a default of 10 days is used.
