Within Delft-FEWS, datasets are exported to the archive by a workflow that uses the ExportArchiveModule. This workflow needs to be scheduled at a regular interval to archive all relevant data. Simulated datasets are preferably exported to the archive from the workflow that created the simulation, especially when modifiers are used in the simulation. If a simulation is not exported to the archive directly, some of the modifiers used in the simulation may be deleted or changed in the meantime. In that case it is no longer possible to archive all the modifiers that were used correctly.
To prevent dependencies on other processes, the ExportArchiveModule is envisioned to write directly into the archive file storage. The FSS therefore needs write access to those disks.
For each kind of dataset, the ExportArchiveModule checks the database for changes over a (configured) relative period. It exports any data which meets the export instructions and has changed within this period. Datasets are archived in a pre-defined directory structure, which is based on areaId, date and dataset.
The schema of the associated configuration file (Figure 4.1) is defined at:
Figure 4.1 Top level of Delft-FEWS exportArchiveModule.xsd
For observations, a dataset is generated for every area on a daily basis. The associated directory structure of the Delft-FEWS export for this type of dataset is as follows:
When Delft-FEWS generates the netCDF file, data is written to the same data block when the entire matrix is filled, i.e. all time steps are regular and none of the values are missing. For locations with irregular time stamps or missing values, a separate data block is used.
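The rule above can be illustrated with a minimal Python sketch. It is not the Delft-FEWS implementation. The function name `split_into_blocks`, the input layout and the example values are hypothetical and serve only to show how locations with a complete, regular series could be told apart from those that need a separate data block.

```python
import math


def split_into_blocks(series_by_location, expected_times):
    """Partition locations into a shared block and a list needing separate blocks.

    series_by_location: dict mapping a location id to a list of (time, value) pairs
    expected_times: the regular time axis of the shared block
    (both arguments are hypothetical inputs for this illustration)
    """
    shared, separate = [], []
    for location, points in series_by_location.items():
        times = [t for t, _ in points]
        values = [v for _, v in points]
        has_missing = any(
            v is None or (isinstance(v, float) and math.isnan(v)) for v in values
        )
        # A location joins the shared block only when every expected
        # time step is present and no value is missing.
        if times == list(expected_times) and not has_missing:
            shared.append(location)
        else:
            separate.append(location)
    return shared, separate


example = {
    "A": [(0, 1.0), (1, 1.2), (2, 1.1)],           # complete and regular
    "B": [(0, 2.0), (1, float("nan")), (2, 2.2)],  # one missing value
    "C": [(0, 3.0), (2, 3.1)],                     # irregular time steps
}
shared, separate = split_into_blocks(example, [0, 1, 2])
```

In this example only location `A` fills the matrix completely, so `B` and `C` would end up outside the shared block.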
Within the netCDF file, each data block is accompanied by a header. Each header includes a metadata item called 'timeseries_sets_xml', which holds the exact definition of the timeSeriesSet as the data was stored in the FEWS database. This feature allows full reproducibility of the time series via an import workflow in a Delft-FEWS stand-alone application, assuming that the associated Delft-FEWS configuration is in place.
The relative period used in the export of observed datasets requires some additional explanation. The relative period defines which daily datasets are written to the archive. The export of observed data is usually scheduled once a day after midnight, when all the data for the prior day has been imported and is available in FEWS. Because it is common to configure a data export of several days, datasets that were written earlier will be overwritten. This is done for several reasons. First of all, it ensures that when a data export was not executed for several days because of a system error, the missing datasets will be added during the next run. In addition, it allows data edits to be stored in the archive. To ensure that data edits are stored in the archive, the time window configured in your relative period must be large enough to capture them. However, it is very important that the relative period is not too large! If the relative period starts at a time for which the data has already expired from the database, you will write missing values to the archive and erase the data in your archive! A relative period of -10 to -1 is usually sufficient for most systems.
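The effect of the relative period can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the period bounds are taken to be in days relative to T0, "expired" is taken to mean that the data is no longer held in the operational FEWS database, and the function names and dates are hypothetical.

```python
from datetime import date, timedelta


def daily_datasets(t0, start_days, end_days):
    """List the calendar days covered by a relative period (bounds in days relative to T0)."""
    return [t0 + timedelta(days=offset) for offset in range(start_days, end_days + 1)]


def period_is_safe(t0, start_days, oldest_available_day):
    """Return True when the period starts no earlier than the oldest day still available.

    oldest_available_day is a hypothetical input: the first day for which the
    database still holds data (anything older is assumed to have expired).
    """
    return t0 + timedelta(days=start_days) >= oldest_available_day


t0 = date(2024, 3, 15)                    # example T0, chosen arbitrarily
days = daily_datasets(t0, -10, -1)        # the ten days before T0
oldest = date(2024, 3, 1)                 # example expiry horizon
safe_short = period_is_safe(t0, -10, oldest)
safe_long = period_is_safe(t0, -30, oldest)
```

With these example values the period of -10 to -1 stays within the available data, while a period starting at -30 reaches back before the assumed expiry horizon and would risk overwriting archived days with missing values.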
The exportArchiveModule.xsd has a dedicated exportObserved section to configure the observed timeseries that need to be archived (see Figure 4.2). Table 4.1 documents the associated elements:
Export destination folder; assumes that the account running the FEWS (FSS) application has write access
Exports entire dataset by day, for any day where a database change (blob creation time) is detected within the relativePeriod (relative to T0). Existing timeseries files are overwritten*
idMap applied to translate internal FEWS identifiers to identifiers that meet NetCDF-CF criteria. E.g. netCDF does not allow a full stop ('.') in the variable name
without nc extension, preferably no spaces
area to which the dataset belongs
optional metadata tags within NetCDF file following CF convention. Supported by the internal catalogue of the THREDDS Data Server
default=TRUE; if TRUE, a list of flags is stored, each value pointing to the associated flag
default=TRUE; if TRUE, a list of comments is stored, each value pointing to the associated comment
identifies FEWS ThresholdGroup which is used to detect threshold crossings to be highlighted in the metaData.xml
FEWS timeseries sets
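The kind of translation described for the idMap entry in the table above can be illustrated with a short Python sketch. In Delft-FEWS the mapping is configured explicitly in an idMap, so the automatic rule below is an assumption made for illustration only. The function names, the underscore replacement and the example identifiers are all hypothetical.

```python
import re


def to_netcdf_name(fews_id):
    """Replace every character outside [A-Za-z0-9_] with an underscore (illustrative rule)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", fews_id)


def build_id_map(fews_ids):
    """Map each internal id to a translated name, refusing ambiguous translations."""
    mapping = {}
    used = {}
    for fews_id in fews_ids:
        external = to_netcdf_name(fews_id)
        if external in used:
            # Two internal ids collapsing to one name would make the export ambiguous.
            raise ValueError(
                f"{fews_id!r} and {used[external]!r} both map to {external!r}"
            )
        used[external] = fews_id
        mapping[fews_id] = external
    return mapping


id_map = build_id_map(["H.obs", "Q.sim"])
```

The collision check matters because a simple replacement rule can map two distinct internal identifiers onto the same external name, which is one reason an explicit, reviewed mapping is preferable to an automatic one.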