...

What

nameofinstance.xml

Description

Configuration for import module

schema location

https://fewsdocs.deltares.nl/schemas/version1.0/timeSeriesImportRun.xsd

Table of Contents

Time Series Import Module

...

Code Block
xml
xml
<general>
  <importTypeStandard>database</importTypeStandard>
  <jdbcDriverClass>com.mysql.jdbc.Driver</jdbcDriverClass>
  <jdbcConnectionString>jdbc:mysql://192.168.101.215/cwb_ac</jdbcConnectionString>
  <user>sobek</user>
  <password>Tohek>cwa</password>
  <relativeViewPeriod startOverrulable="true" endOverrulable="true" start="-1" end="1" unit="day"/>
  <table name="qpe_sums_obs">
    <dateTimeColumn name="rehdate"/>
    <valueColumn name="rad_gz" unit="mm/hr" locationId="Qesums" parameterId="P.radar.actual" parser="Mosaic"/>
  </table>
  <table name="qpe_sums_fo">
    <forecastDateTimeColumn name="createdate"/>
    <dateTimeColumn name="raddate"/>
    <valueColumn name="rad_gz" unit="mm/hr" locationId="Qpesums" parameterId="P.radar.forecast" parser="Mosaic"/>
  </table>
  <unitConversionsId>ImportUnitConversions</unitConversionsId>
  <importTimeZone>
    <timeZoneOffset>+00:00</timeZoneOffset>
  </importTimeZone>
  <dataFeedId>QPE_Sum</dataFeedId>
</general>

sftp example:

Code Block
xml
xml
<general>
  <importType>TypicalAsciiForecast</importType>
  <folder>sftp://wf:2004wrf@130.167.66.114:22/home/WRF</folder>
  <relativeViewPeriod startOverrulable="true" endOverrulable="true" start="-1" end="3" unit="day"/>
  <unitConversionsId>ImportUnitConversions</unitConversionsId>
  <importTimeZone>
    <timeZoneOffset>+00:00</timeZoneOffset>
  </importTimeZone>
  <dataFeedId>Forecast</dataFeedId>
</general>

...

Please note that the triggering files cannot be located on the FTP.

groupImportPattern

Since 2024.01 - Option to specify groups of files that need to be imported together. This can be useful for forecasts spread over several files which should be imported as one forecast, i.e. a group of files, in particular if each file is added to the import folder at a different time.

The filenames must contain the forecast date and time. The pattern to define a group is specified in fileNameDateTimePattern, with the expected number of files for one forecast in numberOfFiles. The files will be imported only when all expected files are in the import folder. A time span also has to be specified in waitingTime to define when to consider the forecast as failed and move it to the failed folder. If more than the waiting time has passed between the forecast time and T0, the forecast is considered failed and the files are moved to the failed folder.

This option can be used in combination with the importTriggeringFile option.

For example, for the following configuration:

Code Block
languagexml
<groupImportPattern>
    <numberOfFiles>6</numberOfFiles>
    <fileNameDateTimePattern>'timeseries_'MMddHHmm'_?'</fileNameDateTimePattern>
    <waitingTime unit="day" multiplier="1"/>
</groupImportPattern>

with the import folder containing the files: 

  • timeseries_01011200_1
  • timeseries_01011200_2
  • timeseries_01011200_3
  • timeseries_01011200_4
  • timeseries_01011200_5
  • timeseries_01011200_6
  • timeseries_01021200_1

The first six files will be imported because six files match the pattern "timeseries_01011200_?". The last file will not be imported yet because it is the only file matching the pattern "timeseries_01021200_?".

fileNamePatternFilter

Option to filter files that need to be imported based on their filename. This can be useful if there are files in the import folder that should not be imported:

e.g. *.xml to skip non-XML files.

Another use case is when you are importing a large number of big files which FEWS cannot import all at once due to memory issues (e.g. reanalysis data). In this case you can use the T0 of the system to select the files you want to import in one run:

e.g. <fileNamePatternFilter>%TIME_ZERO(yyyy)%??????.nc</fileNamePatternFilter>

fileNameObservationDateTimePattern

'IMAGE_'yyyyMMdd_HHmmss'.jpg' This will overrule the observation time stored in the file. Some grid formats don't contain the time at all, so for these files the pattern is required. Put the literal parts of the pattern between single quotes. If the literal parts contain unrelated numbers or an unrelated date, replace each digit with a ?. The ? should still be between the single quotes.

Example: T20190524_20190722 →  'T????????_'yyyyMMdd.
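
As a minimal sketch, the example above could be written in the configuration as follows. It assumes that the first date in the file name is unrelated and the second one is the observation date; where the element sits inside the import configuration is determined by the schema.

```xml
<!-- Sketch only: the leading T and the unrelated first date are quoted literals,
     with each digit of the unrelated date replaced by ? -->
<fileNameObservationDateTimePattern>'T????????_'yyyyMMdd</fileNameObservationDateTimePattern>
```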

Currently supported only by import from OpenDAP.

fileNameForecastCreationDateTimePattern

'IMAGE_'yyyyMMdd_HHmmss'.jpg' This will overrule the forecast time stored in the file. Some grid formats don't contain the forecast time at all, so for these files the pattern is required. Put the literal parts of the pattern between single quotes. When the file name also contains non-forecast date times, put these parts between single quotes and use the ? wildcard. All files with the same forecast time belong to the same forecast; the "fileNameObservationDateTimePattern" is not required in that case.

Example: MSPM_<observation date time>_<forecast date time>.asc → 'MSPM_??????????????_'yyyyMMddHHmmss'.asc'

When the filename starts with the date pattern, do not put quotes at the start: yyyyMMdd'_bla.nc'
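
A minimal configuration sketch of the MSPM example above, assuming both date times in the file name use the yyyyMMddHHmmss layout:

```xml
<!-- Sketch only: the observation date time is masked with ? inside the quoted literal,
     and the forecast date time that follows is parsed with yyyyMMddHHmmss -->
<fileNameForecastCreationDateTimePattern>'MSPM_??????????????_'yyyyMMddHHmmss'.asc'</fileNameForecastCreationDateTimePattern>
```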

fileNameEnsembleMemberIndexPattern

Use ? to indicate the position of the ensemble member index in the filename. For example:

<fileNameEnsembleMemberIndexPattern>cosmo_???.dat</fileNameEnsembleMemberIndexPattern>

Literal parts that contain unrelated characters can also be replaced with a dummy character. For example:

<fileNameEnsembleMemberIndexPattern>xxxxxx???.dat</fileNameEnsembleMemberIndexPattern>

This is useful if the filename keeps changing, for example because of the presence of a date in the filename.

fileNameLocationIdPattern

Use the filename, or part of it, as the locationId for the TimeSeries that will be imported, using Regular Expressions. When a match of the pattern in the filename is found, this will overrule the location Id for the timeseries being imported.

examples

| file name | pattern | location id |
|---|---|---|
| PObs-1234.txt | [^-][-](.*).{4} | 1234 |
| H1234.txt | .{1}(.*).{4} | 1234 |
| BAFU-2021-PegelRadarSchacht.csv | (.*)[-][^-]* | BAFU-2021 |
| Pegel-Radar-Schacht-BAFU2021.csv | ([^-]*)\.csv | BAFU2021 |

You can test and build your expression with https://regex101.com

The location id is the "group 1" in the match information. Only one output is allowed.
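
As a minimal sketch, the third pattern from the examples could be configured like this. The file name is taken from the examples; everything before the last hyphen ends up in group 1.

```xml
<!-- Sketch only: for BAFU-2021-PegelRadarSchacht.csv, group 1 captures BAFU-2021,
     which then overrules the location id of the imported time series -->
<fileNameLocationIdPattern>(.*)[-][^-]*</fileNameLocationIdPattern>
```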

fileNameParameterIdPattern

Use the filename, or part of it, as the parameterId for the TimeSeries that will be imported, using Regular Expressions. See description above for fileNameLocationIdPattern.

user

User name, required when importing from protected database connections or protected servers.

password

Password, required when importing from protected database connections or protected servers. Use the hex value (percent-encoding) for special characters, e.g. an @ must be specified as %40.
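
A minimal sketch of the encoding rule. The user name and password are hypothetical; the real password here would be p@ss.

```xml
<!-- Hypothetical credentials: the @ in the password p@ss is written as %40 -->
<user>importUser</user>
<password>p%40ss</password>
```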

relativeViewPeriod

The relative period for which data should be imported. This period is relative to the time 0 of the run. When the start and end times are overrulable, the user can specify the download length with the cold state time and forecast length in the manual forecast dialog. It is also possible to import data for an absolute period of time using the startDateTime and endDateTime elements.

startDateTime

Start date and time of the (absolute) period for which data should be imported. Start is inclusive. This dateTime is in the configured importTimeZone. It is also possible to import data for a relative period of time using the relativeViewPeriod element.

endDateTime

End date and time of the (absolute) period for which data should be imported. End is inclusive. This dateTime is in the configured importTimeZone. It is also possible to import data for a relative period of time using the relativeViewPeriod element.
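
A minimal sketch of an absolute import period. The date and time attribute names and the dates are assumptions for illustration; check them against the schema.

```xml
<!-- Sketch only: import January 2020, both ends inclusive,
     with the times interpreted in the configured importTimeZone.
     The date and time attribute names are an assumption. -->
<startDateTime date="2020-01-01" time="00:00:00"/>
<endDateTime date="2020-01-31" time="00:00:00"/>
```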

onlyGaps 

Since 2022.02. Specifically designed for service imports.

This element specifies that data will be imported only for periods with gaps (missing data enclosed within non-missing data) in the import time series.

First all gaps of the import time series within the period for the import are detected, then the import is run for all periods of the gaps.

A gap in any time series will result in the (re)import for all time series.

Non-equidistant series will not be included for detection of gaps.

Unreliable non-missing values are not counted as gaps.

Missing values at the start and end of a configured period for a series are not counted as gaps.

This element needs to be combined with either a relativeViewPeriod or a startDateTime and endDateTime in the general part of the import config.
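
A minimal sketch of the combination with a relativeViewPeriod. The boolean value of onlyGaps and the 7-day period are assumptions; the order of the elements must follow the schema.

```xml
<!-- Sketch only: look back 7 days, but import only for the periods where gaps are detected.
     The value true for onlyGaps is an assumption. -->
<relativeViewPeriod unit="day" start="-7" end="0"/>
<onlyGaps>true</onlyGaps>
```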

externalForecastTimesSearchRelativePeriod & externalForecastTimesCardinalTimeStep

...

ID of the IdMap used to convert external parameterId's and locationId's to internal parameter and location Id's. Each of the formats specified will have a unique method of identifying the id in the external format. See section on configuration for Mapping Id's units and flags.

moduleInstanceAware

Since 2023.01. If value is set to true, data will only be imported if the module instance id specified in the config file matches the module instance id of the downloaded data.
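
A minimal sketch of switching this option on:

```xml
<!-- Sketch only: import data only when its module instance id matches the one in this config file -->
<moduleInstanceAware>true</moduleInstanceAware>
```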

useStandardName (since stable build 2012.02)

...

Enumeration of supported time zones. See appendix B for list of supported time zones.

gridStartPoint

...

Identification of the cell considered as the first cell of the grid. This option should only be used if the NetCDF file is not CF compliant and contains insufficient metadata about the grid, its orientation, etc.

The first cell may be in the upper left corner or in the lower left corner, for example. The enumeration of options includes:

  • NW : for upper left
  • SW : for lower left
  • NE : for upper right
  • SE : for lower right

gridStartPoint is supported by the import types netcdf-cf_grid, matroos_netcdfmapseries, grib1, grib2 and cemig.
Options NE and SE are supported only by NETCDF-CF_GRID.
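
A minimal sketch for a grid whose first cell is in the upper left corner:

```xml
<!-- Sketch only: NW marks the upper left cell as the first cell of the grid -->
<gridStartPoint>NW</gridStartPoint>
```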

logErrorsAsWarnings and logErrorsAsWarningsToFileOnly

...

Skips the first n lines of an ASCII file (like CSV). An error is logged when this option is configured for a binary file.

validate

Option to allow validation of the import files against the template, i.e. the XML schema. If there is no template available, this option will be ignored.

charset

Since 2017.01 it is possible to explicitly configure the charset for text file imports, like the generalCsv import.

...

  • locationId : Id of the location for which to apply the delay.
  • locationSetId : Id of the location set for which to apply the delay.
  • parameterId : Id for the parameter for which to apply the delay.
  • timeUnit : Specification of time units the delay is defined in.
  • multiplier  : Integer number of time units for the delay.

For example, to specify a delay of 2 hours, enter "hour" as unit and 2 as multiplier.

Note: You cannot use this option for timeseries of type external forecasting because doing so would require that the same delay is also applied to the forecast time.

startTimeShift

Specification of a shift to apply to the start time of a data series to be imported as external forecasting. This is required when the time value of the first data point is not the same as the start time of the forecast. This may be the case, for example, for external precipitation values, where the first value given is the accumulated precipitation for the first time step. The start time of the forecast is then one time unit earlier than the first data point in the series. Multiple entries may exist.

...

For some data formats an external unit is not defined in the file to be imported. The element <externUnit> allows the unit to be specified explicitly. This unit is then used to find the corresponding unit conversions as configured in a UnitConversionsFile (see 02 Unit Conversions).

Code Block
languagexml
<externUnit parameterId="P.nwp.fcst" unit="mm" cumulativeSum="false"/>

Attributes of <externUnit>:

  • parameterId: Id of the parameter for which a unit is specified. This is the internal parameter Id.
  • unit: specification of unit. This unit must be available in the UnitConversions specified in the unitConversionsId element.
  • cumulativeSum: if this option is set to "true", the parameter values can be received as cumulative sums. Cumulative sums are partial sums of a given sequence of numbers. For example, if the sequence is {a, b, c, d, ...}, then the cumulative sums are a, a+b, a+b+c, a+b+c+d, ... The import module will then calculate the values for the individual time steps {a, b, c, d, ...} by subtracting the previous cumulative sum from the current cumulative sum.
  • cumulativeMean: similar to cumulativeSum; if cumulativeMean is set to "true", the parameter values can be received as cumulative means. Cumulative means are partial means of a given sequence of numbers. For a given sequence of numbers $\{x_1, x_2, \ldots, x_n, x_{n+1}\}$, cumulative means are calculated as follows: $CM(x_{n+1}) = \frac{x_{n+1} + n \cdot CM(x_n)}{n+1}$. The import module will then calculate the value for an individual time step by multiplying the current cumulative mean by the number of time steps processed so far and subtracting the previous cumulative mean multiplied by the number of time steps processed minus 1.
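
As a worked illustration of the two attributes above (the numbers are invented for this example), let $S_n$ be the received cumulative sum and $CM_n$ the received cumulative mean after $n$ time steps, with $S_0 = 0$. The individual values $x_n$ could then be recovered as:

```latex
\begin{align*}
x_n &= S_n - S_{n-1}
  &&\text{e.g. } S = (2, 5, 9) \;\Rightarrow\; x = (2,\; 5-2,\; 9-5) = (2, 3, 4) \\
x_n &= n \, CM_n - (n-1) \, CM_{n-1}
  &&\text{e.g. } CM = (2, 3, 4) \;\Rightarrow\; x = (1 \cdot 2,\; 2 \cdot 3 - 1 \cdot 2,\; 3 \cdot 4 - 2 \cdot 3) = (2, 4, 6)
\end{align*}
```

The second line is the cumulative mean formula solved for the newest value: the means of (2, 4, 6) are indeed 2, 3 and 4.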

...