In releases prior to Delft-FEWS 2024.01 it was recommended to schedule a full harvest task to run at least once a day. This task scanned the entire content of the archive data folder and added data sets to the catalogue that had not been harvested yet.

...

The main downside of this approach was that this task would take a long time to complete.


From release Delft-FEWS 2024.01 onwards, the harvest process is improved in the following ways:

...

It is therefore no longer necessary to run a full harvest task on a daily basis. You only need to run the harvest task after you have upgraded to a new Delft-FEWS version (run clear catalogue first!).
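
The upgrade procedure above can be sketched with two manual tasks. This is a minimal fragment, assuming the predefined task names used in the configuration example on this page; the description texts are illustrative, and the order (clear first, then full harvest) has to be followed by hand when starting the tasks.

```xml
<!-- Sketch only: fragment to place inside <archiveTasksSchedule>. -->
<!-- Step 1 after an upgrade: empty the catalogue. -->
<manualArchiveTask>
	<predefinedArchiveTask>clear internal catalogue</predefinedArchiveTask>
	<description>Clear internal catalogue (run first after an upgrade)</description>
</manualArchiveTask>
<!-- Step 2: rebuild the catalogue with a full harvest (may take a long time). -->
<manualArchiveTask>
	<predefinedArchiveTask>harvester internal catalogue</predefinedArchiveTask>
	<description>Full harvest task (run after clearing the catalogue)</description>
</manualArchiveTask>
```

Keeping both tasks manual rather than scheduled matches the recommendation that a full harvest is only needed after an upgrade.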

...

2) Delete obsolete data from catalogue task. This task checks whether the catalogue contains data that has already been removed from the file system.

3) Immediate harvest task. This task runs when new data is exported to the archive or when data is removed from the archive. It is scheduled every minute. If there is newly added or deleted data, the catalogue will be updated.

4) Incremental harvest task. This harvest task should be scheduled every hour. By default, this task harvests the last 7 days in the archive, but it is possible to configure a different period, e.g. <harvesterTimeSpan unit="day" multiplier="10"/>. Normally new data is harvested automatically by the immediate harvester. However, when the immediate harvester was not available for any reason, the incremental harvester makes sure that the data sets are harvested.

...

<archiveTasksSchedule xsi:schemaLocation="http://www.wldelft.nl/fews/archive http://fews.wldelft.nl/schemas//version1.0/archive-schemas/archiveTasksSchedule.xsd" xmlns="http://www.wldelft.nl/fews/archive" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<scheduledArchiveTask>
		<predefinedArchiveTask>incremental harvester internal catalogue</predefinedArchiveTask>
		<description>Incremental harvester (harvest the last 7 days)</description>
		<startTime>00:00:00</startTime>
		<endTime>23:59:00</endTime>
		<runInterval>1</runInterval>
		<active>true</active>
	</scheduledArchiveTask>
	<scheduledArchiveTask>
		<predefinedArchiveTask>immediate harvester task</predefinedArchiveTask>
		<description>Immediate harvester (harvest the newly added or deleted data sets immediately)</description>
		<startTime>00:00:00</startTime>
		<endTime>23:59:00</endTime>
		<runIntervalInSeconds>60</runIntervalInSeconds>
		<active>true</active>
	</scheduledArchiveTask>
	<scheduledArchiveTask>
		<predefinedArchiveTask>file sweeper</predefinedArchiveTask>
		<description>File sweeper</description>
		<startTime>00:00:00</startTime>
		<endTime>23:59:00</endTime>
		<runInterval>1</runInterval>
		<active>true</active>
	</scheduledArchiveTask>
	<!--<scheduledArchiveTask>
		<predefinedArchiveTask>historical events exporter</predefinedArchiveTask>
		<description>historical events exporter restarted</description>
		<startTime>00:00:00</startTime>
		<endTime>23:00:00</endTime>
		<runInterval>1</runInterval>
		<active>true</active>
	</scheduledArchiveTask>-->
	<manualArchiveTask>
		<predefinedArchiveTask>harvester internal catalogue</predefinedArchiveTask>
		<description>Full harvest task (this task may take a long time to complete!)</description>
	</manualArchiveTask>
	<manualArchiveTask>
		<predefinedArchiveTask>clear internal catalogue</predefinedArchiveTask>
		<description>Clear internal catalogue</description>
	</manualArchiveTask>
	<manualArchiveTask>
		<predefinedArchiveTask>remove obsolete data from catalogue</predefinedArchiveTask>
		<description>Remove obsolete records from catalogue</description>
	</manualArchiveTask>
	<manualArchiveTask>
		<predefinedArchiveTask>remove netcdf cache files</predefinedArchiveTask>
		<description>Remove netcdf cache files</description>
	</manualArchiveTask>
	<manualArchiveTask>
		<predefinedArchiveTask>data management tool</predefinedArchiveTask>
		<description>Data management tool</description>
	</manualArchiveTask>
	<manualArchiveTask>
		<predefinedArchiveTask>remove data from archive</predefinedArchiveTask>
		<description>Remove expired data from the archive</description>
	</manualArchiveTask>
</archiveTasksSchedule>