Configuration harvest tasks

In releases prior to FEWS 202401, it was recommended to schedule a full harvest task to run at least once a day. This task scanned the entire content of the archive data folder and added any data sets that had not yet been harvested to the catalogue.

In addition, the full harvest task checked whether the catalogue contained data sets that were no longer present in the archive. After a full harvest run, the catalogue was always up to date.

The main downside of this approach was that the full harvest task took a long time to complete, because it scanned the entire archive on every run.
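
Conceptually, a full harvest is a reconciliation between the archive content and the catalogue. The sketch below only illustrates that idea and is not the FEWS implementation: the function name `full_harvest` and the use of plain sets of data set identifiers are assumptions made for this example, as is the removal of catalogue entries that are no longer in the archive.

```python
def full_harvest(archive_datasets, catalogue):
    """Reconcile a catalogue (a set, updated in place) with the archive content.

    Hypothetical illustration: data sets are represented by identifiers only.
    Returns the identifiers that were added and those that were removed.
    """
    archive = set(archive_datasets)

    # Data sets present in the archive but not yet harvested.
    added = archive - catalogue
    # Catalogue entries whose data set is no longer in the archive
    # (assumed here to be dropped from the catalogue).
    removed = catalogue - archive

    catalogue |= added
    catalogue -= removed
    return added, removed
```

Because every data set in the archive has to be visited, the cost of such a run grows with the total size of the archive, not with the amount of new data.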

From release FEWS 202401 onwards, the harvest process has been improved in the following ways:

1) All data exports to the archive now also inform the archive which data sets were added. The archive immediately harvests these newly added data sets so that they become directly available in the catalogue.

2) Archive tasks which add data to or remove data from the archive, such as the archive amalgamate task or the data removal task, also make sure that the catalogue is updated immediately after they have finished.

It is therefore no longer necessary to run a full harvest task on a daily basis. You only need to run the full harvest task after you have upgraded to a new FEWS version. It is also recommended to run the full harvest task when you have deleted data from the archive manually or with your own scripts.

The following harvest tasks are recommended for your configuration:

1) Full harvest task. Make sure that this task is not scheduled. It should only be run manually.

2) Immediate harvest task. This task is run when new data is exported to the archive or when data is removed from the archive.

3) Incremental harvest task. This task should be scheduled to run every 10 minutes. It harvests the last 7 days in the archive. Normally, new data is harvested automatically by the immediate harvester. However, if the immediate harvest was not available for any reason, the incremental harvester makes sure that data sets which were not harvested immediately after the export are still harvested.
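
The safety-net role of the incremental harvester can be sketched as follows. This is a minimal illustration and not the FEWS implementation: the function name `incremental_harvest`, the mapping from data set identifier to the time it was written to the archive, and the choice to apply the 7-day window to that write time are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def incremental_harvest(archive_datasets, catalogue, now, window=timedelta(days=7)):
    """Harvest recent data sets that the immediate harvest missed.

    archive_datasets: dict mapping a data set identifier to the time it was
    written to the archive (hypothetical representation).
    catalogue: set of harvested identifiers, updated in place.
    """
    cutoff = now - window

    # Only data sets inside the window are inspected; older ones are skipped.
    missed = {
        dataset
        for dataset, written in archive_datasets.items()
        if written >= cutoff and dataset not in catalogue
    }

    catalogue |= missed
    return missed
```

Because only a limited window is inspected, such a task stays cheap enough to run every 10 minutes. Data sets older than the window that were never harvested are only picked up by a manual full harvest.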
