...

The harvester keeps the datasets on disk and the catalogue in sync. If, for example, a new dataset is added to the archive, the harvester detects it and creates a metadata file in GeoNetwork for this dataset. The record id of the dataset in the catalogue is stored with the dataset on disk as an empty file named xxxxx.recordid, where the xxxxx part of the filename is the record id in the catalogue. When an administrator removes a dataset, the recordid file must be left in place. The harvester then detects that the dataset has been removed and deletes the record from the catalogue. It also removes the recordid file from disk and cleans up the empty directory.
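
The synchronisation rules above can be sketched in a few lines of Python. This is a minimal illustration, not the harvester's actual implementation: the function name sync_dataset and the two callbacks create_record and delete_record are hypothetical stand-ins for the calls to the catalogue, and the sketch assumes that a dataset counts as removed when its directory contains nothing but the recordid file.

```python
from pathlib import Path
from typing import Callable

RECORD_SUFFIX = ".recordid"


def sync_dataset(
    dataset_dir: Path,
    create_record: Callable[[Path], str],   # hypothetical: registers the dataset, returns its record id
    delete_record: Callable[[str], None],   # hypothetical: deletes a record from the catalogue
) -> str:
    """Bring one dataset directory and the catalogue back in sync."""
    entries = list(dataset_dir.iterdir())
    markers = [p for p in entries if p.is_file() and p.suffix == RECORD_SUFFIX]
    data = [p for p in entries if p not in markers]

    if data and not markers:
        # New dataset: create the catalogue record and remember its id
        # on disk as an empty "<record id>.recordid" file.
        record_id = create_record(dataset_dir)
        (dataset_dir / f"{record_id}{RECORD_SUFFIX}").touch()
        return "created"

    if markers and not data:
        # Assumption: only the marker is left, so the dataset was removed.
        # Delete the record, then the marker, then the empty directory.
        for marker in markers:
            delete_record(marker.stem)
            marker.unlink()
        dataset_dir.rmdir()
        return "removed"

    return "unchanged"
```

Keeping the record id in the filename means the directory itself carries the link to the catalogue, which is why the recordid file has to survive the removal of the data: without it the harvester in this sketch would have no id to delete.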

File sweeper

In some cases data files are exported to the archive twice, because the system feeding the archive has new data that supersedes the previous data. In these cases the existing data files are overwritten by the new files. If a file that should be overwritten is temporarily locked by the archive for reading, the system feeding the archive should store the new file beside the existing file, with the same name but with the extension new. The file sweeper detects these files and replaces the original data file with the new one the next time it runs after the lock on the data file has been released.
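
The replacement step can be illustrated with a short Python sketch. It rests on assumptions that the text does not confirm: the new file is taken to be the original filename with .new appended (for example data.nc.new beside data.nc), and the lock check is passed in as a hypothetical is_locked callback because the archive's locking mechanism is not described here.

```python
import os
from pathlib import Path
from typing import Callable, List

NEW_SUFFIX = ".new"  # assumption: "data.nc.new" is the pending replacement for "data.nc"


def sweep(archive_root: Path, is_locked: Callable[[Path], bool]) -> List[Path]:
    """Replace every unlocked data file that has a pending .new file beside it."""
    replaced = []
    for new_file in sorted(archive_root.rglob(f"*{NEW_SUFFIX}")):
        if not new_file.is_file() or new_file.name == NEW_SUFFIX:
            continue
        original = new_file.with_name(new_file.name[: -len(NEW_SUFFIX)])
        if is_locked(original):
            # Still being read by the archive: leave it for the next run.
            continue
        # os.replace overwrites the destination in a single rename.
        os.replace(new_file, original)
        replaced.append(original)
    return replaced
```

Files that are still locked are simply skipped, so running the sweep repeatedly picks them up once the lock is gone.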

...