Original question

I'm attempting to merge three precipitation grids that have non-overlapping periods of record. The grids do not share the same spatial definition and I'd like their data averaged over a subwatershed polygon LocationSet. 

My workflow is to interpolate the grids to separate temporary timeSeriesSets (all defined with the same subwatershed LocationSet), and then merge these temporary timeSeriesSets into a single continuous timeSeriesSet written to the DB. Running this gives me the warning message "Existing value overwritten by missing", and the only data that appears is from the time series with the highest merge priority.

Should this be occurring? I'm expecting the merge (simple) not to overwrite data with "missings".

Are "missing" values explicitly assigned in the database? If so, can they be removed such that they do not overwrite data during the merge process? Or, is there a way to ignore "missing" during the merge process?

Implemented solution

Interpolate from grid to subbasin:

<transformation id="tranform_grid_basin_1">
    <interpolationSpatial>
        <average>
            <inputVariable>
                <variableId>inputGrid_1</variableId>
            </inputVariable>
            <outputVariable>
                <variableId>temporary_basin_LocationSet_1</variableId>
            </outputVariable>
        </average>
    </interpolationSpatial>
</transformation>
<transformation id="tranform_grid_basin_2">
    <interpolationSpatial>
        <average>
            <inputVariable>
                <variableId>inputGrid_2</variableId>
            </inputVariable>
            <outputVariable>
                <variableId>temporary_basin_LocationSet_2</variableId>
            </outputVariable>
        </average>
    </interpolationSpatial>
</transformation>
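
For readers unfamiliar with the spatial step, the idea behind averaging a grid over a subbasin can be sketched in plain Python. This is only a conceptual illustration, not how Delft-FEWS implements interpolationSpatial/average. The function name `basin_average` and the boolean `mask` (True where a grid cell is taken to belong to the subbasin polygon) are assumptions made up for this sketch, and missing cells are represented as `None`.

```python
from typing import List, Optional


def basin_average(grid: List[List[Optional[float]]],
                  mask: List[List[bool]]) -> Optional[float]:
    """Average the non-missing grid cells flagged as inside the basin.

    Returns None (missing) when no valid cell lies inside the basin.
    """
    total = 0.0
    count = 0
    for grid_row, mask_row in zip(grid, mask):
        for value, inside in zip(grid_row, mask_row):
            # Skip cells outside the basin and cells without data
            if inside and value is not None:
                total += value
                count += 1
    return total / count if count else None


# Hypothetical 2 x 3 precipitation grid for one time step (None = missing)
grid = [[1.0, 2.0, 3.0],
        [4.0, None, 6.0]]
# Hypothetical mask: the first two columns fall inside the subbasin
mask = [[True, True, False],
        [True, True, False]]

average = basin_average(grid, mask)
print(average)
```

Note that a time step for which a grid has no data at all yields a missing basin value in this sketch, which is how the temporary series could end up with "missing" outside each grid's period of record.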

Subsequently merge:

<transformation id="basin_merge">
    <merge>
        <simple>            
            <inputVariable>
                <variableId>temporary_basin_LocationSet_1</variableId>
            </inputVariable>
            <inputVariable>
                <variableId>temporary_basin_LocationSet_2</variableId>
            </inputVariable>
            <outputVariable>
                <variableId>ouput_basins</variableId>
            </outputVariable>
        </simple>
    </merge>
</transformation>

This method works, but with a caveat: the log reports "some existing values overwritten with missings", and there are multiple "WARN - Existing value overwritten by missing by..." warnings throughout.

Open question

I still wonder why an existing value would be overwritten by "missing" during a merge. Is it wrong to consider a Merge transformation a means to create a continuous scalar time series from multiple disparate sources?
For the moment, I have a rather tedious workaround: I built the series incrementally by setting T0 to the end of the last dataset import, manually changing the XML file, running the transformation, and repeating. I'd like to know how to automate this.
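
To make the expectation behind this question concrete, here is a minimal Python sketch of the merge behaviour I had in mind: for every time step, take the first non-missing value in priority order, so that a missing value never replaces real data. This is not the Delft-FEWS implementation. The function name `merge_by_priority`, the sample values, and the use of `None` for "missing" are assumptions for illustration only.

```python
from typing import List, Optional, Sequence

Series = Sequence[Optional[float]]  # None marks a missing value


def merge_by_priority(series_in_priority_order: Sequence[Series]) -> List[Optional[float]]:
    """For each time step, return the first non-missing value in priority order."""
    lengths = {len(s) for s in series_in_priority_order}
    if len(lengths) != 1:
        raise ValueError("all series must share the same time axis")
    merged: List[Optional[float]] = []
    for values_at_step in zip(*series_in_priority_order):
        # A real value of 0.0 must be kept, so test against None explicitly
        merged.append(next((v for v in values_at_step if v is not None), None))
    return merged


# Three hypothetical basin-average series with non-overlapping periods of record
basin_1 = [1.0, 2.0, None, None, None, None]
basin_2 = [None, None, 0.5, 0.0, None, None]
basin_3 = [None, None, None, None, 3.2, 1.1]

merged = merge_by_priority([basin_1, basin_2, basin_3])
print(merged)
```

With non-overlapping periods of record, the result under these assumptions is one continuous series, regardless of which input has the highest priority.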

See: 02 Data Handling in Delft-FEWS, section "External Historical timeseries"

