Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migration of unmigrated content due to installation of a new plugin

...

Principal components analysis (PCA) is used when a data set contains many time series (dimensions), and the dimensions need to be reduced, while retaining the most significant relationships between the time series. Reducing the number of dimensions reduces the data set size and removes unrelated variability. The implementation in FEWS allows you to derive a relation between independent timeseries (e.g. observations) and a dependent timeseries (e.g. a simulated timeseries).

Using historical timeseries, the The PCA regression transformation produces a linear regression equation and a root mean square error (RMSE). The PCA linear regression equation produces This equation is applied with the current observations to compute an estimate of a parameter, given some combination of the input time series. The RMSE is a measure of dispersion around the regression linethe (simulated) parameter, as well as its associated RMSE.

The PCA regression transformation was developed to update basin snow models when they drift away from realistic output. Snow updating uses historic and current snow water equivalence (SWE). Historic and current data come from monitoring stations within or near the basin, and from simulations of SWE in the basins. Historic observed data are used for PCA for a basin, and can potentially include time series from many monitoring stations. Current data are current daily SWE values, and are also either simulated (modelled basin) or observed (from monitoring stations within or near the basin). PCA finds the strongest underlying relationships between the historic observed station time series, and produces a linear equation. Current SWE values can then be input into the equation, and a PCA estimate of current basin SWE is produced. 

Input/output timeseries

In this function four nonequidistant input time series must be identified:
1. historicalObserved
2. historicalSimulated
3. currentObserved
4. currentSimulated
In the snow updating use of the PCA regression transformation, these time series are subsamples of a daily time series to produce one data point per month. See the dayMonth sample Wiki entry for page for more details.

In this function two output time series must be identified:
1. A time series with the PCA-estimated parameter value calculated by the algorithm
2. A time series with the associated RMSE calculated by the algorithm

...

The BPA FEWS gap handling technique uses all of the available data, resulting in a longer dataset (gray highlighting).

Table 1: A graphical demonstration of the BPA FEWS gap handling technique

Image Added

b) PCA transformation
i) Data Preprocessing
Before PCA calculations take place, the data set may need to be normalized and standardized (ex. where two datasets have very different means, standard deviations, or are not normally distributed) . However, no one type of preprocessing is appropriate for all time series. Therefore, to automatically assess which preprocessing type produces the 'best' results, the FEWS PCA algorithm performs a variety of preprocessing techniques. The user is presented with the result with the lowest RMSE.

...

Code Block
<!--this is an example of a PCA transformation module configuration-->

<?xml version="1.0" encoding="UTF-8"?>
<transformationModule xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/transformationModule.xsd" version="1.0">

<!--declare your input time series-->
	<variable>
		<variableId>Obs_hist</variableId>
		<timeSeriesSet>
			<moduleInstanceId>DayMonthSampleSNWE_SnowAnalysis</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>SNWE</parameterId>
			<locationId>2A16</locationId>
			<locationId>2A18</locationId>
			<locationId>2A21</locationId>
			<locationId>2A22</locationId>
			<locationId>2A23</locationId>
			<locationId>2A25</locationId>
			<timeSeriesType>simulated forecasting</timeSeriesType>
			<timeStep unit="nonequidistant"/>
			<relativeViewPeriod unit="hour" start="-240" startOverrulable="true" end="0"/>
			<readWriteMode>read only</readWriteMode>
		</timeSeriesSet>
	</variable>
	<variable>
		<variableId>Sim_hist</variableId>
		<timeSeriesSet>
			<moduleInstanceId>DayMonthSampleSWE_SnowAnalysis</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>SWE</parameterId>
			<locationId>MCDQ2IL</locationId>
			<timeSeriesType>simulated forecasting</timeSeriesType>
			<timeStep unit="nonequidistant"/>
			<relativeViewPeriod unit="hour" start="-240" startOverrulable="true" end="0"/>
			<readWriteMode>read only</readWriteMode>
		</timeSeriesSet>
	</variable>
	<variable>
		<variableId>Obs_current</variableId>
		<timeSeriesSet>
			<moduleInstanceId>DayMonthSampleSNWE_SnowAnalysis</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>SNWE</parameterId>
			<locationId>2A16</locationId>
			<locationId>2A18</locationId>
			<locationId>2A21</locationId>
			<locationId>2A22</locationId>
			<locationId>2A23</locationId>
			<locationId>2A25</locationId>
			<timeSeriesType>simulated forecasting</timeSeriesType>
			<timeStep unit="nonequidistant"/>
			<relativeViewPeriod unit="day" start="-2" end="2"/>
			<readWriteMode>read only</readWriteMode>
		</timeSeriesSet>
	</variable>
	<variable>
		<variableId>Current_sim</variableId>
		<timeSeriesSet>
			<moduleInstanceId>DayMonthSampleSWE_SnowAnalysis</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>SWE</parameterId>
			<locationId>MCDQ2IL</locationId>
			<timeSeriesType>simulated forecasting</timeSeriesType>
			<timeStep unit="nonequidistant"/>
			<relativeViewPeriod unit="day" start="-2" end="2"/>
			<readWriteMode>read only</readWriteMode>
		</timeSeriesSet>
	</variable>

<!--declare your output time series-->
		<variable>
		<variableId>PCA_swe</variableId>
		<timeSeriesSet>
			<moduleInstanceId>PCA_MCDQ2IL_RMSE_SnowAnalysis</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>SWE</parameterId>
			<qualifierId>pca</qualifierId>
			<locationId>MCDQ2IL</locationId>
			<timeSeriesType>simulated forecasting</timeSeriesType>
			<timeStep unit="day" multiplier="1"/>
			<relativeViewPeriod unit="day" start="-2" end="2"/>
			<readWriteMode>add originals</readWriteMode>
			<ensembleId>main</ensembleId>
		</timeSeriesSet>
	</variable>
	<variable>
		<variableId>PCA_rmse</variableId>
		<timeSeriesSet>
			<moduleInstanceId>PCA_MCDQ2IL_RMSE_SnowAnalysis</moduleInstanceId>
			<valueType>scalar</valueType>
			<parameterId>SWE</parameterId>
			<qualifierId>rmse</qualifierId>
			<locationId>MCDQ2IL</locationId>
			<timeSeriesType>simulated forecasting</timeSeriesType>
			<timeStep unit="day" multiplier="1"/>
			<relativeViewPeriod unit="day" start="-2" end="2"/>
			<readWriteMode>add originals</readWriteMode>
			<ensembleId>main</ensembleId>
		</timeSeriesSet>
	</variable>
	<!--perform the PCA calculation-->
	<transformation id="PCA">

		<regression>
			<principalComponentAnalysis>
				<historicalObserved>
					<variableId>Obs_hist</variableId>
				</historicalObserved>
				<historicalSimulated>
					<variableId>Sim_hist</variableId>
				</historicalSimulated>
				<currentObserved>
					<variableId>Obs_current</variableId>
				</currentObserved>
				<currentSimulated>
					<variableId>Current_sim</variableId>
				</currentSimulated>
				<enableCombinationAnalysis>true</enableCombinationAnalysis>
				<estimatedCurrentSimulated>
					<variableId>PCA_swe</variableId>
				</estimatedCurrentSimulated>
				<errorStatistics>
					<variableId>PCA_rmse</variableId>
				</errorStatistics>
			</principalComponentAnalysis>
		</regression>
	</transformation>
	
</transformationModule>

Config example: TimeSeriesDisplayConfig.xml
The following sample describes how the PCA snow updataing display is configured.
Note the use of the dayMonthSample function (DayMonth Sample)

Code Block
<sampleFunctions>
		<dayMonthSampleFunction/>
</sampleFunctions>
<statisticalFunctions>
	<statisticalFunction function="principalcomponentanalysisrmeprincipalcomponentanalysis">
		<observedParameterId>SNWE</observedParameterId>
		<simulatedParameterId>SWE</simulatedParameterId>
	</statisticalFunction>
</statisticalFunctions>

...