
Overview

Imports time series data from files in CSV format with one header line containing the column headers of the time series:

  • The first line contains the column names (fields) of the CSV file; this line is used to determine the field separator and the names of the data columns.
  • All other lines contain the date-time field and the values for each time series.
  • Values between -1000.0 and -999.0 (inclusive) are regarded as missing values.

The CSV files can be supplied in a ZIP file.
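The conventions above can be sketched in a few lines of Python. This is an illustration only, not the actual FEWS parser; it follows the separator detection and missing-value rules stated above:

```python
import csv
import io
from datetime import datetime

CSV_TEXT = """Time,Waterstand,Pomp-1 Born
04-05-2011 03:24,0.000000,-0.450000
04-05-2011 03:44,-999.500000,-0.450000
"""

def parse_general_csv(text):
    # The field separator is derived from the header line:
    # a comma if present, otherwise a semicolon.
    header = text.splitlines()[0]
    sep = "," if "," in header else ";"
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    columns = next(reader)
    rows = []
    for record in reader:
        stamp = datetime.strptime(record[0], "%d-%m-%Y %H:%M")
        values = []
        for field in record[1:]:
            v = float(field)
            # Values between -1000.0 and -999.0 (inclusive) are missing.
            values.append(None if -1000.0 <= v <= -999.0 else v)
        rows.append((stamp, values))
    return columns, rows

columns, rows = parse_general_csv(CSV_TEXT)
```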

Import type

The import type is generalCSV. There is no particular file extension required.

Example

Here is a simple example:

Time,Waterstand,Pomp-1 Born
04-05-2011 03:24,0.000000,-0.450000
04-05-2011 03:44,0.000000,-0.450000
04-05-2011 03:54,0.000000,-0.440000
.....

For the configuration of the table layout, see Table Layout.

<?xml version="1.0" encoding="UTF-8"?>
<timeSeriesImportRun xmlns="http://www.wldelft.nl/fews" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.wldelft.nl/fews http://fews.wldelft.nl/schemas/version1.0/timeSeriesImportRun.xsd">
	<import>
		<general>
			<importType>generalCSV</importType>
			<folder>$IMPORT_FOLDER$/OBS</folder>
			<failedFolder>$IMPORT_FAILED_FOLDER$</failedFolder>
			<backupFolder>$IMPORT_BACKUP_FOLDER$/OBS</backupFolder>

			<table>
				<dateTimeColumn name="Time" pattern="dd-MM-yyyy HH:mm"/>
				<valueColumn unit="m" locationId="Bosscheveld" parameterId="H.meting" name="Waterstand"/>
				<valueColumn unit="min" locationId="Bosscheveld" parameterId="DT.meting" name="Pomp-1 Born"/>
			</table>
			<idMapId>IdImportOBS</idMapId>
			<unitConversionsId>ImportUnitConversions</unitConversionsId>
			<importTimeZone>
				<timeZoneOffset>+00:00</timeZoneOffset>
			</importTimeZone>
		</general>
	</import>
</timeSeriesImportRun>

Another example, skipFirstLinesCount

Example to read from CSV files where, unfortunately, the first line contains a key instead of the column headers.
The CSV files look like:

[DATA]
Tagname,TimeStamp,Value,DataQuality
WDD.MEM_BER0001_01_LT01_MW,2010-04-01 12:21:00,-0.000,GOOD
WDD.MEM_BER0001_01_LT01_MW,2011-01-12 10:34:05,-0.001,GOOD
WDD.MEM_BER0001_01_LT01_MW,2011-01-12 10:35:00,-0.011,BAD
WDD.MEM_BER0001_01_LT01_MW,2011-01-12 10:36:00,-0.003,GOOD
WDD.MEM_BER0001_01_LT01_MW,2011-01-12 10:37:00,-0.000,GOOD
WDD.MEM_BER0001_01_LT01_MW,2011-01-12 10:38:00,-0.000,GOOD
WDD.MEM_BER0001_01_LT01_MW,2011-01-12 10:39:00,-0.000,GOOD

In this example the first line should be skipped, so skipFirstLinesCount = 1.

<general>
	<importType>generalCSV</importType>
	<folder>$IMPORT_FOLDER_KETEN$</folder>
	<fileNamePatternFilter>*.csv</fileNamePatternFilter>
	<failedFolder>$IMPORT_FAILED_FOLDER_KETEN$</failedFolder>
	<backupFolder>$IMPORT_BACKUP_FOLDER_KETEN$</backupFolder>
	<table>
		<dateTimeColumn name="TimeStamp" pattern="yyyy-MM-dd HH:mm:ss"/>
		<locationColumn name="Tagname"/>
		<flagColumn name="DataQuality"/>
		<valueColumn name="Value" unit="SI"/>
	</table>
	<idMapId>IdKETEN</idMapId>
	<flagConversionsId>ImportKETENFlagConversions</flagConversionsId>
	<importTimeZone>
		<timeZoneOffset>+01:00</timeZoneOffset>
	</importTimeZone>
	<dataFeedId>CSV files</dataFeedId>
	<skipFirstLinesCount>1</skipFirstLinesCount>
</general>

Another example, no header at all (since 2012.02)

The CSV files look like:

WDD.MEM_BER0001_01_LT01_MW,A,2010-04-01 12:21:00,-0.000,GOOD
WDD.MEM_BER0001_01_LT01_MW,A,2011-01-12 10:34:05,-0.001,GOOD
WDD.MEM_BER0001_01_LT01_MW,A,2011-01-12 10:35:00,-0.011,BAD
WDD.MEM_BER0001_01_LT01_MW,A,2011-01-12 10:36:00,-0.003,GOOD
WDD.MEM_BER0001_01_LT01_MW,A,2011-01-12 10:37:00,-0.000,GOOD
WDD.MEM_BER0001_01_LT01_MW,A,2011-01-12 10:38:00,-0.000,GOOD
WDD.MEM_BER0001_01_LT01_MW,A,2011-01-12 10:39:00,-0.000,GOOD
<general>
	<importType>generalCSV</importType>
	<folder>$IMPORT_FOLDER_KETEN$</folder>
	<fileNamePatternFilter>*.csv</fileNamePatternFilter>
	<failedFolder>$IMPORT_FAILED_FOLDER_KETEN$</failedFolder>
	<backupFolder>$IMPORT_BACKUP_FOLDER_KETEN$</backupFolder>
	<table>
		<locationColumn/>
		<skippedColumn/>
		<dateTimeColumn pattern="yyyy-MM-dd HH:mm:ss"/>
		<valueColumn unit="SI"/>
		<flagColumn/>
	</table>
	<idMapId>IdKETEN</idMapId>
	<flagConversionsId>ImportKETENFlagConversions</flagConversionsId>
	<importTimeZone>
		<timeZoneOffset>+01:00</timeZoneOffset>
	</importTimeZone>
	<dataFeedId>CSV files</dataFeedId>
</general>


Importing comments only (since 2014.02)

For all imports a value column is required, with the exception of importing comments. If the importer is configured without a valueColumn and a commentColumn is specified, the values will be set to missing and the comments will be imported.

The CSV files look like:

WLOCATION;DATE;TIME;VALUE;TYPE
peilschaal_1;16-12-2014;04:01:22;29.20;Value
peilschaal AFW-2096M;17-12-2014;06:22:31;27.20;Waterstand
schorfkoelerbeek peilschaal_1;17-12-2014;06:24:17;27.20;Waterstand
schorfkoelerbeek peilschaal_1;17-12-2014;06:24:17;gemaaid;Opmerking

In this example the value column is marked as a commentColumn. There is no valueColumn configured for this import.

<import>
		<general>
			<importType>generalCSV</importType>
			<folder>%REGION_HOME%/Import/testImport</folder>
			<fileNamePatternFilter>*.csv</fileNamePatternFilter>
			<failedFolder>%REGION_HOME%/ImportFailed/testImport</failedFolder>
			<backupFolder>%REGION_HOME%/ImportBackup/testImport</backupFolder>
			<table>
				<locationColumn name="LOCATION"/>
				<dateColumn name="DATE" pattern="dd-MM-yyyy"/>
				<timeColumn name="TIME" pattern="HH:mm:ss"/>
				<commentColumn name="VALUE"/>
				<parameterColumn name="TYPE"/>
			</table>
....

To import only comments into FEWS, the time series that contain the comments have to be configured after the general section. Suppose that in the "TYPE" column of the CSV file all comments are marked as "COMMENT"; then the parameter "COMMENT" can be used to import the comment lines. All comments will be stored in FEWS and the values will be set to missing.

Samples Import

For the import of ecological data based on samples, the general CSV import is used extensively because there is a lot of information per time series.

Besides columns for location, parameter, date/time, value and unit, there is a column for sample id and multiple columns for qualifiers and sample properties.

The CSV files have just a single time step of a single time series per line, for example:

SMP_CODE;SMP_NAME;COORD_X_S;COORD_Y_S;ANAL_CODE;EMP_NR_S;PROD_CODE;METH_SAMP;METH_ANAL;EMP_NR_A;LOC_CODE;DATE_SMP;PAR_REF;waarde;eenheid;BRON;parameter_id;type
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Tubifex tubifex;2;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Tubificidae;2;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Hypania invalida;5;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Gammarus tigrinus;10;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Dicrotendipes pulsus;3;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Microtendipes chloris agg.;3;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Polypedilum;5;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Procladius;1;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Pseudochironomus prasinatus;1;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Stictochironomus;1;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Ceratopogonidae;3;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Caenis horaria;1;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Ecnomus tenellus;1;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Corbicula fluminea;6;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Dreissena polymorpha;268;n;MACEV2010;AANTL_MEA;gemeten
3110040601;Spiegelplas 4;133054;475032;MEA;WP_CAS_ZUY;OW;Van Veenhapper;SPV7030;WP_CAS_ZUY;SBP130;06-04-10 00:00;Potamopyrgus antipodarum;1;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Piscicola;1;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Tubificidae;1;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Hygrobates longipalpis;1;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Piona;1;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Chelicorophium curvispinum;23;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Gammarus tigrinus;22;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Asellus aquaticus;3;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Limnomysis benedeni;29;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Ablabesmyia;1;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Chironomus;5;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Chironomus riparius agg.;9;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Clinotanypus nervosus;4;n;MACEV2010;AANTL_MEA;gemeten
3110040602;Spiegelplas 1;132546;476323;MEA;WP_CAS_ZUY;OW;macrofaunanet;SPV7030;WP_CAS_ZUY;SBP175;06-04-10 00:00;Corynoneura scutellata agg.;2;n;MACEV2010;AANTL_MEA;gemeten

With corresponding import xml:

		<general>
			<importType>generalCSV</importType>
			<folder>$IMPORT_FOLDER$/HydroBiologie</folder>
			<table>
				<dateTimeColumn name="DATE_SMP" pattern="dd-MM-yyyy HH:mm"/>
				<locationColumn name="LOC_CODE"/>
				<unitColumn name="Eenheid"/>
				<parameterColumn name="PARAMETER_ID"/>
				<qualifierColumn name="PAR_REF" prefix="PAR_REF_"/>
				<qualifierColumn name="PROD_CODE" prefix="PROD_CODE_"/>
				<qualifierColumn name="ANAL_CODE" prefix="ANAL_CODE_"/>
				<qualifierColumn name="METH_ANAL" prefix="METH_ANAL_"/>
				<qualifierColumn name="METH_SAMP" prefix="METH_SAMP_"/>
				<qualifierColumn name="TYPE" prefix="TYPE_"/>
				<sampleIdColumn name="SMP_CODE"/>
				<propertyColumn name="SMP_NAME" key="SMP_NAME"/>
				<propertyColumn name="COORD_X_S" key="COORD_X_S"/>
				<propertyColumn name="COORD_Y_S" key="COORD_Y_S"/>
				<propertyColumn name="EMP_NR_S" key="EMP_NR_S"/>
				<propertyColumn name="EMP_NR_A" key="EMP_NR_A"/>
				<propertyColumn name="BRON" key="BRON"/>
				<valueColumn name="Waarde"/>
			</table>
			<idMapId>IdImport_HydroBiologie</idMapId>
			<dataFeedId>generalCSV</dataFeedId>
		</general>

Qualifier columns

Multiple qualifiers are used per time series, divided over different columns. It is important to keep track of the column a qualifier originated from; that is why a "prefix" attribute can be configured for a qualifier column. This prefix is prepended to the content of the column to form the external qualifier id.

With id mapping, this prefix helps map the column content to the correct qualifiers, using a qualifier id function <qualifierIdFunction externalQualifierFunction="@ExternalQualifierId@"/> and qualifiers configured in a CSV file:

	<csvFile>
		<file>par_ref_qualifiers.csv</file>
		<id>%qualifierId%</id>
		<name>%taxonname%</name>
		<attribute id="ExternalQualifierId">
			<text>PAR_REF_%taxonname%</text>
		</attribute>
	</csvFile>

In this way, the content of the qualifier column with prefix "PAR_REF_" is used to map to the qualifier in par_ref_qualifiers.csv whose "taxonname" has the same content.

In the same way, the content of the qualifier column with prefix "ANAL_CODE_" is mapped to the qualifier in analyse_code.csv whose "qualifierId" has the same content, using the following qualifier configuration:

	<csvFile>
		<file>analyse_code.csv</file>
		<id>%qualifierId%</id>
		<name>%qualifierName%</name>
		<attribute id="ExternalQualifierId">
			<text>ANAL_CODE_%qualifierId%</text>
		</attribute>
	</csvFile>
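As an illustration of how the prefix and the column content combine into an external qualifier id, consider the sketch below. The lookup table and the internal qualifier ids are hypothetical; in FEWS the resolution is done by the qualifier id function and the qualifier CSV files shown above:

```python
# Hypothetical lookup built from par_ref_qualifiers.csv / analyse_code.csv:
# external qualifier id -> internal FEWS qualifier id.
EXTERNAL_TO_INTERNAL = {
    "PAR_REF_Tubifex tubifex": "Q_TUBIFEX_TUBIFEX",
    "ANAL_CODE_MEA": "Q_MEA",
}

def map_qualifier(prefix, cell_content, mapping):
    # The prefix of the qualifier column is prepended to the cell content
    # to form the external qualifier id, which is then resolved to an
    # internal qualifier (None if it cannot be mapped).
    return mapping.get(prefix + cell_content)
```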

With the above configuration it is possible to export the qualifiers to the same columns again, using the same qualifier id function <qualifierIdFunction externalQualifierFunction="@ExternalQualifierId@"/> and "prefix" attributes in the qualifier columns of the export xml:

		<general>
			<exportType>generalCsv</exportType>
			<folder>$EXPORT_FOLDER$/HydroBiologie</folder>
			<exportFileName>
				<name>ExportGeneralCsv.csv</name>
			</exportFileName>
			<table>
				<dateTimeColumn name="DATE_SMP" pattern="dd-MM-yyyy HH:mm"/>
				<locationColumn name="LOC_CODE"/>
				<unitColumn name="Eenheid"/>
				<parameterColumn name="PARAMETER_ID"/>
				<qualifierColumn name="PAR_REF" prefix="PAR_REF_"/>
				<qualifierColumn name="PROD_CODE" prefix="PROD_CODE_"/>
				<qualifierColumn name="ANAL_CODE" prefix="ANAL_CODE_"/>
				<qualifierColumn name="METH_ANAL" prefix="METH_ANAL_"/>
				<qualifierColumn name="METH_SAMP" prefix="METH_SAMP_"/>
				<qualifierColumn name="TYPE" prefix="TYPE_"/>
				<sampleIdColumn name="SMP_CODE"/>
				<propertyColumn name="SMP_NAME" key="SMP_NAME"/>
				<propertyColumn name="COORD_X_S" key="COORD_X_S"/>
				<propertyColumn name="COORD_Y_S" key="COORD_Y_S"/>
				<propertyColumn name="EMP_NR_S" key="EMP_NR_S"/>
				<propertyColumn name="EMP_NR_A" key="EMP_NR_A"/>
				<propertyColumn name="BRON" key="BRON"/>
				<valueColumn name="Waarde"/>
			</table>
			<idMapId>IdExport_HydroBiologie</idMapId>
		</general>

Property columns

Property columns can be used to define in which columns the values for specific sample properties can be found; a "key" needs to be configured for each. These properties are stored with the sample, not with the time series itself. Because a sample contains values for time series at different time steps, and the properties are repeated for each time step, there is some redundancy. For each imported time step the property values are checked to verify that they are valid and the same for all time steps in the sample.

Validation and logging

During the import the data for each time step is validated in different ways; failures will be logged as warnings.

Property validation failure warnings:

  • Sample properties should be the same for all values with the same sample id
  • Property <key> with value <value> not present in current sample properties
  • Property <key> has different value <value> than in current sample properties <currentValue>
  • Property <key> missing, but present in current sample properties with value <value>
  • Sample property key <key> not found in SampleMetadataSchema.xml
  • Enumeration value <value> for property <key> not found in SampleMetadataSchema.xml
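The redundancy check on sample properties can be sketched as follows. This is an illustration only; the function name and the row representation are hypothetical, and the messages mirror the property warnings listed above:

```python
def check_sample_properties(rows):
    """rows: iterable of (sample_id, {key: value}) pairs, one per time step.
    Returns warnings for properties that are inconsistent within a sample."""
    warnings = []
    seen = {}  # sample_id -> properties of the first time step seen
    for sample_id, props in rows:
        current = seen.setdefault(sample_id, dict(props))
        for key, value in props.items():
            if key not in current:
                warnings.append(
                    f"Property {key} with value {value} not present in current sample properties")
            elif current[key] != value:
                warnings.append(
                    f"Property {key} has different value {value} than in current sample properties {current[key]}")
        for key in current:
            if key not in props:
                warnings.append(
                    f"Property {key} missing, but present in current sample properties with value {current[key]}")
    return warnings
```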

Location validation failure warnings when <logWarningsForUnmappableLocations>true</logWarningsForUnmappableLocations>:

  • No FEWS location for external id <locationId> (without id map) 
  • External location <locationId> can not be mapped to FEWS location with id map <idMap>

Parameter validation failure warnings when <logWarningsForUnmappableParameters>true</logWarningsForUnmappableParameters>:

  • No FEWS parameter for external id <parameterId> (without id map) 
  • External parameter <parameterId> can not be mapped to FEWS parameter with id map <idMap>

Qualifier validation failure warnings when <logWarningsForUnmappableQualifiers>true</logWarningsForUnmappableQualifiers>:

  • No FEWS qualifier for external id <qualifierId> (without id map) 
  • External qualifier <qualifierId> can not be mapped to FEWS qualifier with id map <idMap>

Because sample data files can contain many lines of data, each consisting of many properties and qualifiers, a lot of validation failure warnings can be logged. A maximum number of log warnings can be configured to overrule the default of 5: <maxLogWarnings>100000</maxLogWarnings>. It is also possible to write all warnings to a separate file (instead of the FEWS log) in the failed folder, with the same name as the import file but ending with .log, by configuring <logWarningsToSeparateFile>true</logWarningsToSeparateFile>. This can be very useful to avoid storing too many warnings in the FEWS database and to have a separate file as feedback to either the configurator of the FEWS system or the supplier of the import file.

Details of the import format

If the first line contains a comma, the field separator is a comma and the decimal separator is taken to be a period (.); otherwise the field separator is supposed to be a semicolon (;) and the decimal separator is taken to be a comma. This way locale-specific CSV files are supported.

The field separator is either a comma or a semicolon. Tabs are not supported.
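A minimal sketch of this detection logic in Python (an illustration, not the actual FEWS implementation):

```python
def detect_separators(header_line):
    # A comma in the header means comma-separated fields with a period
    # as decimal separator; otherwise semicolon-separated fields with
    # a comma as decimal separator are assumed.
    if "," in header_line:
        return ",", "."
    return ";", ","

def parse_value(field, decimal_sep):
    # Normalize the locale-specific decimal separator before parsing.
    if decimal_sep == ",":
        field = field.replace(",", ".")
    return float(field)
```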
