Materials presented at the PostgreSQL/PostGIS data workshop.

Installing a database

PostgreSQL/PostGIS Installation on your local machine

PostgreSQL/PostGIS installation on VM in Azure

Get sample data

A very simple data model is create based on data that is downloaded from the ICES website (http://ocean.ices.dk/HydChem/HydChem.aspx?plot=yes). The data model is based on downloaded CTD data. Minimize the data download (can take a long time, by using the the various filter options). This is a csv file. Note that you might need to fix some issues, e.g. ICES fails to insert a comma when an entire column is empty.

Datamodel and data insertion

In pgAdminIII open the query window and copy + paste the data model below. This code block contains a 1-table datamodel for training purposes. For real purposes, use the elaborate data model at the end of this wiki page.-- the csv fiule for this example has 10 columns

-- Cruise,Station,Type,yyyy-mm-ddThh:mm,Latitude [degrees_north],Longitude [degrees_east],Bot. Depth [m],PRES [db],TEMP [deg C],PSAL [psu]
-- the data type needs to be inferred from the data. Opening in Excel migght help in this stage.
-- 64HO,0307,B,1990-01-03T07:00,51.4220,3.5570,,0,6.30,31.030

-- from this we create a table. For easy load purposes, keep the column order the same as in the csv file. We add one extra column, the geometry coded in PostGIS .
CREATE table tempdata (
cruise character varying (50),
station character varying (50),
type character varying (2),
adate timestamp,
latitude float,
longitude float,
botdepth float,
pres float,
temp float,
psal float,
geom public.geometry)

-- Good practise is to add a primary key, but it is not needed for this tutorial.
-- Many clients expect a unique attribute, although it does not add new informationb
-- http://stackoverflow.com/questions/2463542/how-to-import-text-file-to-table-with-primary-key-as-auto-increment
   id SERIAL not null,
   PRIMARY KEY (id))
-- or do that afterwards
ALTER TABLE tempdata ADD COLUMN id SERIAL PRIMARY KEY
-- Now insert the data. We can insert the 1st 10 columns from csv, and leave the geom column open.
-- This is possible because by default a new column nullable, so it can remain undefined when not supplied.
-- either use PGAdmin to manually insert the csv. Right-mouse clik the column tempdata,
-- use Import .. and seelct the csv file. Do tick that the file has header.
-- Or use command line. Open a command line interface in the directory where the csv is located.
"c:\Program Files\PostgreSQL\9.5\bin\psql.exe" -h localhost -U postgres -d ices1 -p
5432
copy tempdata from x.csv with csv header delimiter ‘,’;
-- If needed, yyou could have rearranged the column order here by using
copy tempdata(Cruise,Station,Type,yyyy-mm-ddThh:mm,Latitude
[degrees_north],Longitude [degrees_east],Bot. Depth [m],PRES [db],TEMP [deg
C],PSAL [psu])  from x.csv with csv header delimiter ‘,’;

-- Now fill the geom column using the loaded lat and lon columns
-- You could have added the geom column here if you forgot it in the CRSTE statement
alter table tempdata add column geom public.geometry
-- calculate geom data
update tempdata set geom=ST_SetSRID((ST_POINT(longitude,latitude)),4326);
-- Check whether calculation geom data worked
select count(*)
from tempdata 
where geom is not null;

Data retrieval

Access with R

library(RPostgreSQL)

drv <- dbDriver("PostgreSQL") ## load the PostgreSQL driver
con <- dbConnect(drv, dbname="ices1", host="localhost", user="postgres", password="yourpassword") ## Open a connection
rs <- dbSendQuery(con, "select tempdata.adate, tempdata.temp, tempdata.cruise, tempdata.psal from tempdata") ## Submits a statement and declare variable result set (rs)
df <- fetch(rs, n = -1) ## fetch all elements from the result set into a data.frame. n = number of records, n=-1 means all records
dim(df) ## Check number of records
head(df) ##show first lines
dbDisconnect(con) ## Closes the connection
dbUnloadDriver(drv) ## Frees all the resources on the driver

Access with matlab

Install OpenearthTools. This contains an open source database acces toolbox for those who do not have a database toolbox license.

help jdb
jdb_settings('dbtype','postgresql');
conn = jdb_connectdb('ices1','user','postgres','pass','123', 'port',5432,'database_toolbox',0);
tables = jdb_gettables(conn);
D = jdb_table2struct(conn,'tempdata');

Access with Python

For python code see the example below. There will also be a iPython notebook in our GIT repository.

import sqlalchemy
from sqlalchemy import create_engine
engine = create_engine("postgresql://postgres:password@localhost:5433/ices1",echo=False)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import MetaData, Column, Integer, String, Float, Table
from geoalchemy2 import Geometry

Base = declarative_base()

metadata = MetaData(bind=engine)

# ORM expects a primary key column.
# That is not there so we fake one.
# However, when querying session.query(TempData) it will also implicitly request the column id.
# So we can only request all columns named explicitly
# query(TempData.adate,TempData.temp,TempData.cruise,TempData.psal)

class TempData(Base):
    __table__ = Table('tempdata', metadata,
                       Column("id", Integer, primary_key=True),
                      autoload=True,
                      extend_existing=True                      )

from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)

session = Session()
from sqlalchemy import func

query = session.query(TempData.adate,TempData.temp,TempData.cruise,TempData.psal)
query.first()

 

Advanced data model

 

data model
ALTER SEQUENCE public.parameter_parameterid_seq OWNED BY public.parameter.parameterid;
 
CREATE SEQUENCE public.icesstation_stationid_seq;
 
CREATE TABLE public.icesstation (
                stationid INTEGER NOT NULL DEFAULT nextval('public.icesstation_stationid_seq'),
                name VARCHAR(50) NOT NULL,
                geom public.geometry,
                CONSTRAINT icesstation_pk PRIMARY KEY (stationid)
);
 
 
ALTER SEQUENCE public.icesstation_stationid_seq OWNED BY public.icesstation.stationid;
 
CREATE TABLE public.oceanography (
                oceanid INTEGER NOT NULL,
                stationid INTEGER NOT NULL,
                parameterid INTEGER,
                datetime TIMESTAMP,
                pressure DOUBLE PRECISION,
                bot_depth DOUBLE PRECISION NOT NULL,
                value DOUBLE PRECISION,
                CONSTRAINT oceanography_pk PRIMARY KEY (oceanid)
);
 
 
ALTER TABLE public.oceanography ADD CONSTRAINT parameter_oceanography_fk
FOREIGN KEY (parameterid)
REFERENCES public.parameter (parameterid)
ON DELETE NO ACTION
ON UPDATE NO ACTION
NOT DEFERRABLE;
 
ALTER TABLE public.oceanography ADD CONSTRAINT icesstation_oceanography_fk
FOREIGN KEY (stationid)
REFERENCES public.icesstation (stationid)
ON DELETE NO ACTION
ON UPDATE NO ACTION
NOT DEFERRABLE;



  • No labels