Materials presented at the PostgreSQL/PostGIS data workshop.
Installing a database
PostgreSQL/PostGIS Installation on your local machine
PostgreSQL/PostGIS installation on VM in Azure
Get sample data
A very simple data model is create based on data that is downloaded from the ICES website (http://ocean.ices.dk/HydChem/HydChem.aspx?plot=yes). The data model is based on downloaded CTD data. Minimize the data download (can take a long time, by using the the various filter options). This is a csv file. Note that you might need to fix some issues, e.g. ICES fails to insert a comma when an entire column is empty.
Datamodel and data insertion
In pgAdminIII open the query window and copy + paste the data model below. This code block contains a 1-table datamodel for training purposes. For real purposes, use the elaborate data model at the end of this wiki page.-- the csv fiule for this example has 10 columns
-- Cruise,Station,Type,yyyy-mm-ddThh:mm,Latitude [degrees_north],Longitude [degrees_east],Bot. Depth [m],PRES [db],TEMP [deg C],PSAL [psu] -- the data type needs to be inferred from the data. Opening in Excel migght help in this stage. -- 64HO,0307,B,1990-01-03T07:00,51.4220,3.5570,,0,6.30,31.030 -- from this we create a table. For easy load purposes, keep the column order the same as in the csv file. We add one extra column, the geometry coded in PostGIS . CREATE table tempdata ( cruise character varying (50), station character varying (50), type character varying (2), adate timestamp, latitude float, longitude float, botdepth float, pres float, temp float, psal float, geom public.geometry) -- Good practise is to add a primary key, but it is not needed for this tutorial. -- Many clients expect a unique attribute, although it does not add new informationb -- http://stackoverflow.com/questions/2463542/how-to-import-text-file-to-table-with-primary-key-as-auto-increment id SERIAL not null, PRIMARY KEY (id)) -- or do that afterwards ALTER TABLE tempdata ADD COLUMN id SERIAL PRIMARY KEY -- Now insert the data. We can insert the 1st 10 columns from csv, and leave the geom column open. -- This is possible because by default a new column nullable, so it can remain undefined when not supplied. -- either use PGAdmin to manually insert the csv. Right-mouse clik the column tempdata, -- use Import .. and seelct the csv file. Do tick that the file has header. -- Or use command line. Open a command line interface in the directory where the csv is located. "c:\Program Files\PostgreSQL\9.5\bin\psql.exe" -h localhost -U postgres -d ices1 -p 5432 copy tempdata from x.csv with csv header delimiter ‘,’; -- If needed, yyou could have rearranged the column order here by using copy tempdata(Cruise,Station,Type,yyyy-mm-ddThh:mm,Latitude [degrees_north],Longitude [degrees_east],Bot. Depth [m],PRES [db],TEMP [deg C],PSAL [psu]) from x.csv with csv header delimiter ‘,’; -- Now fill the geom column using the loaded lat and lon columns -- You could have added the geom column here if you forgot it in the CRSTE statement alter table tempdata add column geom public.geometry -- calculate geom data update tempdata set geom=ST_SetSRID((ST_POINT(longitude,latitude)),4326); -- Check whether calculation geom data worked select count(*) from tempdata where geom is not null;
Data retrieval
Access with R
library(RPostgreSQL) drv <- dbDriver("PostgreSQL") ## load the PostgreSQL driver con <- dbConnect(drv, dbname="ices1", host="localhost", user="postgres", password="yourpassword") ## Open a connection rs <- dbSendQuery(con, "select tempdata.adate, tempdata.temp, tempdata.cruise, tempdata.psal from tempdata") ## Submits a statement and declare variable result set (rs) df <- fetch(rs, n = -1) ## fetch all elements from the result set into a data.frame. n = number of records, n=-1 means all records dim(df) ## Check number of records head(df) ##show first lines dbDisconnect(con) ## Closes the connection dbUnloadDriver(drv) ## Frees all the resources on the driver
Access with matlab
Install OpenearthTools. This contains an open source database acces toolbox for those who do not have a database toolbox license.
help jdb jdb_settings('dbtype','postgresql'); conn = jdb_connectdb('ices1','user','postgres','pass','123', 'port',5432,'database_toolbox',0); tables = jdb_gettables(conn); D = jdb_table2struct(conn,'tempdata');
Access with Python
For python code see the example below. There will also be a iPython notebook in our GIT repository.
import sqlalchemy from sqlalchemy import create_engine engine = create_engine("postgresql://postgres:password@localhost:5433/ices1",echo=False) from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import MetaData, Column, Integer, String, Float, Table from geoalchemy2 import Geometry Base = declarative_base() metadata = MetaData(bind=engine) # ORM expects a primary key column. # That is not there so we fake one. # However, when querying session.query(TempData) it will also implicitly request the column id. # So we can only request all columns named explicitly # query(TempData.adate,TempData.temp,TempData.cruise,TempData.psal) class TempData(Base): __table__ = Table('tempdata', metadata, Column("id", Integer, primary_key=True), autoload=True, extend_existing=True ) from sqlalchemy.orm import sessionmaker Session = sessionmaker(bind=engine) session = Session() from sqlalchemy import func query = session.query(TempData.adate,TempData.temp,TempData.cruise,TempData.psal) query.first()
Advanced data model
ALTER SEQUENCE public.parameter_parameterid_seq OWNED BY public.parameter.parameterid; CREATE SEQUENCE public.icesstation_stationid_seq; CREATE TABLE public.icesstation ( stationid INTEGER NOT NULL DEFAULT nextval('public.icesstation_stationid_seq'), name VARCHAR(50) NOT NULL, geom public.geometry, CONSTRAINT icesstation_pk PRIMARY KEY (stationid) ); ALTER SEQUENCE public.icesstation_stationid_seq OWNED BY public.icesstation.stationid; CREATE TABLE public.oceanography ( oceanid INTEGER NOT NULL, stationid INTEGER NOT NULL, parameterid INTEGER, datetime TIMESTAMP, pressure DOUBLE PRECISION, bot_depth DOUBLE PRECISION NOT NULL, value DOUBLE PRECISION, CONSTRAINT oceanography_pk PRIMARY KEY (oceanid) ); ALTER TABLE public.oceanography ADD CONSTRAINT parameter_oceanography_fk FOREIGN KEY (parameterid) REFERENCES public.parameter (parameterid) ON DELETE NO ACTION ON UPDATE NO ACTION NOT DEFERRABLE; ALTER TABLE public.oceanography ADD CONSTRAINT icesstation_oceanography_fk FOREIGN KEY (stationid) REFERENCES public.icesstation (stationid) ON DELETE NO ACTION ON UPDATE NO ACTION NOT DEFERRABLE;