This document describes the technical details and discussions in relation to the deployment of the OpenEarth stack. 


h2. Operating System:

The deployment is supported on Red Hat Enterprise Linux (RHEL) and CentOS. These Linux distributions are quite conservative in their updates. You can deploy the same software to other operating systems and other distributions. Deploying on Windows is discouraged because it does not have proper package management.


h3. Extra packages
The OpenEarth stack consists of a number of packages from different repositories.
Most of the packages come from the base distribution. Some of the packages are provided by others. When packages do not exist, we create them and make them publicly available.

h4. Packages from other sources
{info}Check using [keychecker|https://fedorahosted.org/keychecker/]{info}
{pre}
From EPEL
geos
geos-devel
keychecker
epel-release
netcdf
netcdf-devel
hdf5
hdf5-devel
nginx
{pre}
h3. Machine organization
{pre}
VM1 - Development
VM2 - Testing
VM3 - Acceptance
VM4 - Production

VM2 through 4 can be separated, for example like this:
VM2.a Database (postgis)
VM2.b Tomcat (thredds/geoserver/geonetwork)
VM2.c Reverse proxy (nginx)
VM2.d Python apps (gunicorn)
For load balancing, the servers can be duplicated.
{pre}

h3. VM1

Extra software installed
- git, for updating [dotfiles|https://github.com/SiggyF/dotfiles.git] and [.emacs.d|https://github.com/SiggyF/.emacs.d.git]
- emacs, for editing
- gcc, for building software (for example fpm)
- irb, rubygems, ruby-devel, for installing fpm
- fpm, installed with: gem install \--user-install fpm
- subversion (for downloading source code)
- rpm-build (for building rpms)
- bash-completion (for productivity)
- bzip2-devel, openssl-devel, zlib-devel, libffi-devel, readline-devel, ncurses-devel
- tomcat6-webapps, tomcat6-admin-webapps

h4. Issues:

- emacs is too old: the configuration expects emacs>=24, but 23 is installed.
- fpm package not found \-> install with gem install \--user-install
- yum search requires root access for RHN
- yum grouplist: No group data available for configured repositories \-> yum makecache
- "Clock skew detected. Your build may be incomplete." This happens on VMs where both the VM and the host adjust the clock skew factor.
- A strange character is shown in the terminal instead of an arrow in PuTTY.


h4. Commands
{pre}
# build rpm for python27
fpm \-s dir \-t rpm \-n python27 \-v 2.7.5 opt/python2.7/bin opt/python2.7/lib opt/python2.7/share
{pre}

h3. Links and discussions related to RHEL/CentOS

On packaging:
- [https://github.com/jordansissel/fpm]
- [http://fedoraproject.org/wiki/RPMGroups]

On deployment:

There is some discussion on whether or not to use the system Python.
There is no recent Python in EPEL.
- pro: it is updated by the system administrator
- con: it is old, possibly too old

- [http://hynek.me/talks/python-deployments/]
- [http://hynek.me/articles/python-deployment-anti-patterns/]
- [http://dan.bravender.us/2012/5/11/git-based_fabric_deploys_are_awesome.html]
- [http://hynek.me/articles/python-app-deployment-with-native-packages/]


h2. Web application

Applications are written in Python. The applications are stored under *python/applications* in the openearthtools tree.
Sources can be transferred to the web server by committing to the openearthtools trunk and updating the checkout on the web server.

h2. Architecture

{pre}
webclient/browser<=>\{internet\} <1=DMZ=1>\[reverse proxy :80\]<1=N> \[wsgiserver :60xx\] <1=1> \[web application /srv/apps\]
{pre}

- webclient/browser: the client that makes the request. Often a browser, but it can also be a MATLAB script, a user interface or wget/curl.
- reverse proxy: forwards the web request (for example GET /app/controller/id) to the app running on port 60xx on the same or another machine. The reverse proxy is also the place where you would do caching, serve static pages and refer to static results such as /data/kml.
- web application: a Python package (with a setup.py and often a production.ini) that can be installed and has a wsgi entry point.
- wsgi server: a server that hosts wsgi applications (applications that adhere to the wsgi protocol)
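
As an illustration of the "wsgi entry point" mentioned above, here is a minimal sketch of a wsgi application. The name application and the response text are assumptions made for this example; it is not the code of any actual OpenEarth application.

```python
# Minimal WSGI application: an illustrative sketch, not actual OpenEarth code.
def application(environ, start_response):
    """WSGI entry point: takes the request environ, returns the response body."""
    # PATH_INFO holds the part of the URL that the application should handle.
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The WSGI server supplies start_response; call it once before returning.
    start_response("200 OK", headers)
    # The body is an iterable of byte strings.
    return [body]
```

Any wsgi server can host a callable like this. For a quick local test, wsgiref.simple_server.make_server from the standard library is sufficient.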

h2. Reverse proxy

The reverse proxy function requires configuring URLs and mapping these to other ports/URLs.
Both mod_proxy in apache and nginx are logical choices. The nginx configuration is a bit easier.
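
The mapping itself is simple. The sketch below expresses it in Python purely to show the idea; in practice it lives in the proxy configuration (in nginx, typically location blocks with proxy_pass). The prefixes /app1/ and /app2/ and the ports 6001 and 6002 are hypothetical.

```python
# Hypothetical mapping of public URL prefixes to internal WSGI servers.
UPSTREAMS = {
    "/app1/": "http://127.0.0.1:6001",
    "/app2/": "http://127.0.0.1:6002",
}


def resolve_upstream(path, upstreams=UPSTREAMS):
    """Return the internal URL for a request path, or None if no prefix matches."""
    # Try the longest prefix first so nested prefixes resolve predictably.
    for prefix in sorted(upstreams, key=len, reverse=True):
        if path.startswith(prefix):
            # Strip the public prefix; the application sees the remainder.
            return upstreams[prefix] + "/" + path[len(prefix):]
    return None
```

With this table, a request for /app1/controller/7 is forwarded to port 6001 as /controller/7, and an unknown path is not forwarded at all.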

h2. WSGI

There are many options for hosting wsgi applications. The most used options are:
- gunicorn
- uwsgi
- mod_wsgi

If mod_wsgi (an apache module) is used, a reverse proxy is not required. With this approach the applications are loaded directly by the web server, which gives the web server a bit more work to do.
uwsgi and gunicorn (and similar servers such as wsgiref, paster, twisted.web) run as separate wsgi servers and serve the application on a custom port.
With gunicorn each application runs on a different port; with uwsgi multiple applications can be loaded in one server.
gunicorn is currently used because its command-line interface is more stable.
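
gunicorn can read its settings from a Python file. The following is a sketch of such a file under assumed values: the port 6001, the worker cap and the timeout are illustrative and not the deployed configuration.

```python
# gunicorn.conf.py -- illustrative settings, not the deployed configuration.
import multiprocessing

# Listen on localhost only; the reverse proxy is the public entry point.
bind = "127.0.0.1:6001"

# A small worker pool, capped at 4 processes for this example.
workers = min(4, multiprocessing.cpu_count() + 1)

# Workers that are silent for more than this many seconds are restarted.
timeout = 60

# "-" sends the access log to stdout and the error log to stderr.
accesslog = "-"
errorlog = "-"
```

It would be started with something like gunicorn -c gunicorn.conf.py myapp:application, where myapp:application is a hypothetical module and callable name. Each additional application gets its own file with a different port.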


h2. Web applications

The web applications are packaged on the development server and then deployed as a whole as an rpm package.


h2. Tomcat installation

After installing tomcat6, tomcat6-webapps and tomcat6-admin-webapps, also add tomcat to the runlevels:
{code}
chkconfig --level 2345 tomcat6 on
{code}

h2. Configuration
Configuration is part of installing the application. It is not always clear where the configuration should be maintained. Consider the following examples:

* Thredds server:
 - Enable services, general
 - Add a logo, company/deployment specific
 - Add remote services to the catalog, deployment specific
 - Add data location, deployment specific
 - Set access control for restricted services, secret
* Python application
 - Add database credentials, secret
 - Add application to supervisor, general?

Currently the general settings are done in the rpm packaging. The other tasks should probably be done in a provisioning tool. TODO: check if this can be done in Satellite.






h2. Python installation

Both the web application and the wsgi server are written in Python and need to run on the same Python version.

The python applications are written in python2.7 or higher. For some applications python 3 does not work yet. You can check the current status of python 3 libraries at [https://python3wos.appspot.com/].

For packages you can choose between three approaches:
- Use the packages that come with the OS package manager
- Install libraries, header files and compilers from the package manager, and install Python packages from PyPI
- Use a Python distribution
Or use a combination of the above.


For Linux distributions where Python is not up to date (such as RHEL/CentOS and Debian stable) we advise using [Anaconda|https://store.continuum.io/cshop/anaconda/].
This avoids having to manually build libraries that have difficult dependencies.
The following packages are difficult to build and are available from anaconda:
- opencv (cmake does not find dependencies properly, installs in non-standard places, many dependencies: cmake, pkgconfig, apache-ant, zlib, bzip2, libpng, jpeg, jasper, tiff, ilmbase, openexr, ffmpeg, eigen3, openni, libdc1394, qt4-mac, tbb)
- matplotlib (dependencies: freetype, libpng, py27-dateutil, py27-tz, py27-numpy, py27-pygtk, py27-pyqt4, texlive, ghostscript)

The following packages are also difficult to build and are not available through anaconda:
- vtk (cmake requires a lot of custom flags, does not build automatically, tvtk parses vtk headers, which can break)
- mapnik (dependencies: boost, icu, libpng, jpeg, tiff, zlib, freetype, libxml2, proj, python27, cairo, py27-cairo, gdal, curl, postgresql92, sqlite3)
- gdal/ogr/osr (many dependencies: zlib, libpng, tiff, libgeotiff, jpeg, giflib, proj, lzma, geos, curl, hdf5-18, netcdf, openjpeg15, xercesc, expat, python27, py27-numpy, postgresql92, mysql5, sqlite3, spatialite, unixODBC)

Up-to-date RHEL packages for gdal and its dependencies can be found here:
[http://elgis.argeo.org/repos/testing/6/elgis/i386/]

h2. Sandboxing/versioning

You often want to keep the Python version and module versions fixed per application, because one application may require different package versions than another. This can be done using virtualenv or buildout. Virtualenvs can be stored in /opt/envs/. Buildouts, if used, are stored in /srv/apps/app/\[bin/eggs\].
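
As a sketch of the virtualenv variant, the helpers below build the commands for one environment per application under /opt/envs. The function names, the application name myapp and the requirements.txt file are assumptions made for this example. The commands are only constructed here, not executed.

```python
import posixpath

ENV_ROOT = "/opt/envs"  # location named in the text above


def virtualenv_command(app_name, python="python2.7", env_root=ENV_ROOT):
    """Build the command that creates an isolated environment for one app."""
    env_dir = posixpath.join(env_root, app_name)
    # --python selects the interpreter that the environment is bound to.
    return ["virtualenv", "--python", python, env_dir]


def pip_install_command(app_name, requirements="requirements.txt",
                        env_root=ENV_ROOT):
    """Build the command that installs pinned packages into that environment."""
    # Using the environment's own pip keeps packages out of the system Python.
    pip = posixpath.join(env_root, app_name, "bin", "pip")
    return [pip, "install", "-r", requirements]
```

Passing the results to subprocess.check_call runs them. Pinning exact versions in the requirements file is what keeps module versions fixed per application.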

h2. Running/restarting

Sometimes servers crash, leak or break. Several tools are available to restart applications automatically.
For now we use supervisor.
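
A supervisor program entry is a small INI section. The sketch below generates one from Python so that the same template can be reused for every application. The function name, the default user and the example command are hypothetical; the keys themselves (command, directory, user, autostart, autorestart, redirect_stderr) are regular supervisor settings.

```python
def supervisor_program(name, command, directory, user="nobody"):
    """Return the text of a supervisor [program:x] section for one application."""
    lines = [
        "[program:%s]" % name,
        "command=%s" % command,
        "directory=%s" % directory,
        "user=%s" % user,
        # Start with supervisord and restart whenever the process exits.
        "autostart=true",
        "autorestart=true",
        # Merge stderr into the stdout log of the program.
        "redirect_stderr=true",
    ]
    return "\n".join(lines) + "\n"


section = supervisor_program(
    "myapp",
    "/opt/envs/myapp/bin/gunicorn --bind 127.0.0.1:6001 myapp:application",
    "/srv/apps/myapp",
)
```

The resulting text would go into a file that the supervisor configuration includes, one section per application.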

h2. Reverse proxy list for Cross Origin checks

{pre}
'pdokviewer.pdok.nl', 'www.nationaalgeoregister.nl', 'mesonet.agron.iastate.edu', 'gis.opentraces.org', 'maps.warwickshire.gov.uk', 'suite.opengeo.org', 'gxp.opengeo.org', 'arcserve.lawr.ucdavis.edu', 'dinolab52.dinonet.nl', 'msgcpp-ogc-realtime.knmi.nl', 'geoservices.knmi.nl', 'www.kich.nl', 'open.mapquestapi.com', 'gis.kademo.nl', 'kademo.nl', 'www.dinoservices.nl','geodata.nationaalgeoregister.nl','www2.demis.nl', 'maps.opengeo.org', 'demo.opengeo.org','data.fao.org','suite.opengeo.org'
TODO: add list of opendap.deltares.nl, double check
{pre}
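
A list like this is typically used to decide whether a proxy may forward a request to a remote host. The check below is a sketch of that idea, written for Python 3. The function name is_allowed is an assumption, and ALLOWED_HOSTS contains only an excerpt of the list above.

```python
from urllib.parse import urlsplit

# Excerpt of the host list above; the full list would be loaded the same way.
ALLOWED_HOSTS = frozenset([
    "pdokviewer.pdok.nl",
    "www.nationaalgeoregister.nl",
    "geoservices.knmi.nl",
    "data.fao.org",
])


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only if the URL's host is exactly one of the allowed hosts."""
    # hostname is lowercased and excludes the port; it is None without a host.
    host = urlsplit(url).hostname
    return host is not None and host in allowed_hosts
```

Comparing the parsed host name, rather than searching the URL string, prevents a URL that merely mentions an allowed host in its path or query from passing the check.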



{info}Thanks to Reinout and Jack from Nelen & Schuurmans for suggestions and advice.{info}