Snakemake can also be used on a high-performance computing (HPC) cluster. It submits the jobs and keeps track of the running ones. When a job depends on other jobs, Snakemake waits for those jobs to finish before submitting it.

See the following links with information about the execution on a cluster:

Snakemake manual

Demo


See also

See also the example project on GitHub: https://github.com/Deltares/FAIR-data-example-project


Instruction

Running Snakemake

To run Snakemake on a cluster, the following command can be used:

snakemake --cores 1 --cluster "qsub -q {cluster.q} -N {cluster.N} -M {cluster.M}" --jobs 4 --cluster-config config/cluster.yaml

where:

  • The flag `--cluster` specifies the command used to submit a job to the cluster. The flag `--cluster-config` points to a file with the values for the `{cluster.…}` placeholders in that command, in this case the queue (q), job name (N) and email address (M).
  • The flag `--jobs` specifies the maximum number of jobs that are submitted at the same time.
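
The waiting behaviour and the role of `--jobs` can be illustrated with a small Python sketch. It is not how Snakemake is implemented: the job names are taken from the demo Snakefile on this page, the function `submission_waves` is hypothetical, and jobs are grouped into simple waves, whereas Snakemake submits a job as soon as its own dependencies have finished. The argument `max_jobs` plays the role of `--jobs`.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs it has to wait for (derived from the
# inputs and outputs in the demo Snakefile). In the demo, grid, locations
# and create_sims are local rules; they are listed only to show the order.
dependencies = {
    "grid": set(),
    "locations": set(),
    "create_sims": {"grid", "locations"},
    "run_U30": {"create_sims"},
    "run_U20": {"create_sims"},
    "analyse_U30": {"run_U30"},
    "analyse_U20": {"run_U20"},
}


def submission_waves(deps, max_jobs):
    """Group jobs into waves of at most max_jobs jobs.

    A job only enters a wave once all jobs it depends on are done.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    pending = []  # jobs that are ready but not yet submitted
    waves = []
    while sorter.is_active():
        pending.extend(sorter.get_ready())
        pending.sort()  # deterministic order for the example
        wave, pending = pending[:max_jobs], pending[max_jobs:]
        waves.append(wave)
        sorter.done(*wave)  # pretend the whole wave has finished
    return waves


for number, wave in enumerate(submission_waves(dependencies, max_jobs=4), start=1):
    print(number, wave)
```

With `max_jobs=4` the two simulations end up in the same wave; with `max_jobs=1` every wave contains a single job.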

The snakefile

In the Snakefile you can specify which rules run locally (on the machine where Snakemake was started) and which rules are submitted to the cluster. This is done with the keyword localrules:

localrules: create_sims, grid, locations

The config file

In the cluster config file you can specify the information needed for submitting, for example the queue and the job name. Both JSON and YAML formats are allowed. An example in YAML format:

__default__:
    q: normal-e5-c7
    N: swan
    M: test.test
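
How the `{cluster.…}` placeholders in the `--cluster` command are filled from this file can be sketched in plain Python. This is an illustration, not Snakemake's own code: the helper names and the rule-specific `run` entry are assumptions added for the example. It relies on the behaviour that values under `__default__` apply to every rule and that a section named after a rule overrides them.

```python
from types import SimpleNamespace

# Content of the cluster config as a dictionary. The "run" section is a
# hypothetical rule-specific override that is not part of the example file.
cluster_config = {
    "__default__": {"q": "normal-e5-c7", "N": "swan", "M": "test.test"},
    "run": {"N": "swan_run"},
}


def cluster_settings(rule_name, config):
    """Start from the defaults and apply the overrides for one rule."""
    settings = dict(config.get("__default__", {}))
    settings.update(config.get(rule_name, {}))
    return settings


def submit_command(template, rule_name, config):
    """Fill the {cluster.<key>} placeholders of the submit command."""
    cluster = SimpleNamespace(**cluster_settings(rule_name, config))
    return template.format(cluster=cluster)


template = "qsub -q {cluster.q} -N {cluster.N} -M {cluster.M}"
print(submit_command(template, "run", cluster_config))
print(submit_command(template, "analyse", cluster_config))
```

The first call uses the overridden job name, the second falls back to the defaults.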

Demo files

An example snakefile is given for a SWAN simulation:

import os

####SETTINGS####

localrules: create_sims, grid, locations # these rules are not submitted

## TODO: this list is duplicated in create_sims.py. How can the script access this list directly?
wind_speed_list = [30, 20]

win = True

####End result####

rule all:
    input:
        path_fig=expand("reports/U{wind_speed}.png", wind_speed=wind_speed_list),

rule analyse:
    input:
        path="data/4-output/U{wind_speed}/U{wind_speed}_p1.tab"
    output:
        path_fig="reports/U{wind_speed}.png"
    script:
        os.path.join('src','4-analyze','analyse.py')

rule run:
    input:
        output_bot=os.path.join('data','3-input','bed.bot'),
        output_p1=os.path.join('data','3-input','p1.xyn'),
        path_sims=expand("data/4-output/U{wind_speed}/U{wind_speed}.swn", wind_speed=wind_speed_list)
    output:
        res="data/4-output/U{wind_speed}/U{wind_speed}_p1.tab"
    run:
        if not win:
            shell("data/4-output/U{wildcards.wind_speed}/run_SWAN.sh U{wildcards.wind_speed}")
        else:
            shell("cd data/4-output/U{wildcards.wind_speed} && copy U{wildcards.wind_speed}.swn INPUT && call ..\\..\\..\\bin\\swan_4131A_1_del_w64_i18_omp.exe ")

rule create_sims:
    input:
        bot=os.path.join('data','3-input','bed.bot'),
        p1=os.path.join('data','3-input','p1.xyn'),
        path_template = os.path.join('config','template')
    output:
        path_sims=expand("data/4-output/U{wind_speed}/U{wind_speed}.swn", wind_speed=wind_speed_list)
    script:
        os.path.join('src','1-prepare','create_sims.py')



rule grid:
    input:
        path_bot=os.path.join('data','1-external','bed.bot'),
        path_fxw=os.path.join('data','1-external','obs.fxw'),
    output:
        output_bot=os.path.join('data','3-input','bed.bot'),
        output_fxw=os.path.join('data','3-input','obs.fxw'),
    script:
        os.path.join('src','1-prepare','create_grid.py')


rule locations:
    input:
        path_p1=os.path.join('data','1-external','p1.xyn'),
        path_p2=os.path.join('data','1-external','p2.xyn'),
    output:
        output_p1=os.path.join('data','3-input','p1.xyn'),
        output_p2=os.path.join('data','3-input','p2.xyn'),
    script:
        os.path.join('src','1-prepare','create_output_locations.py')

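The Snakefile uses `expand` to turn a pattern with a wildcard into a list of concrete file names, one for every wind speed. What that call produces can be mimicked in plain Python. The function `expand_sketch` below is a hypothetical stand-in written for this page, not the Snakemake implementation; it only covers the simple case in which all combinations of the given values are generated.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill the pattern with every combination of the wildcard values."""
    names = list(wildcards)
    combinations = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combinations]


wind_speed_list = [30, 20]
print(expand_sketch("reports/U{wind_speed}.png", wind_speed=wind_speed_list))
# ['reports/U30.png', 'reports/U20.png']
```

The rule all therefore asks for one figure per wind speed, and Snakemake works backwards from those files to find the jobs that have to run.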

This dummy project requires the following Python scripts to be executed before the simulations are submitted.

create_grid.py

import os
import shutil

path_bot = snakemake.input.path_bot
path_fxw = snakemake.input.path_fxw
output_bot = snakemake.output.output_bot
output_fxw = snakemake.output.output_fxw

## In this demo only files are copied, but in reality bed and fxw are created based on external data.

shutil.copyfile(path_bot, output_bot)
shutil.copyfile(path_fxw, output_fxw)

create_output_locations.py

import os
import shutil

path_p1 = snakemake.input.path_p1
path_p2 = snakemake.input.path_p2
output_p1 = snakemake.output.output_p1
output_p2 = snakemake.output.output_p2

## In this demo only files are copied, but in reality the output locations (p1 and p2) are created based on external data.

shutil.copyfile(path_p1, output_p1)
shutil.copyfile(path_p2, output_p2) 

create_sims.py

import os
import numpy as np
from shutil import copyfile
from mako.template import Template

def render_template(data,template):
    '''
    Render template
    input:
        data: dictionary with variables to render
        template: template file
    '''
    with open(template) as f:
        template = f.read()
    ##
    tmpl = Template(text=template)

    with open(os.path.join(data['output_path'], '{}.{}'.format(data['fname'],data['extension']) ),mode='w') as f:
        f.write(tmpl.render(**data))




path_sims = snakemake.output.path_sims
path_template = snakemake.input.path_template

## if file exists overwrite
overwrite            = False
swan_template        = 'swan_template_final.swn'

# =============================================================================
#  wave conditions
# =============================================================================

wind_speed_list           = [30, 20]

# =============================================================================
#  create sims
# =============================================================================


## template file
template        = os.path.join(path_template,swan_template)

run_condition = {'water_level':[],
                 'wind_speed':[],
                 'wind_dir':[]}
## make combinations
for ii, item in enumerate(wind_speed_list):
    ## get conditions
    wind_speed  = item

    fname = 'U{}'.format(wind_speed)

    
    ## data for template
    data = {'output_path':os.path.dirname(path_sims[ii]),
            'extension':'swn',
            'variant':'A',
            'fname':fname,
            'wind_speed': wind_speed,
            'wind_dir': 180,
            'water_level': 0}
    
    ## skip if file already exists
    if os.path.exists(path_sims[ii]) and not overwrite:
        print('skipped {}'.format(fname))
        continue
    ## render template
    render_template(data,template)
    ## copy swan_run

    copyfile(os.path.join(path_template,'run_SWAN.sh'), os.path.join(os.path.dirname(path_sims[ii]),'run_SWAN.sh'))
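
The wind speed list is defined both in the Snakefile and in create_sims.py. One way to avoid the duplication, given here as a sketch rather than as the project's method, is to pass the list through the `params` directive of the rule (for example `params: wind_speed_list=wind_speed_list`, an addition that is not in the Snakefile on this page) and to read it in the script from `snakemake.params`. So that the example runs outside Snakemake, a stand-in `snakemake` object is built with `SimpleNamespace`; in a real run Snakemake provides this object. The function name `simulation_table` is hypothetical.

```python
import os
from types import SimpleNamespace

# Stand-in for the object that Snakemake makes available in scripts.
# Assumption: the rule create_sims declares
#     params: wind_speed_list=wind_speed_list
snakemake = SimpleNamespace(
    params=SimpleNamespace(wind_speed_list=[30, 20]),
    output=SimpleNamespace(
        path_sims=["data/4-output/U30/U30.swn", "data/4-output/U20/U20.swn"]
    ),
)


def simulation_table(smk):
    """Pair every wind speed with its file name and output directory."""
    table = []
    for wind_speed, path_sim in zip(smk.params.wind_speed_list, smk.output.path_sims):
        table.append({
            "fname": "U{}".format(wind_speed),
            "wind_speed": wind_speed,
            "output_path": os.path.dirname(path_sim),
        })
    return table


for row in simulation_table(snakemake):
    print(row["fname"], row["output_path"])
```

With this approach the list only has to be maintained in the Snakefile.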
