03 September 2014

Parallelizing GNU #Make 4 in a #SLURM infrastructure/cluster

SLURM (https://computing.llnl.gov/linux/slurm/slurm.html) is the Simple Linux Utility for Resource Management. It is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.

A patch for GNU Make version 3.81 is available as part of the SLURM distribution (https://github.com/SchedMD/slurm/blob/master/contribs/make.slurm.patch). This patch uses SLURM to launch tasks across a job's current resource allocation.

The patch for Make 'just' wraps each command in srun (srun is the command that submits a parallel job to a cluster managed by SLURM):

Index: job.c
===================================================================
--- job.c (revision 321)
+++ job.c (working copy)
@@ -1959,6 +1959,22 @@
void
child_execute_job (int stdin_fd, int stdout_fd, char **argv, char **envp)
{
+/* PARALLEL JOB LAUNCH VIA SLURM */
+ if (getenv("SLURM_JOB_ID")) {
+ int i;
+ static char *argx[128];
+ argx[0] = "srun";
+ argx[1] = "-N1";
+ argx[2] = "-n1";
+ for (i=0; ((i<124)&&(argv[i])); i++) {
+ argx[i+3] = argv[i];
+ }
+ if (i<124) {
+ argx[i+3] = NULL;
+ argv = argx;
+ }
+ }
+/* END OF SLURM PATCH */
if (stdin_fd != 0)
(void) dup2 (stdin_fd, 0);
if (stdout_fd != 1)
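
To make the logic of the patch easier to follow, here is a minimal shell sketch of the same decision. It is not part of the SLURM distribution: the function name slurm_wrap is made up for this illustration, and it only prints the resulting argument vector instead of executing it. Like the patch, it leaves the command untouched when SLURM_JOB_ID is not set, or when the command has 124 arguments or more.

```shell
#!/bin/sh
# Illustration only: mimic the argv rewriting done by the patch.
# slurm_wrap is a hypothetical helper name, not something SLURM provides.
slurm_wrap() {
    # The patch tests getenv("SLURM_JOB_ID"), i.e. "is the variable set",
    # and gives up on wrapping once the command reaches 124 arguments.
    if [ -n "${SLURM_JOB_ID+set}" ] && [ "$#" -lt 124 ]; then
        # One node, one task, followed by the original command.
        set -- srun -N1 -n1 "$@"
    fi
    # Print the final argument vector, one argument per line.
    printf '%s\n' "$@"
}

# The result depends on whether this runs inside a SLURM allocation.
slurm_wrap echo hello
```

Outside an allocation the command is printed unchanged; inside one, it is prefixed with srun -N1 -n1.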

GNU-Make version 4 was recently released. This new version comes with a number of improvements such as GNU Guile integration and loadable objects (see http://plindenbaum.blogspot.fr/2014/08/a-gnu-make-plug-in-for-illumina-fastqs.html ). It also lets you specify the shell to be invoked, and the flags passed to it (see http://plindenbaum.blogspot.fr/2014/01/parallelizing-rstats-using-make.html ).

The GNU Make manual (http://www.gnu.org/software/make/manual/make.html) says: "The program used as the shell is taken from the variable SHELL. If this variable is not set in your makefile, the program /bin/sh is used as the shell. The argument(s) passed to the shell are taken from the variable .SHELLFLAGS. The default value of .SHELLFLAGS is -c normally, or -ec in POSIX-conforming mode."

So, if you want to parallelize GNU-Make with SLURM, you can wrap the shell in srun using SHELL and .SHELLFLAGS. Here is an example that creates and concatenates 100 files, each containing the hostname of the node that built it:

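# Inside a SLURM allocation, run every recipe line through srun.
ifdef SLURM_JOB_ID
SHELL=srun
.SHELLFLAGS= -N1 -n1  bash -c 
endif
NUMBERS=$(shell seq 1 100 )
TARGETS=  $(addsuffix .test,${NUMBERS} )


.PHONY:  all clean

# Template for one target: N.test holds N and the name of the host that built it.
# Recipe lines must start with a tab character.
define TEST

$(addsuffix .test,$(1)) : 
	echo -n  $(1) " " > $$@ && hostname >> $$@
	@sleep 5

endef


all: ${TARGETS}
	cat $^

# Instantiate the template once per number.
$(foreach N,$(NUMBERS), $(eval $(call TEST,$(N) ) ) )

clean:
	rm -f ${TARGETS}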
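
To see why this works, it helps to spell out the command line Make ends up running for each recipe line: the program in SHELL, then the words of .SHELLFLAGS, then the recipe line as a single argument. The shell sketch below only prints that command line; the function name effective_command is made up for this illustration, and the single quotes are there for display (a recipe that itself contains single quotes would need proper escaping).

```shell
#!/bin/sh
# Illustration only: show what Make would execute for one recipe line.
# effective_command is a hypothetical helper, not a Make or SLURM feature.
# Arguments: value of SHELL, value of .SHELLFLAGS, the recipe line.
effective_command() {
    shell=$1
    flags=$2
    recipe=$3
    # The recipe line is handed to the shell as one single argument.
    printf "%s %s '%s'\n" "$shell" "$flags" "$recipe"
}

# Default settings.
effective_command /bin/sh "-c" "hostname >> 1.test"
# Settings used in the Makefile above when SLURM_JOB_ID is defined.
effective_command srun "-N1 -n1 bash -c" "hostname >> 1.test"
```

With the SLURM settings, the second call prints srun -N1 -n1 bash -c 'hostname >> 1.test', so every recipe line becomes its own one-task srun step.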

Now invoke Make under SLURM with the option -j N (allow N jobs at once):

$ make -j 10
echo -n  1  " " > 1.test && hostname >> 1.test
echo -n  2  " " > 2.test && hostname >> 2.test
echo -n  3  " " > 3.test && hostname >> 3.test
echo -n  4  " " > 4.test && hostname >> 4.test
echo -n  5  " " > 5.test && hostname >> 5.test
echo -n  6  " " > 6.test && hostname >> 6.test
echo -n  7  " " > 7.test && hostname >> 7.test
(...)
echo -n  100  " " > 100.test && hostname >> 100.test
cat 1.test 2.test 3.test 4.test 5.test 6.test 7.test 8.test (...)
1  node004
2  node003
3  node001
4  node002
5  node002
6  node001
7  node002
8  node001
9  node001
10  node002
(...)
92  node004
93  node001
94  node001
95  node001
96  node001
97  node002
98  node001
99  node001
100  node001
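
If you want a quick summary of how the jobs were spread over the nodes, a small filter over that output is enough. This is only a sketch: count_per_node is a name I made up, and it assumes every input line looks like "number hostname", as in the .test files above.

```shell
#!/bin/sh
# Illustration only: count how many "<number> <hostname>" lines came from each node.
# count_per_node is a hypothetical helper name.
count_per_node() {
    # The second whitespace-separated field is the hostname.
    awk '{ n[$2]++ } END { for (h in n) print h, n[h] }' | sort
}

# Demo on the first five lines of the output shown above.
count_per_node <<'EOF'
1  node004
2  node003
3  node001
4  node002
5  node002
EOF
```

On a real run you would feed it the generated files, for example: cat *.test | count_per_node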
That's it,
Pierre

1 comment:

Phil Lord said...

But how to get this working without modifying the Make file? I have been trying something like:

make -j SHELL=srun .SHELLFLAGS="/bin/sh"

But this fails with a recursive make call.