queue(1)                           GNU Queue                          queue(1)



NAME
       queue, qsh - farm and batch-process jobs out on the local network

SYNOPSIS
       queue [-h hostname|-H hostname] [-i|-q] [-d spooldir] [-o|-p|-n]
       [-w|-r] -- command command.options

       qsh [-l ignored] [-d spooldir] [-o|-p|-n] [-w|-r] hostname command
       command.options

DESCRIPTION
       This  documentation is no longer being maintained and may be inaccurate
       or incomplete.  The Info documentation is now the authoritative source.

       This manual page documents GNU Queue, a load-balancing/batch-processing
       system and local rsh replacement.

       queue with only a -- followed by a command defaults to immediate
       execution (-i), wait for output (-w), and full-pty emulation (-p).

       The defaults for qsh are slightly different: no-pty emulation is the
       default, and a hostname argument is required. A plus (+) is the wild-
       card hostname; specifying + in place of a valid hostname is the same
       as not using an -h or -H option with queue. qsh is envisioned as an
       rsh compatibility mode for use with software that expects an rsh-like
       syntax.  This is useful with some MPI implementations; see the section
       on MPI in the Info file.
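
       As a quick sketch of the relationship between the two commands (the
       hostname worker1 is a placeholder), the following invocations are
       roughly equivalent, except that qsh defaults to no-pty emulation:

         queue -i -w -n -h worker1 -- hostname
         qsh worker1 hostname

       Replacing worker1 with + lets the system pick the best host itself.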

       The options are:

       -h hostname

       --host hostname
              Force queue to run the job on hostname.

       -H hostname

       --robust-host hostname
              Run job on hostname if it is up.

       -i|-q

       --immediate|--queue
              Shorthand for immediate execution (the now spooldir) and queued
              execution (the wait spooldir).

       [-d spooldir]

       [--spooldir spooldir]
              With the -q option, specifies the name of the batch-processing
              directory, e.g., mlab.

       -o|-p|-n

       --half-pty|--full-pty|--no-pty
              Toggle between half-pty emulation, full-pty emulation (default),
              and the more efficient no-pty emulation.

       -w|-r

       --wait|--batch
              Toggle between wait (stub daemon; default) and return (mail
              batch) mode.

       -v

       --version
              Version

       --help
              List of options
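
       One possible combination of these options (fast_host is a placeholder
       for one of your cluster hosts) is:

         queue -H fast_host -i -w -n -- hostname

       which requests immediate execution on fast_host if it is up, using the
       stub mechanism and no pty.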

       GNU Queue is a UNIX process network load-balancing system that features
       an innovative 'stub daemon' mechanism which allows users to control
       their remote jobs in a nearly seamless and transparent fashion. When an
       interactive remote job is launched, such as EMACS interfacing with
       Allegro Lisp, a stub daemon runs on the remote end. By sending signals
       to the remote stub - including hitting the suspend key - the process on
       the remote end may be controlled. Resuming the stub resumes the remote
       job.  The user's environment is almost completely replicated: not only
       environment variables, but also nice values, rlimits, and terminal
       settings are replicated on the remote end. Together with
       MIT_MAGIC_COOKIE_1 (or xhost +) the system is X-windows transparent as
       well, provided the user's local DISPLAY variable is set to the fully
       qualified hostname of the local machine.
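
       As an illustrative sketch (ws.example.com stands for the fully
       qualified name of the local workstation), an X client can be farmed
       out once DISPLAY is set accordingly:

         DISPLAY=ws.example.com:0.0
         export DISPLAY
         queue -i -w -p -- xterm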

       One of the most appealing features of the stub system, even for
       experienced users, is that asynchronous job control of remote jobs by
       the shell is possible and intuitive. One simply runs the stub in the
       background under the local shell; the shell notifies the user when the
       remote job has a change in status by monitoring the stub daemon.
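
       For example, a remote interactive job can be treated like any other
       background job under the local shell (this is only a sketch; the EMACS
       example is discussed again below):

         queue -i -w -p -- emacs -nw &
         jobs        # the stub appears as an ordinary background job
         fg %1       # foreground it; the suspend key suspends the remote job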

       When the remote process has terminated, the stub returns the exit
       value to the shell; otherwise, the stub simulates a death by the same
       signal as that which terminated or suspended the remote job. In this
       way, control of the remote process is intuitive even to novice users,
       as it is just like controlling a local job from the shell. Many of my
       original users had to be reminded that their jobs were, in fact,
       running remotely.
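
       A quick way to see the exit-status propagation (false is simply a
       command known to exit non-zero):

         queue -i -w -n -- false
         echo $?     # should print the remote command's non-zero exit value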

       In addition, Queue also features a more traditional distributed batch-
       processing environment, with results returned to the user via email.
       Traditional batch-processing limitations may also be placed on jobs
       running in either environment (stub or the email mechanism), such as
       suspension of jobs if the system exceeds a certain load average, limits
       on CPU time, disk free requirements, limits on the times in which jobs
       may run, etc. (These are documented in the sample profile file
       included.)



       In order to use queue to farm out jobs onto the network, the queued
       daemon must be running on every host in your cluster, as defined in
       the Host Access Control File (default: /usr/local/share/qhostsfile).
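
       A minimal sketch of such a file, assuming the simple one-hostname-per-
       line layout (the hostnames are placeholders; the Info documentation is
       the authoritative reference for the format):

         fast_host.example.com
         slow_host.example.com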

       Once queued is running, jobs may normally be farmed out to other hosts
       within the homogeneous cluster.  For example, try something like queue
       -i -w -p -- emacs -nw. You should be able to background and foreground
       the remote EMACS process from the local shell just as if it were
       running as a local copy.

       Another example command is queue -i -w -n -- hostname, which should
       return the best host on which to run a job, as controlled by options
       in the profile file (see below).

       The options to queue need some further explanation:

       -i specifies immediate execution mode, placing the job in the now
       spool. This is the default. Alternatively, you may specify either the
       -q option, which is shorthand for the wait spool, or use the -d
       spooldir option to place the job under the control of the profile file
       in the spooldir subdirectory of the spool directory, which must
       previously have been created by the Queue administrator.
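
       For instance, assuming the administrator has created an mlab
       subdirectory under the spool directory, a Matlab job could be placed
       under the control of its profile with:

         queue -q -w -p -d mlab -- matlab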

       In any case, execution of the job will wait until it satisfies the
       conditions of the profile file for that particular spool directory,
       which may include waiting for a slot to become free.  This method of
       batch processing is completely compatible with the stub mechanism,
       although it may disorient users to use it in this way, as they may be
       unknowingly forced to wait until a slot on a remote machine becomes
       available.

       -w activates the stub mechanism, which is the default.  The queue stub
       process will terminate when the remote process terminates; you may
       send signals and suspend/resume the remote process by doing the same
       to the stub process. Standard input/output will be that of the 'queue'
       stub process. -r deactivates the stub process; standard input/output
       will be returned via email to the user, and the queue process will
       return immediately.
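
       As a sketch, a long-running job can be handed off entirely, with its
       output mailed back when it finishes (big_job is the placeholder
       command used elsewhere in this page):

         queue -i -r -n -- big_job
         # returns immediately; stdout and stderr come back by email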

       -p or -n specifies whether or not a virtual tty should be allocated
       at the remote end, or whether the system should merely use the more
       efficient socket mechanism. Many interactive processes, such as EMACS
       or Matlab, require a virtual tty to be present, so the -p option is
       required for these. Other processes, such as a simple hostname, do
       not require a tty and so may be run without the default -p. Note that
       queue is intelligent and will override the -p option if it detects
       that both stdin and stdout have been redirected to a non-terminal;
       this feature is useful in facilitating system administration scripts
       that allow users to execute jobs. [At some point we may wish to
       change the default to -p as the system automatically detects when -n
       will suffice.] Simple, non-interactive jobs such as hostname do not
       need the less efficient pty/tty mechanism and so should be run with
       the -n option. The -n option is the default when queue is invoked in
       rsh compatibility mode with qsh.
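
       For example, even though -p is requested below, redirecting both
       standard input and standard output causes queue to fall back to the
       more efficient socket mechanism:

         queue -i -w -p -- hostname </dev/null >hostname.out 2>&1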

       The -- with queue specifies `end of queue options'; everything beyond
       this point is interpreted as the command, or arguments to be given to
       the command. Consequently, user options (i.e., arguments supplied when
       invoking queue through a script front end) may be placed here:


         #!/bin/sh
         exec queue -i -w -p -- big_job "$@"

       or

         #!/bin/sh
         exec queue -q -w -p -d big_job_queue -- big_job "$@"

       for example. This places queue in immediate mode following
       instructions in the now spool subdirectory (first example), or in
       batch-processing mode in the big_job_queue spool subdirectory (second
       example), provided that directory has been created by the
       administrator. In both cases, stubs are being used, which will not
       terminate until the big_job process terminates on the remote end.

       In both cases, pty/ttys will be allocated, unless the user redirects
       both the standard input and standard output of the simple invoking
       scripts. Invoking queue through these scripts has the additional
       advantage that the process name will be that of the script, clarifying
       what the process is. For example, the script might be called big_job
       or big_job.remote, causing queue to appear this way in the user's
       process list.
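
       A user would then invoke the wrapper exactly like the underlying
       program, for example (input.dat is a placeholder argument):

         big_job.remote input.dat &

       and the job is farmed out transparently, showing up as big_job.remote
       in the local process listing.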

       queue can be used for batch processing by using the -q -r -n options,
       e.g.,

         #!/bin/sh
         exec queue -q -r -n -d big_job -- big_job "$@"

       would run big_job in batch mode. The -q and -d big_job options force
       Queue to follow instructions in the big_job/profile file under Queue's
       spool directory and to wait for the next available job slot. -r
       activates batch-processing mode, causing Queue to exit immediately and
       return results (including stdout and stderr output) via email.

       The final option, -n, disables allocation of a pty on the remote end;
       it is unnecessary in this case (as batch mode disables ptys anyway)
       but is shown here to demonstrate how it might be used in a -i -w -n or
       -q -w -n invocation.


       Under /usr/spool/queue you may create several directories for batch
       jobs, each identified with the class of the batch job (e.g., big_job
       or small_job). You may then place restrictions on that class, such as
       the maximum number of jobs running or the total CPU time, by placing a
       profile file like this one in that directory.

       However, the now queue is mandatory; it is the directory used by the
       -i (immediate) mode of queue to launch jobs over the network
       immediately rather than as batch jobs.
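
       As a sketch of the administrative steps (the paths follow the
       /usr/spool/queue example above, and profile.sample stands for the
       sample profile file shipped with the distribution):

         mkdir /usr/spool/queue/big_job
         cp profile.sample /usr/spool/queue/big_job/profile
         # edit the copied profile to set limits for the big_job class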

       Specify that this queue is turned on:

         exec on

       The next two lines in profile may be set to an email address rather
       than a file; the leading / identifies them as file logs. Files not
       beginning with cf, of, or ef are ignored by the queued:

         mail        /usr/local/com/queue/now/mail_log
         supervisor  /usr/local/com/queue/now/mail_log2

       Note that /usr/local/com/queue is our spool directory, and now is the
       job batch directory for the special now queue (run via the -i or
       immediate-mode flag to the queue executable), so these files may
       reside in the job batch directories.

       The pfactor command is used to control the likelihood of a job being
       executed on a given machine. Typically, this is done in conjunction
       with the host command, which specifies that the option on the rest of
       the line be honored on that host only.

       In this example, pfactor is set to the relative MIPS of each machine:

         host fast_host pfactor 100
         host slow_host pfactor 50

       where fast_host and slow_host are the hostnames of the respective
       machines.

       This is useful for controlling load balancing. Each queue on each
       machine reports back an `apparent load average' calculated as follows:

         1-min load average /
             ((max(0, vmaxexec - jobs running in this queue) + 1) * pfactor)

       The machine with the lowest apparent load average for that queue is
       the one most likely to get the job.

       Consequently, a more powerful pfactor proportionally reduces the load
       average that is reported back for this queue, indicating a more
       powerful system.

       Vmaxexec is the ``apparent maximum'' number of jobs allowed to execute
       in this queue; it is simply equal to maxexec if it was not set.  The
       default value of these variables is a large value treated by the
       system as infinity.

         host fast_host vmaxexec 2
         host slow_host vmaxexec 1 maxexec 3

       The purpose of vmaxexec is to make the system appear fully loaded at
       some point before the maximum number of jobs is actually running, so
       that the likelihood of the machine being used tapers off sharply after
       vmaxexec slots are filled.

       Below vmaxexec jobs, the system aggressively discriminates against
       hosts already running jobs in this queue.

       In job queues running above vmaxexec jobs, hosts appear more equal to
       the system, and only the load average and pfactor are used to assign
       jobs. The theory here is that above vmaxexec jobs the hosts are fully
       saturated, and the load average is a better indicator than the simple
       number of jobs running in a job queue of where to send the next job.

       Thus, under lightly-loaded situations, the system routes jobs around
       hosts already running jobs in this job queue. In more heavily loaded
       situations, load averages and pfactors are used in determining where
       to run jobs.
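
       As an illustrative calculation using the sample values above (the load
       averages are invented): suppose fast_host has pfactor 100, vmaxexec 2,
       no jobs from this queue running, and a 1-minute load average of 1.0,
       while slow_host has pfactor 50, vmaxexec 1, one job already running,
       and a load average of 0.5. Then

         fast_host: 1.0 / ((max(0, 2 - 0) + 1) * 100) = 1.0 / 300 ~ 0.0033
         slow_host: 0.5 / ((max(0, 1 - 1) + 1) *  50) = 0.5 /  50 = 0.0100

       so fast_host reports the lower apparent load average and is the more
       likely destination for the next job.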

       Additional options in profile:


       exec   on, off, or drain. Drain drains running jobs.


       minfree
              free disk space on the specified device must be at least this
              amount.


       maxexec
              maximum number of jobs allowed to run in this queue.


       loadsched
              1-minute load average must be below this value to launch new
              jobs.


       loadstop
              if the 1-minute load average exceeds this, jobs in this queue
              are suspended until it drops again.


       timesched
              Jobs are only scheduled during these times.


       timestop
              Running jobs will be suspended outside of these times.


       nice   Running jobs are run at least at this nice value.


       rlimitcpu
              maximum CPU time used by a job in this queue.


       rlimitdata
              maximum data memory size of a job.


       rlimitstack
              maximum stack size


       rlimitfsize
              maximum size of files a job may create.


       rlimitrss
              maximum resident set size.


       rlimitcore
              maximum size of core dump


       These options, if present, will override the user's values (as passed
       along by queue) for these limits only if they are lower than what the
       user has set (or higher, in the case of nice).
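
       Putting several of these directives together, a profile for a
       hypothetical big_job queue might look roughly like this (the values
       are placeholders, and the exact argument syntax of each directive is
       documented in the sample profile file and the Info documentation):

         exec on
         mail /usr/local/com/queue/big_job/mail_log
         host fast_host pfactor 100
         host slow_host pfactor 50
         host fast_host vmaxexec 2
         host slow_host vmaxexec 1
         maxexec 3
         loadstop 5
         nice 10
         rlimitcpu 3600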

FILES
       These are the default file paths. PREFIX is typically '/usr/local'.

       PREFIX/share/qhostsfile        Host Access Control List File
       PREFIX/com/queue               spool directory
       PREFIX/com/queue/now           spool directory for immediate execution
       PREFIX/com/queue/wait          spool directory for the '-q' shorthand
       SPOOLDIR/profile               control file for the SPOOLDIR job queue
       PREFIX/com/queue/now/profile   control file for immediate jobs
       PREFIX/var/queue_pid_hostname  temporary file

COPYING
       Copyright 1998, 1999 Werner G. Krebs

       Permission is granted to make and distribute verbatim  copies  of  this
       manpage  provided  the  copyright notice and this permission notice are
       preserved on all copies.

BUGS
       Bug reports to <bug-queue@gnu.org>

AUTHORS
       Werner G. Krebs is the primary author of GNU Queue.

       See the Acknowledgements file for a complete list of contributors.



GNU Queue Version 1.20.1-pre3                                         queue(1)
