OAR capabilities

OAR is an open-source batch scheduler that provides simple and flexible management of a cluster.

It manages cluster resources like a traditional batch scheduler (such as PBS/Torque, LSF, SGE or SLURM). In other words, it does not execute your job on the resources itself but manages them (reservation, access granting) so that you can connect to these resources and use them.

Its design is based on high-level tools:

  • a relational database engine (MySQL or PostgreSQL),
  • the Perl scripting language,
  • a confinement mechanism based on the Linux cgroup features,
  • Taktuk, a scalable remote execution tool.

It is flexible enough to be suitable both for production clusters and for research experiments. It currently manages more than 5,000 nodes and has executed more than 10 million jobs.

OAR advantages:

  • No dependency on specific computing libraries such as MPI; all sorts of parallel user applications are supported.
  • cgroup integration (Linux kernel 2.6 and later), which confines jobs to their assigned resources (also useful to completely clean up a job, even a parallel one).
  • Can use the taktuk command (large-scale remote execution and deployment): http://taktuk.gforge.inria.fr/.
  • Hierarchical resource requests (to handle heterogeneous clusters).
  • Gantt scheduling (so you can visualize the scheduler's internal decisions).
  • Full or partial time-sharing.
  • Checkpoint/resubmit support.
  • Licence server management support.
  • Best-effort jobs: if another job requests the same resources, the best-effort job is deleted automatically (useful to run programs like SETI@home).
  • Environment deployment support (Kadeploy): http://kadeploy.imag.fr/.
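To make features such as hierarchical resource requests, reservations and best-effort jobs concrete, here are a few representative oarsub invocations (a sketch: resource property names, walltimes and dates are illustrative and site-specific, so check your local documentation):

```shell
# Batch job on 2 nodes for one hour
oarsub -l nodes=2,walltime=1:00:00 ./my_job.sh

# Hierarchical resource request: 4 cores on each of 2 nodes
oarsub -l /nodes=2/core=4 ./my_job.sh

# Best-effort job: deleted automatically if another job wants the resources
oarsub -t besteffort ./scavenger.sh

# Advance reservation of 8 nodes at a given date
oarsub -r "2013-11-06 20:00:00" -l nodes=8
```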

Installing the OAR batch system

Security aspects in OAR

Like any other batch scheduler, OAR must take on the identity of the users who submit jobs.

In OAR, security and user switching are managed by the “oardodo” command, a setuid binary executable only by root and members of the oar group. It is used to launch commands and scripts with the privileges of a particular user. When “oardodo” is called, it checks the value of the OARDO_BECOME_USER environment variable:

  • If this variable is empty, “oardodo” executes the command with superuser (root) privileges.
  • Otherwise, the variable contains the name of the user that will be used to execute the command.
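The rule above can be sketched as follows (an illustration only: the real oardodo is a compiled setuid binary that drops privileges and execs the command, not a shell function):

```shell
# Sketch: which identity oardodo would adopt, given OARDO_BECOME_USER.
target_user() {
    if [ -z "$OARDO_BECOME_USER" ]; then
        echo root                     # empty variable: keep superuser privileges
    else
        echo "$OARDO_BECOME_USER"     # otherwise: become that user
    fi
}

OARDO_BECOME_USER=""; target_user     # prints: root
OARDO_BECOME_USER=alice; target_user  # prints: alice
```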

Here are the scripts/modules where “oardodo” is called and which user is used during this call:

  • oarsub: this script is used to submit jobs and reservations.
    • reading the user script
    • connecting to the job and launching the user's remote shell
    • SSH job key management
  For all these functions, OARDO_BECOME_USER is set to the user who submitted
  the job.
  • pingchecker: this module is used to check resource health. Here, the user is root.
  • OAR::Modules::Judas: this module is used for logging and notifications:
    • user notification: sending an email or executing a command as the user.
  • oarexec: executed on the first reserved node, oarexec runs the job prologue and initiates the job.
    • the “clean” method kills every oarsub connection process in superuser mode
    • the “kill_children” method kills every child of the process in superuser mode
    • a passive job is executed in user mode
    • the user's shell is obtained in user mode
    • checkpointing is done in superuser mode
  • job_resource_manager: a Perl script that the OAR server deploys on the nodes to manage cgroups (cpusets), users, job keys, etc.
    • cpuset creation and cleanup are executed in superuser mode
  • oarsh_shell: the shell program used with the oarsh script. It adds its own process to the job's cgroup (cpuset) and launches the user's shell or script.
    • cpuset filling, “nice” and display management are executed as root.
    • TTY login is executed as user.
  • oarsh: OAR's ssh wrapper used to connect from node to node. It carries all the context variables useful for this connection.
    • display management and connection with a user job key file are executed as user.
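The cpuset creation and filling steps performed by job_resource_manager and oarsh_shell boil down to writing into the cpuset filesystem. A minimal sketch, assuming the cpuset hierarchy is mounted under $CGROOT (the mount point, job directory name and core list below are illustrative; the real logic lives in the job_resource_manager Perl script and runs as root):

```shell
# Minimal sketch of cpuset-based job confinement.
# On a real node CGROOT would be the cpuset mount point, e.g. /dev/cpuset;
# a temporary directory is used here so the sketch runs anywhere.
CGROOT=${CGROOT:-$(mktemp -d)}
JOB=oar_alice_1234                 # one directory per job (name is illustrative)

mkdir -p "$CGROOT/$JOB"            # create the job's cpuset
echo "0-3" > "$CGROOT/$JOB/cpus"   # cores granted to the job
echo "0"   > "$CGROOT/$JOB/mems"   # memory node(s) granted to the job
echo "$$"  > "$CGROOT/$JOB/tasks"  # confine the current process, as oarsh_shell does
```

Cleaning up a job then amounts to killing every task listed in the job's tasks file and removing the directory, which is why cgroups make it possible to clean a parallel job completely.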

playground/documentation_admin_2.5.3.txt · Last modified: 2013/11/05 13:44 by capitn