OAR offers a powerful mechanism, called “admission rules”, that lets you customize how jobs enter queues (or are rejected from them). An admission rule is a small Perl script that you insert into the admission_rules SQL table of the OAR database. This page collects some advanced and useful examples.
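Before loading a rule into the database, it can help to try it on its own. The following is a minimal sketch, assuming that a rule is plain Perl evaluated with job variables such as $queue_name and $jobproperties in scope. The stub values and the $rule variable are illustrative only, and OAR's actual evaluation code may differ.

```perl
use strict;
use warnings;

# Stubs for variables that OAR is assumed to expose to a rule.
my $queue_name    = "foehn";
my $jobproperties = "";

# The rule under test, written as it would be stored in the database.
my $rule = <<'END_RULE';
my $cluster = $queue_name;
if ($jobproperties ne "") {
    $jobproperties = "($jobproperties) AND cluster = '" . $cluster . "'";
} else {
    $jobproperties = "cluster = '" . $cluster . "'";
}
END_RULE

# A string eval sees the lexical stubs above; $@ holds the message if the rule dies.
eval $rule;
if ($@) {
    print "Job rejected: $@";
} else {
    print "Job accepted, properties: $jobproperties\n";
}
```

A rule that calls die() ends up in the "rejected" branch, which makes it easy to check both outcomes before touching the server.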
# Title : Cluster routing
# Description : Send to the corresponding cluster
my $cluster=$queue_name;
if ($jobproperties ne ""){
    $jobproperties = "($jobproperties) AND cluster = '".$cluster."'";
}
else{
    $jobproperties = "cluster = '".$cluster."'";
}
# Title : Cluster routing
# Description : Send to the corresponding cluster and queue depending on the submission host
use Sys::Hostname;
# Keep only the short host name (the part before the first dot)
my @h = split('\\.',hostname());
my $cluster;
if ($h[0] eq "service0") {
    $cluster="nanostar";
    print "[ADMISSION RULE] Routing to NANOSTAR cluster\n";
}else {
    $cluster="foehn";
    print "[ADMISSION RULE] Routing to FOEHN cluster\n";
}
if ($queue_name eq "default") {
    $queue_name=$cluster;
}
if ($jobproperties ne ""){
    $jobproperties = "($jobproperties) AND cluster = '".$cluster."'";
}
else{
    $jobproperties = "cluster = '".$cluster."'";
}
Description : Users that are not members of a given group are automatically directed to the besteffort queue
my $GROUP="nanostar";
# List the groups of the user one per line and look for $GROUP as a whole word
system("id -Gn $user | sed 's/ /\\n/g' | grep -w $GROUP >/dev/null");
if ($? != 0){
    print("[ADMISSION RULE] !!!! WARNING !!!\n");
    print("[ADMISSION RULE] !!!! AS AN EXTERNAL USER, YOU HAVE BEEN AUTOMATICALLY !!!\n");
    print("[ADMISSION RULE] !!!! REDIRECTED TO THE BEST-EFFORT QUEUE !!!\n");
    print("[ADMISSION RULE] !!!! YOUR JOB MAY BE KILLED WITHOUT NOTICE !!!\n");
    $queue_name = "besteffort";
    push (@{$type_list},"besteffort");
    if ($jobproperties ne ""){
        $jobproperties = "($jobproperties) AND besteffort = 'YES'";
    }else{
        $jobproperties = "besteffort = 'YES'";
    }
    $reservationField="None";
}
Description : Creates a mathlab job type that automatically assigns a mathlab licence
if (grep(/^mathlab$/, @{$type_list})){
    print "[LICENCE ADMISSION RULE] Adding a mathlab licence to the query\n";
    # Append one licence resource of type 'mathlab' to every moldable request
    foreach my $mold (@{$ref_resource_list}){
        push(@{$mold->[0]},
            {'resources' => [{'resource' => 'licence','value' => '1'}],
             'property' => "type = 'mathlab'"}
        );
    }
}
Description : By default, an admission rule limits the walltime of interactive jobs to 2 hours. This modified rule also sets a maximum walltime for passive jobs.
my $max_interactive_walltime = OAR::IO::sql_to_duration("12:00:00");
# 7 days = 168 hours
my $max_batch_walltime = OAR::IO::sql_to_duration("168:00:00");
foreach my $mold (@{$ref_resource_list}){
    if (defined($mold->[1])){
        if (($jobType eq "INTERACTIVE") and ($reservationField eq "None") and ($max_interactive_walltime < $mold->[1])){
            print("[ADMISSION RULE] Walltime too big for an INTERACTIVE job so it is set to $max_interactive_walltime.\n");
            $mold->[1] = $max_interactive_walltime;
        }elsif ($max_batch_walltime < $mold->[1]){
            print("[ADMISSION RULE] Walltime too big for a BATCH job so it is set to $max_batch_walltime.\n");
            $mold->[1] = $max_batch_walltime;
        }
    }
}
Thanks to Nicolas Capit
Description : Rejects jobs asking for more than a cpu*walltime limit. Current limit is set to 384 hours (16 days of cpu-time)
Note: This rule is written for an SMP host on which we only have a “pnode” property (physical node) and a “cpu” property (only one core per CPU). It should be adapted for a more conventional distributed-memory cluster made of simple nodes with several cores per CPU.
# Note: recent OAR versions provide this function as OAR::IO::sql_to_duration
my $cpu_walltime=iolib::sql_to_duration("384:00:00");
my $msg="";
foreach my $mold (@{$ref_resource_list}){
    foreach my $r (@{$mold->[0]}){
        my $cpus=0;
        my $pnodes=0;
        # Catch the cpu and pnode resources
        foreach my $resource (@{$r->{resources}}) {
            if ($resource->{resource} eq "cpu") {
                $cpus=$resource->{value};
            }
            if ($resource->{resource} eq "pnode") {
                $pnodes=$resource->{value};
            }
        }
        # Calculate the number of cpus
        if ($pnodes == 0 && $cpus == 0) {
            $cpus=1;
        }
        if ($pnodes != 0) {
            # A pnode requested without a cpu count is taken as 2 cpus
            if ($cpus == 0) { $cpus=$pnodes*2; }
            else { $cpus=$pnodes*$cpus; }
        }
        # Reject if walltime*cpus is too big
        if ($cpus * $mold->[1] > $cpu_walltime) {
            $msg="\n[WALLTIME TOO BIG] The maximum allowed walltime for $cpus cpus is ";
            $msg.= $cpu_walltime / $cpus / 3600;
            $msg.= " hours.\n";
            die($msg);
        }
    }
}
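To make the cpu*walltime budget concrete, here is a small worked sketch of the arithmetic used by the rule above. The helper hms_to_seconds is only a stand-in for OAR's sql_to_duration (assumed here to turn "HH:MM:SS" into seconds), and max_walltime_hours is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in for sql_to_duration: "HH:MM:SS" => seconds (assumption).
sub hms_to_seconds {
    my ($h, $m, $s) = split(/:/, $_[0]);
    return $h * 3600 + $m * 60 + $s;
}

# Longest walltime, in hours, that fits the budget for a given cpu count.
sub max_walltime_hours {
    my ($budget_seconds, $cpus) = @_;
    return $budget_seconds / $cpus / 3600;
}

my $budget = hms_to_seconds("384:00:00");

foreach my $cpus (1, 16, 64) {
    printf("%d cpus: at most %g hours\n", $cpus, max_walltime_hours($budget, $cpus));
}
```

With the 384-hour budget, a job on 16 cpus may therefore ask for up to 24 hours, and a job on 64 cpus for up to 6 hours.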
Description : Limits the maximum number of simultaneous jobs allowed for each user on the cluster. The default is 50 jobs per user.
Users allowed an unlimited number of jobs can be listed in the ~oar/unlimited_reservation.users file (on the oar-server).
You can also change the limit by putting your own value in ~oar/max_jobs (on the oar-server).
Note : Array jobs are also limited by this rule.
# Title : Limit the number of jobs per user to max_nb_jobs
# Description : If the user is not listed in the unlimited users file, check that the current number of jobs stays under $max_nb_jobs, which is defined in ~oar/max_jobs or is 50 by default
my $unlimited=0;
if (open(FILE, "< $ENV{HOME}/unlimited_reservation.users")) {
    while (<FILE>) {
        # One user name per line, surrounding whitespace ignored
        if (m/^\s*\Q$user\E\s*$/) {
            $unlimited=1;
        }
    }
    close(FILE);
}
if ($unlimited == 0) {
    my $max_nb_jobs = 50;
    if (open(FILE, "< $ENV{HOME}/max_jobs")) {
        while (<FILE>) {
            chomp;
            $max_nb_jobs=$_;
        }
        close(FILE);
    }
    # Count the jobs of the user that are not finished yet
    my $nb_jobs = $dbh->selectrow_array(
        qq{ SELECT COUNT(job_id)
            FROM jobs
            WHERE job_user = ?
              AND (state = 'Waiting'
                   OR state = 'Hold'
                   OR state = 'toLaunch'
                   OR state = 'toAckReservation'
                   OR state = 'Launching'
                   OR state = 'Running'
                   OR state = 'Suspended'
                   OR state = 'Resuming'
                   OR state = 'Finishing')
        }, undef, $user);
    if (($nb_jobs + $array_job_nb) > $max_nb_jobs) {
        die("[ADMISSION RULE] Error: you cannot have more than $max_nb_jobs submitted jobs at the same time.\n");
    }
}
If you want to automatically assign a project to user submissions (replacing the --project option of oarsub), simply set the $project variable to the desired value inside an admission rule.
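A minimal sketch of such a rule, assuming a site-specific mapping from user names to project names. The %project_of hash, the user names and the project names are hypothetical. The two stub declarations only stand in for variables that OAR provides, and must be left out of a real rule.

```perl
use strict;
use warnings;

# Stubs for variables provided by OAR (standalone testing only).
my $user    = "toto";
my $project = "default";

# Hypothetical mapping: user name => project name.
my %project_of = (
    "toto" => "climate",
    "titi" => "genomics",
);

# Override the project only for users listed in the mapping.
if (exists $project_of{$user}) {
    $project = $project_of{$user};
    print "[ADMISSION RULE] Project set to $project\n";
}
```

Users missing from the mapping keep whatever project they submitted with.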
For example, if you have defined a “model” property with the “oarproperty” command, you can enforce constraints on that property for some users.
# Title : Restricts the use of resources for some users
# Description : remember to update the user list in this admission rule
my %allowed_users = (
    "toto" => 1,
    "titi" => 1,
    "tutu" => 0
);
# Users that are unknown or set to 0 are kept away from the bullx nodes
if (!defined($allowed_users{$user}) or ($allowed_users{$user} == 0)){
    if ($jobproperties ne ""){
        $jobproperties = "($jobproperties) AND model != 'bullx'";
    }else{
        $jobproperties = "model != 'bullx'";
    }
    print("[ADMISSION RULE] Automatically add the constraint to not go on the bullx nodes\n");
}
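Many rules on this page repeat the same "append a constraint to $jobproperties" pattern. The sketch below factors it into a helper. The name add_constraint is hypothetical and not part of OAR, and the stub variable stands in for the one OAR provides.

```perl
use strict;
use warnings;

# Hypothetical helper: AND a new SQL constraint onto a property string.
sub add_constraint {
    my ($properties, $constraint) = @_;
    return $constraint if !defined($properties) || $properties eq "";
    return "($properties) AND $constraint";
}

# Stub for the variable provided by OAR.
my $jobproperties = "";

$jobproperties = add_constraint($jobproperties, "model != 'bullx'");
$jobproperties = add_constraint($jobproperties, "cluster = 'foehn'");
print "$jobproperties\n";
```

Wrapping the existing expression in parentheses before adding AND keeps any OR in the user's own properties from changing the meaning of the added constraint.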
# Title : Limit the number of interactive jobs per user
# Description : Limit the number of interactive jobs per user
my $max_interactive_jobs = 2;
if (($jobType eq "INTERACTIVE") and ($reservationField eq "None")){
    # Count with selectrow_array: do() is not meant for SELECT statements
    my $nb_jobs = $dbh->selectrow_array(
        qq{ SELECT COUNT(job_id)
            FROM jobs
            WHERE job_user = ?
              AND reservation = 'None'
              AND job_type = 'INTERACTIVE'
              AND (state = 'Waiting'
                   OR state = 'Hold'
                   OR state = 'toLaunch'
                   OR state = 'toAckReservation'
                   OR state = 'Launching'
                   OR state = 'Running'
                   OR state = 'Suspended'
                   OR state = 'Resuming'
                   OR state = 'Finishing')
        }, undef, $user);
    if ($nb_jobs >= $max_interactive_jobs){
        die("You cannot have more than $max_interactive_jobs interactive jobs at a time.\n");
    }
}
# Title : Infiniband user restrictions
# Description : set the ib property restriction depending on the groups of the user
if ((! grep(/^besteffort$/, @{$type_list})) and ($user ne "serviware")){
    print("[ADMISSION RULE] Check on which Infiniband network you can go on...\n");
    # Primary group of the user
    my ($user_name,$user_passwd,$user_uid,$user_gid,$user_quota,$user_comment,$user_gcos,$user_dir,$user_shell,$user_expire) = getpwnam($user);
    my ($primary_group,$primary_passwd,$primary_gid,$primary_members) = getgrgid($user_gid);
    # Members of each group that grants access to an Infiniband network
    my ($seiscope_name,$seiscope_passwd,$seiscope_gid,$seiscope_members) = getgrnam("seiscope");
    my %seiscope_hash = map { $_ => 1 } split(/\s+/,$seiscope_members);
    my ($globalseis_name,$globalseis_passwd,$globalseis_gid,$globalseis_members) = getgrnam("globalseis");
    my %globalseis_hash = map { $_ => 1 } split(/\s+/,$globalseis_members);
    my ($tohoku_name,$tohoku_passwd,$tohoku_gid,$tohoku_members) = getgrnam("tohoku");
    my %tohoku_hash = map { $_ => 1 } split(/\s+/,$tohoku_members);
    # Everybody can use the nodes without Infiniband
    my $sql_str = "ib = 'none'";
    if (($primary_group eq "seiscope") or (defined($seiscope_hash{$user}))){
        print("[ADMISSION RULE] You are in the group seiscope so you can go on the QDR Infiniband nodes\n");
        $sql_str .= " OR ib = 'QDR'";
    }
    if (($primary_group eq "globalseis") or (defined($globalseis_hash{$user})) or ($primary_group eq "tohoku") or (defined($tohoku_hash{$user}))){
        print("[ADMISSION RULE] You are in the group globalseis or tohoku so you can go on the DDR Infiniband nodes\n");
        $sql_str .= " OR ib = 'DDR'";
    }
    if ($jobproperties ne ""){
        $jobproperties = "($jobproperties) AND ($sql_str)";
    }else{
        $jobproperties = "$sql_str";
    }
}
When you experiment with admission rules, you can dump data structures with YAML to get a readable view of, for example, the submission request:
use YAML;
print "[DEBUG] Output of the resources query data structure:\n";
print YAML::Dump(@{$ref_resource_list});
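The layout of $ref_resource_list can be inferred from the rules on this page: a list of moldable requests, each holding a list of resource groups and a walltime in seconds. The sketch below builds a hand-written stand-in with that assumed layout and dumps it with the core Data::Dumper module instead of YAML, so that it runs without extra modules. The values are illustrative.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hand-written stand-in for $ref_resource_list (assumed layout).
my $ref_resource_list = [
    [
        [
            {
                resources => [
                    { resource => "nodes", value => 2 },
                    { resource => "core",  value => 4 },
                ],
                property => "",
            },
        ],
        7200,    # walltime in seconds
    ],
];

$Data::Dumper::Indent   = 1;
$Data::Dumper::Sortkeys = 1;
print "[DEBUG] Output of the resources query data structure:\n";
print Dumper($ref_resource_list);

# Walk the structure the same way the rules on this page do.
my @summary;
foreach my $mold (@{$ref_resource_list}) {
    foreach my $r (@{$mold->[0]}) {
        push @summary,
            join("/", map { "$_->{resource}=$_->{value}" } @{$r->{resources}});
    }
}
print "Requested: @summary, walltime $ref_resource_list->[0][1] s\n";
```

Here $mold->[0] is the list of resource groups and $mold->[1] the walltime, which is how the other rules on this page address them.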
See the NUMA topology optimization usecase
Description: The following set of admission rules routes jobs into 3 queues with different priorities. Restrictions on the number of cores are set up for each queue.
Queue creation:
oarnotify --add_queue short,9,oar_sched_gantt_with_timesharing_and_fairsharing
oarnotify --add_queue medium,5,oar_sched_gantt_with_timesharing_and_fairsharing
oarnotify --add_queue long,3,oar_sched_gantt_with_timesharing_and_fairsharing
Rules:
------
Rule : 20
# Title: Automatic routing into the short queue
# Description: Short jobs are automatically routed into the short queue
my $max_walltime="6:00:00";
my $walltime=0;
# Search for the max walltime of the moldable jobs
foreach my $mold (@{$ref_resource_list}){
    foreach my $r (@{$mold->[0]}){
        if ($mold->[1] > $walltime) {
            $walltime = $mold->[1];
        }
    }
}
# Put into the short queue if the job is short
if ($walltime <= OAR::IO::sql_to_duration($max_walltime)
    && !(grep(/^besteffort$/, @{$type_list}))) {
    print "[SHORT QUEUE] This job is routed into the short queue\n";
    $queue_name="short";
}
------
Rule : 21
# Title: Automatic routing into the medium queue
# Description: Medium jobs are automatically routed into the medium queue
my $max_walltime="120:00:00";
my $min_walltime="6:00:00";
my $walltime=0;
# Search for the max walltime of the moldable jobs
foreach my $mold (@{$ref_resource_list}){
    foreach my $r (@{$mold->[0]}){
        if ($mold->[1] > $walltime) {
            $walltime = $mold->[1];
        }
    }
}
# Put into the medium queue if the job is medium
if ($walltime <= OAR::IO::sql_to_duration($max_walltime)
    && $walltime > OAR::IO::sql_to_duration($min_walltime)
    && !(grep(/^besteffort$/, @{$type_list}))) {
    print "[MEDIUM QUEUE] This job is routed into the medium queue\n";
    $queue_name="medium";
}
------
Rule : 22
# Title: Automatic routing into the long queue
# Description: Long jobs are automatically routed into the long queue
my $max_walltime="360:00:00";
my $min_walltime="120:00:00";
my $walltime=0;
# Search for the max walltime of the moldable jobs
foreach my $mold (@{$ref_resource_list}){
    foreach my $r (@{$mold->[0]}){
        if ($mold->[1] > $walltime) {
            $walltime = $mold->[1];
        }
    }
}
# Put into the long queue if the job is long
if ($walltime > OAR::IO::sql_to_duration($min_walltime)
    && !(grep(/^besteffort$/, @{$type_list}))) {
    print "[LONG QUEUE] This job is routed into the long queue\n";
    $queue_name="long";
}
# Limit walltime of the "long" queue
if ($queue_name eq "long"){
    my $min_walltime="120:00:00";
    my $max_walltime="360:00:00";
    foreach my $mold (@{$ref_resource_list}){
        foreach my $r (@{$mold->[0]}){
            if ($mold->[1] > OAR::IO::sql_to_duration($max_walltime)) {
                print "\n[WALLTIME TOO BIG] The maximum allowed walltime for the long queue is $max_walltime\n";
                exit(1);
            }
            if ($mold->[1] <= OAR::IO::sql_to_duration($min_walltime)) {
                print "\n[WALLTIME TOO SHORT] The minimum allowed walltime for the long queue is $min_walltime\n";
                exit(1);
            }
        }
    }
}
------
Rule : 23
# Title : Core number restrictions
# Description : Count the number of cores requested and reject if the queue does not allow this
# Check the resources
my $resources_def=$ref_resource_list->[0];
my $n_core_per_cpus=6;
my $n_cpu_per_node=2;
my $core=0;
my $cpu=0;
my $node=0;
foreach my $r (@{$resources_def->[0]}) {
    foreach my $resource (@{$r->{resources}}) {
        if ($resource->{resource} eq "core") {$core=$resource->{value};}
        if ($resource->{resource} eq "cpu") {$cpu=$resource->{value};}
        if ($resource->{resource} eq "nodes") {$node=$resource->{value};}
        if ($resource->{resource} eq "network_address") {$node=$resource->{value};}
    }
}
# Now, calculate the number of total cores
my $n_cores=0;
if ($node == 0 && $cpu != 0 && $core == 0) {
    $n_cores = $cpu*$n_core_per_cpus;
}elsif ($node != 0 && $cpu == 0 && $core == 0) {
    $n_cores = $node*$n_cpu_per_node*$n_core_per_cpus;
}elsif ($node != 0 && $cpu == 0 && $core != 0) {
    $n_cores = $node*$core;
}elsif ($node == 0 && $cpu != 0 && $core != 0) {
    $n_cores = $cpu*$core;
}elsif ($node == 0 && $cpu == 0 && $core != 0) {
    $n_cores = $core;
}
else {
    $n_cores = $node*$cpu*$core;
}
print "[CORES COUNT] You requested $n_cores cores\n";
# Now the restrictions:
my $short=132;  # 132 cores = 11 nodes
my $medium=132; # 132 cores = 11 nodes
my $long=132;   # 132 cores = 11 nodes
if ("$queue_name" eq "long" && $n_cores > $long) {
    print "\n[CORES COUNT] Too many cores for this queue (max is $long)!\n";
    exit(1);
}
if ("$queue_name" eq "medium" && $n_cores > $medium) {
    print "\n[CORES COUNT] Too many cores for this queue (max is $medium)!\n";
    exit(1);
}
if ("$queue_name" eq "short" && $n_cores > $short) {
    print "\n[CORES COUNT] Too many cores for this queue (max is $short)!\n";
    exit(1);
}
------
Rule : 24
# Title : Restriction on long or medium jobs
# Description : Long or medium jobs cannot run on resources that have the property long=NO
if ("$queue_name" eq "long" || "$queue_name" eq "medium"){
    if ($jobproperties ne ""){
        $jobproperties = "($jobproperties) AND long = 'YES'";
    }else{
        $jobproperties = "long = 'YES'";
    }
    print "[ADMISSION RULE] Adding long/medium jobs resources restrictions\n";
}
Description: Interactive jobs with no name are automatically named “interactive unnamed job”
if (($jobType eq "INTERACTIVE") and ($job_name eq "")){
    $job_name = 'interactive unnamed job';
}
Description: With this admission rule, the longer the walltime of a job, the fewer resources are available to it. This encourages users to shorten the walltime so that they can use more resources.
First, we define the max_walltime property and split the nodes into several sets with different max_walltime values.
oarproperty -a max_walltime
for node in <set 1>; do
    oarnodesetting -h $node -p max_walltime=<walltime of set 1>
done
for node in <set 2>; do
    oarnodesetting -h $node -p max_walltime=<walltime of set 2>
done
...
A node whose max_walltime property is lower than the walltime requested by a job cannot be used by that job. The rule converts the requested walltime to minutes, so max_walltime values are expressed in minutes. This is enforced by the following admission rule:
if ((($jobType eq "PASSIVE") or ($jobType eq "INTERACTIVE"))
    and !(grep(/^besteffort/, @{$type_list}))
    and ($queue_name ne "admin")) {
    foreach my $mold (@{$ref_resource_list}) {
        if (defined($mold->[1])) {
            foreach my $r (@{$mold->[0]}){
                my $resource = $r->{resources}[0]->{resource};
                if ($resource =~ /(network_address|host|cpu|core)/){
                    # Convert the requested walltime from seconds to minutes
                    my $max_walltime = $mold->[1] / 60;
                    my $current_properties = $r->{property};
                    if (defined($current_properties) and $current_properties ne ""){
                        $r->{property} = "($current_properties) AND max_walltime >= $max_walltime";
                    } else {
                        $r->{property} = "max_walltime >= $max_walltime";
                    }
                }
            }
        }
    }
}
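To see what the rule above attaches to a request, the following sketch builds the same property clause for two sample walltimes. The function name with_max_walltime and the "gpu" property are hypothetical. Walltimes are given in seconds, as in the rule.

```perl
use strict;
use warnings;

# Build the property clause that the rule attaches to a resource group.
sub with_max_walltime {
    my ($property, $walltime_seconds) = @_;
    my $minutes = $walltime_seconds / 60;
    my $clause  = "max_walltime >= $minutes";
    if (defined($property) && $property ne "") {
        return "($property) AND $clause";
    }
    return $clause;
}

# 2 hours, no existing property
print with_max_walltime("", 7200), "\n";
# 10 hours, combined with an existing (hypothetical) property
print with_max_walltime("gpu = 'YES'", 36000), "\n";
```

A 2-hour job can thus only land on nodes whose max_walltime is at least 120, and a 10-hour job needs nodes with at least 600.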