Database scheme
Note: all dates and durations are stored as integers. Dates are numbers of
seconds since the Unix epoch; durations are numbers of seconds.
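As a small illustration of this convention, the sketch below turns stored integers into readable values. The helper names are hypothetical and not part of OAR, and displaying dates in UTC is an assumption made for the example.

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds: int) -> str:
    """Render a stored date (seconds since the epoch) as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()

def duration_to_hms(seconds: int) -> str:
    """Render a stored duration (a number of seconds) as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

print(epoch_to_utc(0))        # 1970-01-01T00:00:00+00:00
print(duration_to_hms(7200))  # 2:00:00
```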
accounting
Fields | Types | Descriptions |
window_start | INT UNSIGNED | start date of the accounting interval |
window_stop | INT UNSIGNED | stop date of the accounting interval |
accounting_user | VARCHAR(20) | user name |
accounting_project | VARCHAR(255) | name of the related project |
queue_name | VARCHAR(100) | queue name |
consumption_type | ENUM("ASKED", "USED") | “ASKED” corresponds to the walltimes specified by the user. “USED” corresponds to the time actually used by the user. |
consumption | INT UNSIGNED | number of seconds used |
Primary key: | window_start, window_stop, accounting_user, queue_name,
accounting_project, consumption_type |
Index fields: | window_start, window_stop, accounting_user, queue_name,
accounting_project, consumption_type |
This table summarizes the consumption of each user on each queue. It speeds
up queries about user consumption and the generation of statistics.
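The kind of query this summary table is meant to speed up can be sketched as follows. This is only an illustration: an in-memory SQLite database stands in for the real one, the column types are simplified, the sample rows and window boundaries are invented, and the function name is hypothetical.

```python
import sqlite3

def used_seconds_per_user(conn, start, stop):
    """Sum the USED consumption of every user over windows inside [start, stop]."""
    rows = conn.execute(
        """
        SELECT accounting_user, SUM(consumption)
        FROM accounting
        WHERE consumption_type = 'USED'
          AND window_start >= ? AND window_stop <= ?
        GROUP BY accounting_user
        ORDER BY accounting_user
        """,
        (start, stop),
    ).fetchall()
    return dict(rows)

# Simplified stand-in for the accounting table, filled with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounting (
        window_start INTEGER, window_stop INTEGER,
        accounting_user TEXT, accounting_project TEXT, queue_name TEXT,
        consumption_type TEXT, consumption INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO accounting VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (0, 86399, "alice", "demo", "default", "USED", 3600),
        (0, 86399, "alice", "demo", "default", "ASKED", 7200),
        (86400, 172799, "alice", "demo", "default", "USED", 1800),
        (0, 86399, "bob", "demo", "default", "USED", 600),
    ],
)
print(used_seconds_per_user(conn, 0, 172799))  # {'alice': 5400, 'bob': 600}
```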
Data are inserted by the oaraccounting command (once a job has been processed,
its accounted field in the jobs table is set to “YES”). The table can therefore
be regenerated completely as follows:
- Delete all data from the table:
    DELETE FROM accounting;
- Set the accounted field of every row in the jobs table to “NO”:
    UPDATE jobs SET accounted = "NO";
- Run the oaraccounting command.
You can change the duration of each window: edit the OAR configuration
file and change the value of the ACCOUNTING_WINDOW tag.
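To give an intuition of what a windowed summary involves, the sketch below splits the interval of a job across fixed-size windows. It assumes windows aligned on multiples of the window size, which may not match how oaraccounting actually places window boundaries; the function name is hypothetical.

```python
def split_by_window(start: int, stop: int, window: int):
    """Yield (window_start, window_stop, seconds) for the interval [start, stop)."""
    current = start
    while current < stop:
        # Assumption: windows start at multiples of the window size.
        window_start = (current // window) * window
        window_stop = window_start + window
        chunk_end = min(stop, window_stop)
        yield window_start, window_stop, chunk_end - current
        current = chunk_end

# A job running from t=80000 to t=100000 with one-day windows (86400 s).
print(list(split_by_window(80000, 100000, 86400)))
# [(0, 86400, 6400), (86400, 172800, 13600)]
```

A larger window gives fewer rows per job and a coarser summary; a smaller one gives finer statistics at the cost of more rows.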
schema
Fields | Types | Descriptions |
version | VARCHAR(255) | database schema version number |
name | VARCHAR(255) | optional name |
This table stores the version of the database schema, so that the oar-database
command can automatically upgrade the schema from any version.
admission_rules
Fields | Types | Descriptions |
id | INT UNSIGNED | id number |
rule | TEXT | rule written in Perl applied when a job is going to be registered |
Primary key: | id |
Index fields: | None |
You can use these rules to change the values of some properties when a job is
submitted. Admission rules are executed in the order of the id field, and each
rule can set several variables. If one of them exits, the remaining rules are
not evaluated and oarsub returns an error.
The rules can be added with the following command:
Some examples are better than a long description:
Specify the default value for the queue parameter
if (not defined($queue_name)) {
    $queue_name="default";
}
Prevent users other than oar from using the admin queue
if (($queue_name eq "admin") && ($user ne "oar")) {
    die("[ADMISSION RULE] Only oar user can submit jobs in the admin queue\n");
}
Restrict the maximum walltime for interactive jobs
my $max_walltime = OAR::IO::sql_to_duration("12:00:00");
if ($jobType eq "INTERACTIVE"){
    foreach my $mold (@{$ref_resource_list}){
        if (
            (defined($mold->[1])) and
            ($max_walltime < $mold->[1])
        ){
            print("[ADMISSION RULE] Walltime too big for an INTERACTIVE job so it is set to $max_walltime.\n");
            $mold->[1] = $max_walltime;
        }
    }
}
Specify the default walltime
my $default_wall = OAR::IO::sql_to_duration("2:00:00");
foreach my $mold (@{$ref_resource_list}){
    if (!defined($mold->[1])){
        print("[ADMISSION RULE] Set default walltime to $default_wall.\n");
        $mold->[1] = $default_wall;
    }
}
How to perform actions if the user name is in a file
# Nothing happens if the file cannot be opened.
if (open(FILE, "<", "/tmp/users.txt")) {
    while (($queue_name ne "admin") and ($_ = <FILE>)){
        # \Q...\E makes the user name match literally
        if ($_ =~ m/^\s*\Q$user\E\s*$/m){
            print("[ADMISSION RULE] Change assigned queue into admin\n");
            $queue_name = "admin";
        }
    }
    close(FILE);
}
How to automatically add a job type depending on the walltime and an
estimate of the number of resources of the job
foreach my $e (estimate_job_nb_resources($dbh_ro, $ref_resource_list, $jobproperties)){
    #print("AREA: $e->{nbresources} x $e->{walltime} = ".$e->{nbresources} * $e->{walltime}."\n");
    if ($e->{nbresources} * $e->{walltime} > 24*3600*1){
        print("[ADMISSION RULE] Your job is of the 'big' type\n");
        push(@{$type_list},"big");
        last;
    }
}
You can print all the admission rules with:
event_logs
Fields | Types | Descriptions |
event_id | INT UNSIGNED | event identifier |
type | VARCHAR(50) | event type |
job_id | INT UNSIGNED | job related to the event |
date | INT UNSIGNED | event date |
description | VARCHAR(255) | textual description of the event |
to_check | ENUM('YES', 'NO') | specifies whether the NodeChangeState module must check this event to decide if some nodes should be Suspected |
Primary key: | event_id |
Index fields: | type, to_check |
The different event types are:
- “PING_CHECKER_NODE_SUSPECTED” : the system detected via the module “finaud”
that a node is not responding.
- “PROLOGUE_ERROR” : an error occurred during the execution of the job
prologue (exit code != 0).
- “EPILOGUE_ERROR” : an error occurred during the execution of the job
epilogue (exit code != 0).
- “CANNOT_CREATE_TMP_DIRECTORY” : OAR cannot create the directory where all
information files will be stored.
- “CAN_NOT_WRITE_NODE_FILE” : the system was not able to write the file that
should contain the node list on the first node (/tmp/OAR_job_id).
- “CAN_NOT_WRITE_PID_FILE” : the system was not able to write the file that
should contain the pid of the oarexec process on the first node
(/tmp/pid_of_oarexec_for_job_id).
- “USER_SHELL” : the system was not able to get information about the user
shell on the first node.
- “EXIT_VALUE_OAREXEC” : the oarexec process terminated with an unknown exit
code.
- “SEND_KILL_JOB” : signal that OAR has transmitted a kill signal to the
oarexec of the specified job.
- “LEON_KILL_BIPBIP_TIMEOUT” : the Leon module detected that something went
wrong while killing a job and therefore killed the local bipbip process.
- “EXTERMINATE_JOB” : the Leon module detected that something went wrong
while killing a job and therefore cleaned the database and terminated the job
artificially.
- “WORKING_DIRECTORY” : the directory from which the job was submitted does
not exist on the node assigned by the system.
- “OUTPUT_FILES” : OAR cannot write the output files (stdout and stderr) in
the working directory.
- “CANNOT_NOTIFY_OARSUB” : OAR cannot notify the oarsub process for an
interactive job (maybe the user has killed this process).
- “WALLTIME” : the job has reached its walltime.
- “SCHEDULER_REDUCE_NB_NODES_FOR_RESERVATION” : there are not enough nodes
for the reservation, so the scheduler does its best and gives fewer nodes
than the user wanted (this occurs when nodes become Suspected or Absent).
- “BESTEFFORT_KILL” : the job is of the type besteffort and was killed
because a normal job wanted the nodes.
- “FRAG_JOB_REQUEST” : someone wants to delete a job.
- “CHECKPOINT” : the checkpoint signal was sent to the job.
- “CHECKPOINT_ERROR” : OAR cannot send the signal to the job.
- “CHECKPOINT_SUCCESS” : system has sent the signal correctly.
- “SERVER_EPILOGUE_TIMEOUT” : the epilogue server script timed out.
- “SERVER_EPILOGUE_EXIT_CODE_ERROR” : the epilogue server script did not return 0.
- “SERVER_EPILOGUE_ERROR” : cannot find the epilogue server script file.
- “SERVER_PROLOGUE_TIMEOUT” : the prologue server script timed out.
- “SERVER_PROLOGUE_EXIT_CODE_ERROR” : the prologue server script did not return 0.
- “SERVER_PROLOGUE_ERROR” : cannot find the prologue server script file.
- “CPUSET_CLEAN_ERROR” : OAR cannot correctly clean the cpuset files of a job
on the remote node.
- “MAIL_NOTIFICATION_ERROR” : a mail cannot be sent.
- “USER_MAIL_NOTIFICATION” : user mail notification cannot be performed.
- “USER_EXEC_NOTIFICATION_ERROR” : user script execution notification cannot
be performed.
- “BIPBIP_BAD_JOBID” : error when retrieving information about a running job.
- “BIPBIP_CHALLENGE” : OAR is configured to detach jobs when they are launched
on compute nodes and the job returned a bad challenge number.
- “RESUBMIT_JOB_AUTOMATICALLY” : the job was automatically resubmitted.
- “REDUCE_RESERVATION_WALLTIME” : the reservation job was shrunk.
- “SSH_TRANSFER_TIMEOUT” : the OAR script for the node took too long to transfer.
- “BAD_HASHTABLE_DUMP” : OAR transferred a bad hashtable.
- “LAUNCHING_OAREXEC_TIMEOUT” : oarexec took too long to initialize.
- “RESERVATION_NO_NODE” : all nodes were detected as bad for the reservation
job.
event_log_hostnames
Fields | Types | Descriptions |
event_id | INT UNSIGNED | event identifier |
hostname | VARCHAR(255) | name of the node where the event occurred |
Primary key: | event_id |
Index fields: | hostname |
This table stores hostnames related to events like
“PING_CHECKER_NODE_SUSPECTED”.
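As an illustration of how the two tables fit together, the sketch below lists the hosts attached to events of a given type. An in-memory SQLite database with simplified column types stands in for the real one, the rows are invented, and the function name is hypothetical.

```python
import sqlite3

def hosts_for_event_type(conn, event_type="PING_CHECKER_NODE_SUSPECTED"):
    """Return (date, hostname) pairs for events of the given type, oldest first."""
    return conn.execute(
        """
        SELECT e.date, h.hostname
        FROM event_logs AS e
        JOIN event_log_hostnames AS h ON h.event_id = e.event_id
        WHERE e.type = ?
        ORDER BY e.date, h.hostname
        """,
        (event_type,),
    ).fetchall()

# Simplified stand-ins for event_logs and event_log_hostnames.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE event_logs (
        event_id INTEGER PRIMARY KEY, type TEXT, job_id INTEGER,
        date INTEGER, description TEXT, to_check TEXT
    );
    CREATE TABLE event_log_hostnames (
        event_id INTEGER PRIMARY KEY, hostname TEXT
    );
    INSERT INTO event_logs VALUES
        (1, 'PING_CHECKER_NODE_SUSPECTED', 0, 1000, 'no answer', 'NO'),
        (2, 'WALLTIME', 42, 1500, 'walltime reached', 'NO'),
        (3, 'PING_CHECKER_NODE_SUSPECTED', 0, 2000, 'no answer', 'YES');
    INSERT INTO event_log_hostnames VALUES (1, 'node-1'), (3, 'node-2');
    """
)
print(hosts_for_event_type(conn))  # [(1000, 'node-1'), (2000, 'node-2')]
```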
files
Fields | Types | Descriptions |
idFile | INT UNSIGNED | |
md5sum | VARCHAR(255) | |
location | VARCHAR(255) | |
method | VARCHAR(255) | |
compression | VARCHAR(255) | |
size | INT UNSIGNED | |
Primary key: | idFile |
Index fields: | md5sum |
frag_jobs
Fields | Types | Descriptions |
frag_id_job | INT UNSIGNED | job id |
frag_date | INT UNSIGNED | kill job decision date |
frag_state | ENUM('LEON', 'TIMER_ARMED', 'LEON_EXTERMINATE', 'FRAGGED') DEFAULT 'LEON' | state to tell Leon what to do |
Primary key: | frag_id_job |
Index fields: | frag_state |
Meaning of the states:
- “LEON” : the Leon module must try to kill the job and change the state to
“TIMER_ARMED”.
- “TIMER_ARMED” : the Sarko module waits for a response from the job until
a timeout expires (default is 60s).
- “LEON_EXTERMINATE” : the Sarko module has decided that the job timed out and
asked Leon to clean up the database.
- “FRAGGED” : the job is fragged.
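The state descriptions above suggest a small state machine, sketched below. The transitions are inferred from those descriptions only, and the event names ("kill_sent", "job_answered", "timeout", "database_cleaned") are hypothetical labels invented for this example; the actual Leon and Sarko modules may behave differently.

```python
# Transitions inferred from the state descriptions; the real modules may differ.
TRANSITIONS = {
    ("LEON", "kill_sent"): "TIMER_ARMED",
    ("TIMER_ARMED", "job_answered"): "FRAGGED",
    ("TIMER_ARMED", "timeout"): "LEON_EXTERMINATE",
    ("LEON_EXTERMINATE", "database_cleaned"): "FRAGGED",
}

def next_frag_state(state: str, event: str) -> str:
    """Return the next frag_state, or raise ValueError for an unexpected event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None

# Path of a job that never answers the kill request.
state = "LEON"
for event in ("kill_sent", "timeout", "database_cleaned"):
    state = next_frag_state(state, event)
print(state)  # FRAGGED
```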
gantt_jobs_resources
Fields | Types | Descriptions |
moldable_job_id | INT UNSIGNED | moldable job id |
resource_id | INT UNSIGNED | resource assigned to the job |
Primary key: | moldable_job_id, resource_id |
Index fields: | None |
This table specifies which resources are attributed to which jobs.
gantt_jobs_resources_visu
Fields | Types | Descriptions |
moldable_job_id | INT UNSIGNED | moldable job id |
resource_id | INT UNSIGNED | resource assigned to the job |
Primary key: | moldable_job_id, resource_id |
Index fields: | None |
This table is the same as gantt_jobs_resources and is used by visualisation
tools. It is updated atomically (a lock is used).
gantt_jobs_predictions
Fields | Types | Descriptions |
moldable_job_id | INT UNSIGNED | job id |
start_time | INT UNSIGNED | date when the job is scheduled to start |
Primary key: | moldable_job_id |
Index fields: | None |
Together with gantt_jobs_resources, this table shows exactly which decisions
the schedulers have taken for each waiting job.
note: | The special job id “0” is used to store the scheduling reference date. |
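Combining the two gantt tables can be sketched as follows. An in-memory SQLite database with simplified column types stands in for the real one, the rows are invented, and the function name is hypothetical. The entry with id 0 is skipped because it holds the scheduling reference date rather than a job.

```python
import sqlite3

def scheduled_plan(conn):
    """Map each moldable job id to (start_time, [resource ids])."""
    rows = conn.execute(
        """
        SELECT p.moldable_job_id, p.start_time, r.resource_id
        FROM gantt_jobs_predictions AS p
        JOIN gantt_jobs_resources AS r
          ON r.moldable_job_id = p.moldable_job_id
        WHERE p.moldable_job_id <> 0  -- id 0 holds the scheduling reference date
        ORDER BY p.moldable_job_id, r.resource_id
        """
    )
    plan = {}
    for moldable_id, start_time, resource_id in rows:
        plan.setdefault(moldable_id, (start_time, []))[1].append(resource_id)
    return plan

# Simplified stand-ins for the two gantt tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gantt_jobs_predictions (
        moldable_job_id INTEGER PRIMARY KEY, start_time INTEGER
    );
    CREATE TABLE gantt_jobs_resources (
        moldable_job_id INTEGER, resource_id INTEGER,
        PRIMARY KEY (moldable_job_id, resource_id)
    );
    INSERT INTO gantt_jobs_predictions VALUES (0, 1000), (7, 1200), (8, 4800);
    INSERT INTO gantt_jobs_resources VALUES (7, 1), (7, 2), (8, 1);
    """
)
print(scheduled_plan(conn))  # {7: (1200, [1, 2]), 8: (4800, [1])}
```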
gantt_jobs_predictions_visu
Fields | Types | Descriptions |
moldable_job_id | INT UNSIGNED | job id |
start_time | INT UNSIGNED | date when the job is scheduled to start |
Primary key: | moldable_job_id |
Index fields: | None |
This table is the same as gantt_jobs_predictions and is used by visualisation
tools. It is updated atomically (a lock is used).
jobs
Fields | Types | Descriptions |
job_id | INT UNSIGNED | job identifier |
array_id | INT | array identifier |
array_index | INT | index of the job in the array |
initial_request | TEXT | oarsub initial arguments |
job_name | VARCHAR(100) | name given by the user |
cpuset_name | VARCHAR(255) | name of the cpuset directory used for this job on each node |
job_type | ENUM('INTERACTIVE', 'PASSIVE') DEFAULT 'PASSIVE' | specifies whether the user wants to launch a program or get an interactive shell |
info_type | VARCHAR(255) | some information about the oarsub command |
state | ENUM('Waiting', 'Hold', 'toLaunch', 'toError', 'toAckReservation', 'Launching', 'Running', 'Suspended', 'Resuming', 'Finishing', 'Terminated', 'Error') | job state |
reservation | ENUM('None', 'toSchedule', 'Scheduled') DEFAULT 'None' | specifies whether the job is a reservation and, if so, its state |
message | VARCHAR(255) | readable information message for the user |
job_user | VARCHAR(255) | user name |
command | TEXT | program to run |
queue_name | VARCHAR(100) | queue name |
properties | TEXT | properties that assigned nodes must match |
launching_directory | TEXT | path of the directory where the user process is launched |
submission_time | INT UNSIGNED | date when the job was submitted |
start_time | INT UNSIGNED | date when the job was launched |
stop_time | INT UNSIGNED | date when the job was stopped |
file_id | INT UNSIGNED | |
accounted | ENUM("YES", "NO") DEFAULT "NO" | specifies whether the job has been processed by the accounting mechanism |
notify | VARCHAR(255) | how to notify the user about the job (mail or script) |
assigned_moldable_job | INT UNSIGNED | moldable job chosen by the scheduler |
checkpoint | INT UNSIGNED | number of seconds before the walltime at which the checkpoint signal is sent to the job |
checkpoint_signal | INT UNSIGNED | signal to use when checkpointing the job |
stdout_file | TEXT | name of the file to which the program STDOUT is redirected |
stderr_file | TEXT | name of the file to which the program STDERR is redirected |
resubmit_job_id | INT UNSIGNED | if a job is resubmitted, the new job stores the id of the previous one |
project | VARCHAR(255) | arbitrary name given by the user or an admission rule |
suspended | ENUM("YES", "NO") | specifies whether the job was suspended (oarhold) |
job_env | TEXT | environment variables to set for the job |
exit_code | INT DEFAULT 0 | exit code for passive jobs |
job_group | VARCHAR(255) | not used |
Primary key: | job_id |
Index fields: | state, reservation, queue_name, accounted, suspended |
Explanations of the “state” field:
- “Waiting” : the job is waiting for the OAR scheduler's decision.
- “Hold” : the user or an administrator wants to hold the job (oarhold
command), so it will not be scheduled by the system.
- “toLaunch” : the OAR scheduler has assigned some nodes to the job, so it
will be launched.
- “toError” : something wrong occurred and the job is going into the error
state.
- “toAckReservation” : the OAR scheduler must say “YES” or “NO” to the waiting
oarsub command because it requested a reservation.
- “Launching” : OAR has launched the job and will execute the user command
on the first node.
- “Running” : the user command is executing on the first node.
- “Suspended” : the job was in the Running state and there was a request
(oarhold with the “-r” option) to suspend it. In this state other jobs
can be scheduled on the same resources (these resources have the
“suspended_jobs” field set to “YES”).
- “Finishing” : the user command has terminated and OAR is doing internal work.
- “Terminated” : the job has terminated normally.
- “Error” : a problem has occurred.
Explanations of the “reservation” field:
- “None” : the job is not a reservation.
- “toSchedule” : the job is a reservation and must be approved by the
scheduler.
- “Scheduled” : the job is a reservation and is scheduled by OAR.
job_dependencies
Fields | Types | Descriptions |
job_id | INT UNSIGNED | job identifier |
job_id_required | INT UNSIGNED | job that must be completed before job_id is launched |
Primary key: | job_id, job_id_required |
Index fields: | job_id, job_id_required |
This table is fed by the oarsub command with the “-a” option.
moldable_job_descriptions
Fields | Types | Descriptions |
moldable_id | INT UNSIGNED | moldable job identifier |
moldable_job_id | INT UNSIGNED | corresponding job identifier |
moldable_walltime | INT UNSIGNED | instance duration |
Primary key: | moldable_id |
Index fields: | moldable_job_id |
A job can be described with several instances, so that the OAR scheduler can
choose one of them; for example, it can calculate which instance will finish
first. This table stores all instances of all jobs.
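The "which instance will finish first" idea can be sketched in a few lines. This is not the OAR scheduler's algorithm: the earliest possible start of each instance is taken as a given input here, whereas a real scheduler has to compute it from resource availability, and the function name is hypothetical.

```python
def earliest_finishing_instance(instances, earliest_start):
    """Pick the moldable_id whose start + walltime is smallest.

    instances: list of (moldable_id, moldable_walltime) pairs.
    earliest_start: dict mapping moldable_id to its earliest start date.
    """
    return min(instances, key=lambda inst: earliest_start[inst[0]] + inst[1])[0]

# Instance 10 is longer but can start much sooner, so it finishes first.
instances = [(10, 7200), (11, 3600)]
print(earliest_finishing_instance(instances, {10: 1000, 11: 6000}))  # 10
```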
job_resource_groups
Fields | Types | Descriptions |
res_group_id | INT UNSIGNED | group identifier |
res_group_moldable_id | INT UNSIGNED | corresponding moldable job identifier |
res_group_property | TEXT | SQL constraint properties |
Primary key: | res_group_id |
Index fields: | res_group_moldable_id |
Just as you can specify global job properties with the “-p” option of oarsub,
you can specify properties for each resource group that you define with
the “-l” option.
job_resource_descriptions
Fields | Types | Descriptions |
res_job_group_id | INT UNSIGNED | corresponding group identifier |
res_job_resource_type | VARCHAR(255) | resource type (name of a field in resources) |
res_job_value | INT | wanted resource number |
res_job_order | INT UNSIGNED | order of the request |
Primary key: | res_job_group_id, res_job_resource_type, res_job_order |
Index fields: | res_job_group_id |
This table stores the hierarchical resource description given with the “-l”
option of oarsub.
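To show how res_job_order encodes the hierarchy, the sketch below rebuilds a readable request from the rows of one resource group. The output format only imitates the "/type=value" style of an oarsub resource request for illustration, and the function name and sample rows are hypothetical.

```python
def format_resource_request(rows):
    """Build a '/type=value/...' string from (resource_type, value, order) rows."""
    ordered = sorted(rows, key=lambda row: row[2])
    return "".join(f"/{rtype}={value}" for rtype, value, _ in ordered)

# Rows of one group: (res_job_resource_type, res_job_value, res_job_order).
rows = [("cpu", 1, 1), ("nodes", 2, 0)]
print(format_resource_request(rows))  # /nodes=2/cpu=1
```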
job_state_logs
Fields | Types | Descriptions |
job_state_log_id | INT UNSIGNED | identifier |
job_id | INT UNSIGNED | corresponding job identifier |
job_state | ENUM('Waiting', 'Hold', 'toLaunch', 'toError', 'toAckReservation', 'Launching', 'Finishing', 'Running', 'Suspended', 'Resuming', 'Terminated', 'Error') | job state during the interval |
date_start | INT UNSIGNED | start date of the interval |
date_stop | INT UNSIGNED | end date of the interval |
Primary key: | job_state_log_id |
Index fields: | job_id, job_state |
This table keeps information about job state changes.
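One use of these intervals is to compute how long a job spent in each state, as sketched below. An in-memory SQLite database with simplified column types stands in for the real one and the rows are invented. The sketch assumes that a still-open interval is stored with date_stop = 0, which is an assumption of this example and should be checked against the actual data; the function name is hypothetical.

```python
import sqlite3

def seconds_per_state(conn, job_id):
    """Sum the closed intervals of one job, grouped by state."""
    rows = conn.execute(
        """
        SELECT job_state, SUM(date_stop - date_start)
        FROM job_state_logs
        WHERE job_id = ? AND date_stop > 0  -- assumption: 0 marks an open interval
        GROUP BY job_state
        """,
        (job_id,),
    ).fetchall()
    return dict(rows)

# Simplified stand-in for job_state_logs with one invented job.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE job_state_logs (
        job_state_log_id INTEGER PRIMARY KEY, job_id INTEGER,
        job_state TEXT, date_start INTEGER, date_stop INTEGER
    );
    INSERT INTO job_state_logs VALUES
        (1, 5, 'Waiting', 100, 160),
        (2, 5, 'Running', 160, 460),
        (3, 5, 'Terminated', 460, 0);
    """
)
print(seconds_per_state(conn, 5))
```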
job_types
Fields | Types | Descriptions |
job_type_id | INT UNSIGNED | identifier |
job_id | INT UNSIGNED | corresponding job identifier |
type | VARCHAR(255) | job type like “deploy”, “timesharing”, … |
type_index | ENUM('CURRENT', 'LOG') | index field |
Primary key: | job_type_id |
Index fields: | job_id, type |
This table stores the job types given with the “-t” option of the oarsub command.
resources
Fields | Types | Descriptions |
resource_id | INT UNSIGNED | resource identifier |
type | VARCHAR(100) DEFAULT "default" | resource type (used for licence resources for example) |
network_address | VARCHAR(100) | node name (used to connect via SSH) |
state | ENUM('Alive', 'Dead', 'Suspected', 'Absent') | resource state |
next_state | ENUM('UnChanged', 'Alive', 'Dead', 'Absent', 'Suspected') DEFAULT 'UnChanged' | state the resource must switch to |
finaud_decision | ENUM('YES', 'NO') DEFAULT 'NO' | tells whether the current state results from a “finaud” module decision |
next_finaud_decision | ENUM('YES', 'NO') DEFAULT 'NO' | tells whether the next node state results from a “finaud” module decision |
state_num | INT | corresponding state number (useful with the SQL “ORDER BY” clause) |
suspended_jobs | ENUM('YES', 'NO') | specifies whether there is at least one suspended job on the resource |
scheduler_priority | INT UNSIGNED | arbitrary number given by the system to select resources more intelligently |
switch | VARCHAR(50) | name of the switch |
cpu | INT UNSIGNED | global cluster cpu number |
cpuset | INT UNSIGNED | field used with the JOB_RESOURCE_MANAGER_PROPERTY_DB_FIELD |
besteffort | ENUM('YES', 'NO') | whether besteffort jobs are accepted |
deploy | ENUM('YES', 'NO') | specifies whether the resource is deployable |
expiry_date | INT UNSIGNED | field used for the desktop computing feature |
desktop_computing | ENUM('YES', 'NO') | tells whether it is a desktop computing resource (with an agent) |
last_job_date | INT UNSIGNED | date when the resource was last used |
available_upto | INT UNSIGNED | used with compute mode features to know whether an Absent resource can be switched on |
Primary key: | resource_id |
Index fields: | state, next_state, type, suspended_jobs |
State explanations:
- “Alive” : the resource is ready to accept a job.
- “Absent” : the oar administrator has decided to pull out the resource. This
computer can come back.
- “Suspected” : OAR system has detected a problem on this resource and so has
suspected it (you can look in the event_logs table to know what has
happened). This computer can come back (automatically if this is a
“finaud” module decision).
- “Dead” : The oar administrator considers that the resource will not come back
and will be removed from the pool.
This table lets you specify different properties for each resource. These can
be used with the oarsub command (“-p” and “-l” options).
You can add your own properties with the oarproperty command.
These properties can be updated with the oarnodesetting command (“-p” option).
Several properties are added by default:
- switch : the name of the switch to which the node is connected; you have to
register it yourself.
- cpu : a unique name given to each cpu. This enables the OAR scheduler
to distinguish all cpus.
- cpuset : this is the name of the cpu on the node. The Linux kernel sets this
to an integer beginning at 0. This field is linked to the configuration tag
JOB_RESOURCE_MANAGER_PROPERTY_DB_FIELD.
resource_logs
Fields | Types | Descriptions |
resource_log_id | INT UNSIGNED | unique id |
resource_id | INT UNSIGNED | resource identifier |
attribute | VARCHAR(255) | name of corresponding field in resources |
value | VARCHAR(255) | value of the field |
date_start | INT UNSIGNED | interval start date |
date_stop | INT UNSIGNED | interval stop date |
finaud_decision | ENUM('YES', 'NO') | records whether this is a system change or a human one |
Primary key: | None |
Index fields: | resource_id, attribute |
This table keeps a trace of every property change (resulting from
the oarnodesetting command with the “-p” option).
assigned_resources
Fields | Types | Descriptions |
moldable_job_id | INT UNSIGNED | job id |
resource_id | INT UNSIGNED | resource assigned to the job |
Primary key: | moldable_job_id, resource_id |
Index fields: | moldable_job_id |
This table records the resources on which each job was scheduled.
queues
Fields | Types | Descriptions |
queue_name | VARCHAR(100) | queue name |
priority | INT UNSIGNED | the scheduling priority |
scheduler_policy | VARCHAR(100) | path of the associated scheduler |
state | ENUM('Active', 'notActive') DEFAULT 'Active' | allows scheduling to be stopped for a queue |
Primary key: | queue_name |
Index fields: | None |
This table contains the schedulers executed by the oar_meta_scheduler module.
Executables are launched one after another, in the specified priority order.
challenges
Fields | Types | Descriptions |
job_id | INT UNSIGNED | job identifier |
challenge | VARCHAR(255) | challenge string |
ssh_private_key | TEXT DEFAULT NULL | ssh private key given by the user (in grid usage it allows connecting to all nodes of the job on all clusters with oarsh) |
ssh_public_key | TEXT DEFAULT NULL | ssh public key |
Primary key: | job_id |
Index fields: | None |
This table is used to share a secret between the OAR server and the oarexec
process on computing nodes (it prevents a job id from being stolen or forged
by a malicious user).
For security reasons, this table must not be readable by a database
account given to users who want to access OAR internal information (such as
statistics).