On using (hyper)threads as the leaves of the resource tree

If HyperThreading is enabled in the BIOS, using threads as the leaves of the OAR (implicit) resource tree is quite simple.

Using the oar_resource_init script

The latest version of the oar_resource_init script comes with the -H and -T options. It generates the oarnodesetting commands with the thread properties.

Adapting an existing resource setup

Here we consider an existing resource setup to which one would like to add threads as the leaf resources.

First you need to define the thread property:

$ oarproperty -a thread

Assuming your cluster is homogeneous and that resources were created consistently (i.e. the ids of the resource properties are well set), proceed in two steps. First, add the property thread = (2 * core - 1) to every existing resource line (oarnodesetting -p … ). Then, for each of them, create a new resource (oarnodesetting -a …) with exactly the same values for all properties, except thread = (2 * core) and cpuset = (original cpuset value + # cores per cpu). Running hwloc in a job afterward to verify that the cpuset settings are correct could be a good idea.

Of course, any shell scripting language will be your friend.

(Remember that the values for the properties which define the hierarchy of the resources must be unique to the resource)

NB: we assume here that:
  • HyperThreading provides 2 threads per core;
  • the threads of core N are logical CPUs N and N + #cores.
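Under the assumptions above, the values to give to the thread and cpuset properties can be computed per core. The following is only a minimal sketch: the function name thread_values and its arguments are hypothetical, and the cpuset offset is passed explicitly because it depends on the numbering of your machines. It only prints values; feeding them to oarnodesetting is left to your own script.

```shell
#!/bin/sh
# Sketch only: compute the thread and cpuset values for the two threads
# of one core. Assumes 2 threads per core and a fixed cpuset offset
# between the first and the second thread (both are assumptions).

# thread_values CORE CPUSET OFFSET
#   CORE    value of the core property of the existing resource
#   CPUSET  value of the cpuset property of the existing resource
#   OFFSET  distance between the cpusets of the two threads of a core
# Prints two lines, "thread cpuset", one per hardware thread.
thread_values() {
    core=$1
    cpuset=$2
    offset=$3
    # existing resource line: thread = 2 * core - 1, cpuset unchanged
    echo "$((2 * core - 1)) $cpuset"
    # new resource line: thread = 2 * core, cpuset shifted by the offset
    echo "$((2 * core)) $((cpuset + offset))"
}
```

For instance, thread_values 1 0 8 describes the first core of a host where the offset is 8: the existing line gets thread 1 and keeps cpuset 0, and the new line gets thread 2 and cpuset 8.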

On using hyper-threads as overbooking processors

The idea here is to set up some admission rules in order to steer some jobs to use the 2 threads of the cores when some conditions are met (small jobs, jobs for some known computation types for which hyperthreading is relevant…).

To be completed.

On offlining/onlining hyper-threads with OAR

In this case, cores are the leaf resources (no thread resource property in the OAR resources table), but we would like to let users optionally get the 2 hyper-threads associated with each core in their job.

It is indeed easy to disable the 2nd thread of a core (see /sys/devices/system/cpu), but once this 2nd thread is disabled, it is not so easy to find its number again if we did not keep the information somewhere. Indeed, the “linux processor” no longer shows up in hwloc or /proc/cpuinfo, and /sys does not seem very helpful either. One may argue that the numbering seems to be: threads of core N = N and N+#cores, but can we rely on that on any machine?
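One way to keep that information is to record the sibling lists while all threads are still online. The sketch below reads the topology/thread_siblings_list files that Linux exposes under /sys/devices/system/cpu; the function name save_siblings and its optional directory argument are hypothetical, and where the output gets stored is up to you.

```shell
#!/bin/sh
# Sketch only: print "cpu_id sibling_list" for every logical CPU that
# currently exposes its topology. Run it BEFORE offlining any thread.

# save_siblings [CPU_DIR]
#   CPU_DIR defaults to /sys/devices/system/cpu
save_siblings() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    for f in "$cpu_dir"/cpu[0-9]*/topology/thread_siblings_list; do
        # skip CPUs without a readable topology entry (e.g. offline ones)
        [ -r "$f" ] || continue
        id=${f#"$cpu_dir"/cpu}   # strip the leading ".../cpu"
        id=${id%%/*}             # keep only the numeric id
        echo "$id $(cat "$f")"
    done
}
```

Redirecting the output to a file (for instance save_siblings > some_file) gives a per-host map that can be consulted later, whatever the numbering scheme of the machine is.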

This is an issue if we want to consider disabling the 2nd thread of only some cores of a host, not all of them. Otherwise, disabling all second threads or re-enabling all of them is straightforward, using a for loop and looking at /sys/devices/system/cpu/possible.
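Such a for loop could look like the sketch below. It assumes the numbering discussed above (second threads are the logical CPUs with id >= #cores), which may not hold on every machine. The function names and the NCORES argument are hypothetical, and the loop is a dry run: it only prints the writes to the cpuN/online files instead of performing them.

```shell
#!/bin/sh
# Sketch only: dry run for offlining/onlining all second threads.

# expand_cpulist LIST
#   Expands a kernel cpu list such as "0-3,8-11" into one id per line.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        [ -n "$first" ] || continue
        # a single id has no upper bound, so it expands to itself
        seq "$first" "${last:-$first}"
    done
}

# set_second_threads STATE NCORES [CPU_DIR]
#   STATE   0 to offline, 1 to online
#   NCORES  number of physical cores of the host (assumed known)
#   Prints the commands for every logical CPU with id >= NCORES.
set_second_threads() {
    state=$1
    ncores=$2
    cpu_dir=${3:-/sys/devices/system/cpu}
    for id in $(expand_cpulist "$(cat "$cpu_dir/possible")"); do
        [ "$id" -ge "$ncores" ] || continue
        echo "echo $state > $cpu_dir/cpu$id/online"
    done
}
```

On a host with 8 cores, set_second_threads 0 8 prints the commands that would offline the logical CPUs numbered 8 and above; after checking them, they can be run as root.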

For that purpose, the cpuset field of OAR was changed to a string in OAR 2.5.4+. This allows setting the value of the cpuset field in the resources table to, for instance, “0,8” instead of just 0 (core 0 has threads with logical cpu ids 0 and 8). The job resource manager can then handle that value very easily whenever an HT job type is in use by a job.
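To illustrate how such a string value can be handled, here is a minimal sketch of the selection logic. It is not the code of OAR's job resource manager: the function name job_cpus and its "yes" flag are hypothetical, and it assumes that the first id in the list is the thread that stays enabled for non-HT jobs.

```shell
#!/bin/sh
# Sketch only: choose which logical CPUs a job gets from a cpuset
# field that lists the threads of a core, e.g. "0,8".

# job_cpus CPUSET_FIELD HT
#   CPUSET_FIELD  value of the cpuset field, e.g. "0,8"
#   HT            "yes" to use every listed thread, anything else to
#                 use only the first one
job_cpus() {
    if [ "$2" = yes ]; then
        echo "$1"
    else
        # drop everything from the first comma onward
        echo "${1%%,*}"
    fi
}
```

With the example value above, job_cpus "0,8" yes keeps both logical CPUs, while job_cpus "0,8" no keeps only logical CPU 0.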

Pierre Neyron 2015/07/05 10:14

wiki/using_hyperthreading_on_nodes.txt · Last modified: 2015/09/04 17:24 by neyron