This shows you the differences between two versions of the page.
wiki:managing_resources_cpu_gpu [2018/10/16 09:07] – [Managing GPUs] neyron | wiki:managing_resources_cpu_gpu [2019/06/25 14:25] – [Second scenario, more complexe] neyron | ||
---|---|---|---|
Line 2: | Line 2: | ||
====== Managing processing unit topologies ====== | ====== Managing processing unit topologies ====== | ||
- | OAR's resources tables provides several kinds of information: | + | The OAR database '' |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | They are called OAR resources | + | They are called OAR resources |
- | In database these 3 kinds of properties are all stored as columns of the table. A rows then gives the set of properties for one resource. | + | In database these 4 kinds of resource |
- | Given a hierarchy (chosen by the administrator for its cluster setup, for instance cluster/ | + | **Given a hierarchy** (chosen by the administrator for its cluster setup, for instance cluster/ |
| | ||
- | One rule must be kept in mind: any unique object in the resources hierarchy must have a unique id among its set of object. For example: | + | One rule must be kept in mind: **any unique object in the resources hierarchy must have a unique id among its set of objects**. For example: |
* any of the cores (of any of the CPUs, of any of the hosts...) must have a unique id among the whole set of cores ; | * any of the cores (of any of the CPUs, of any of the hosts...) must have a unique id among the whole set of cores ; | ||
* any of the CPUs (of any of the nodes, of any of the clusters...) | * any of the CPUs (of any of the nodes, of any of the clusters...) | ||
* and so on for any resource. | * and so on for any resource. | ||
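As a minimal sketch of this rule, assuming a hypothetical cluster of 2 hosts with 2 CPUs of 4 cores each, the following Python snippet enumerates resources so that every CPU and every core gets an id that is unique within its own kind. The function and field names are illustrative and not part of OAR. The `cpuset` field is assumed here to restart from 0 on each host, since hardware ids are local to a machine; the real values must match the actual hardware numbering.

```python
from itertools import count


def enumerate_resources(hosts, cpus_per_host, cores_per_cpu):
    """Yield one dict per core, with ids unique within each level of the hierarchy."""
    cpu_ids = count(1)   # global counter: CPU ids never restart on a new host
    core_ids = count(1)  # global counter: core ids never restart on a new CPU
    for host in hosts:
        local_cpuset = 0  # assumption: hardware ids are local to each host
        for _ in range(cpus_per_host):
            cpu = next(cpu_ids)
            for _ in range(cores_per_cpu):
                yield {
                    "host": host,
                    "cpu": cpu,
                    "core": next(core_ids),
                    "cpuset": local_cpuset,
                }
                local_cpuset += 1


# Hypothetical host names, used only for illustration.
rows = list(enumerate_resources(["node-1", "node-2"], cpus_per_host=2, cores_per_cpu=4))
for row in rows:
    print(row)
```

In this sketch the core and CPU counters are shared across all hosts, while only the hardware-side value is allowed to repeat from one host to the next.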
- | Then, when it comes to the hardware identifiers (cpusets, or see below for GPU devices id), the administrator must take a special attention so that a correct mapping is done between the logical hierarchy (e.g. id of the host, CPUs, cores, hyperthreads) and the hardware processing unit ids (cpuset value). Using a tool such as '' | + | Then, when it comes to the hardware identifiers (cpusets, or see below for GPU device ids), the administrator must take special care that a **correct mapping** is made between the **logical hierarchy** (e.g. id of the host, CPUs, cores, hyperthreads) and the **hardware** processing unit **ids** (cpuset value). Using a tool such as '' |
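One way to inspect the hardware side of this mapping on a Linux host is to read the kernel topology files under `/sys/devices/system/cpu`. The sketch below is an illustration, not an OAR tool: the helper names are hypothetical, and it only groups the OS logical CPU ids (the values a cpuset refers to) by physical package and core, so that they can be compared with the logical hierarchy declared in OAR. The sample data assumes a machine whose logical ids are interleaved across two sockets, which is one possible numbering among others.

```python
from collections import defaultdict
from pathlib import Path


def group_by_package(entries):
    """Group logical CPU ids by package, then by core.

    entries: iterable of (cpu_id, package_id, core_id) tuples.
    """
    topology = defaultdict(lambda: defaultdict(list))
    for cpu_id, package_id, core_id in entries:
        topology[package_id][core_id].append(cpu_id)
    return {
        pkg: {core: sorted(cpus) for core, cpus in cores.items()}
        for pkg, cores in topology.items()
    }


def read_sysfs(root="/sys/devices/system/cpu"):
    """Read (cpu_id, package_id, core_id) for each CPU exposing a topology directory."""
    entries = []
    for topo in Path(root).glob("cpu[0-9]*/topology"):
        cpu_id = int(topo.parent.name[3:])  # "cpu12" -> 12
        package_id = int((topo / "physical_package_id").read_text())
        core_id = int((topo / "core_id").read_text())
        entries.append((cpu_id, package_id, core_id))
    return entries


# Hypothetical sample: 2 sockets, 2 cores each, ids interleaved across sockets.
sample = [(0, 0, 0), (1, 1, 0), (2, 0, 1), (3, 1, 1)]
grouped = group_by_package(sample)
print(grouped)
```

On a real host, `group_by_package(read_sysfs())` gives the same view from live data; the point of the sample is that consecutive logical ids do not necessarily belong to the same CPU.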
- | The '' | + | The basic commands to work with resources are: |
+ | * The '' | ||
+ | * The '' | ||
+ | * The '' | ||
- | Two meta-command are provided to build the resource table (using underneath the '' | + | But two meta-commands are provided to build the resource table (using underneath the '' |
* '' | * '' | ||
* '' | * '' | ||
Line 32: | Line 35: | ||
Support of Nvidia GPU devices was added to OAR and will ship with OAR 2.5.8 (it already ships with the RC versions, starting with 2.5.8 RC6). | Support of Nvidia GPU devices was added to OAR and will ship with OAR 2.5.8 (it already ships with the RC versions, starting with 2.5.8 RC6). | ||
- | Meanwhile, for those who cannot wait and since this only involves some configuration of the resources and using the latest version of the job resource manager script taken from the sources | + | Meanwhile, for those who cannot wait and since this only involves some configuration of the resources and using the latest version of the job resource manager script taken from git (the job resource manager is part of the configuration files of OAR, which the administrator can modify), one can // |
See: | See: | ||
Line 38: | Line 41: | ||
* some explanations about it: https:// | * some explanations about it: https:// | ||
- | Next releases | + | The next release |
Line 66: | Line 69: | ||
Also, if some nodes do not have any GPU, you could set the value of the property for the corresponding resources to '' | Also, if some nodes do not have any GPU, you could set the value of the property for the corresponding resources to '' | ||
- | ===== Second scenario, more complexe | + | ===== Second scenario, more complex |
Let's now assume that you have a cluster of 3 nodes with 32 GB of RAM and per node: | Let's now assume that you have a cluster of 3 nodes with 32 GB of RAM and per node: | ||
* 2 CPUs of 6 cores each | * 2 CPUs of 6 cores each |