====== Coupling oarsh with GNU Parallel to mimic salloc/srun ======
E.g. 1 batch per core (or GPU), in a job which has many nodes/cores.
This requires the changes proposed in [[https://...]] (not merged yet).
===== PoC with cores =====
PoC in [[oar-docker]].

==== Create a job with 2 nodes (and all their cores, here 4) ====
<code bash>
...
</code>
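The submission command itself is elided in this revision view; a minimal sketch of what it could look like, assuming standard ''oarsub'' options (interactive mode, 2 nodes) rather than the exact command used in the PoC:

<code bash>
# Sketch only: request an interactive OAR job spanning 2 nodes (all their cores)
docker@frontend ~$ oarsub -I -l nodes=2
</code>

''oarsub -I'' opens a shell on the head node of the job once the resources are allocated.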
==== Create the parallel sshloginfile that defines the connector to each core ====
<code bash>
docker@frontend ~$ oarstat -j 1 -p | oarprint -f - core -P cpuset,host -F "...
...
</code>
We force the ''...''
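The ''-F'' format string and the resulting ''cores'' file are truncated in this view, so here is a hypothetical reconstruction of the idea: one GNU parallel sshlogin entry per core, carrying the cpuset to bind to. The ''OARSH_CPUSET'' variable name and the exact entry layout are assumptions (the required oarsh change is not merged), and the sample ''oarprint'' output is fabricated for illustration:

```shell
# Simulated `oarprint ... -P cpuset,host` output: one "cpuset host" pair per core
cat > /tmp/oarprint.out <<'EOF'
0 node1
1 node1
0 node2
1 node2
EOF
# Build a GNU parallel --sshloginfile: a leading "1/" caps each entry at one
# concurrent job; OARSH_CPUSET is an assumed variable for the patched oarsh.
awk '{printf "1/OARSH_CPUSET=%s oarsh %s\n", $1, $2}' /tmp/oarprint.out > /tmp/cores
cat /tmp/cores
```

GNU parallel accepts a full connector command in an sshlogin entry, and the ''1/'' job-slot prefix is what gives the one-batch-per-core behaviour.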
==== Create a sample script ====
<code bash>
docker@frontend ~$ cat <<'...
...
</code>
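The heredoc above is cut off in this view; a plausible sketch of such a test script, assuming its point is to show on which host and in which cpuset each batch item runs (the file name ''/tmp/test.sh'' and the exact fields printed are my assumptions):

```shell
# Write a small test script: print the argument, the host, and the logical
# CPUs allowed by the current cpuset (read from /proc/self/status on Linux).
cat <<'EOF' > /tmp/test.sh
#!/bin/bash
cpus=$(grep Cpus_allowed_list /proc/self/status 2>/dev/null | awk '{print $2}')
echo "arg=$1 host=$(hostname) allowed_cpus=${cpus:-n/a}"
EOF
chmod +x /tmp/test.sh
/tmp/test.sh 1
```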
==== Test a run with a batch of 10 inputs ====
<code bash>
docker@frontend ~$ seq 10 | parallel --slf cores ./...
...
</code>
As we can see, every job is indeed run in a cpuset with only 1 logical CPU available for its execution!
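The run command is truncated above; spelled out in full it presumably looks like the following sketch, assuming the script was saved as ''./test.sh'' and the sshloginfile as ''cores'' (''--slf'' is short for ''--sshloginfile''):

<code bash>
docker@frontend ~$ seq 10 | parallel --slf cores ./test.sh {}
</code>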
===== PoC with GPUs in Grid'5000 =====
The same can be done with GPUs: run a batch of jobs, each executing on a single GPU only.
Here we have 2 nodes (chifflet-3 and chifflet-7) with 2 GeForce GPUs each.
==== Generate the parallel sshlogin file to execute on each GPU ====
From the head node of the OAR job, chifflet-3:
<code bash>
...
</code>
Here we use the ''...''
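The generation command and the resulting file are elided in this view; a hypothetical reconstruction of the idea, with one sshlogin entry per GPU. Binding via ''CUDA_VISIBLE_DEVICES'' and the sample ''gpudevice host'' pairs are assumptions for illustration:

```shell
# Simulated per-GPU "gpudevice host" pairs for the 2 nodes with 2 GPUs each
cat > /tmp/gpus.in <<'EOF'
0 chifflet-3
1 chifflet-3
0 chifflet-7
1 chifflet-7
EOF
# One job slot per GPU; CUDA_VISIBLE_DEVICES restricts each job to one device
# (assumed mechanism; the actual connector used by the PoC may differ).
awk '{printf "1/CUDA_VISIBLE_DEVICES=%s oarsh %s\n", $1, $2}' /tmp/gpus.in > /tmp/gpus
cat /tmp/gpus
```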
==== Create a new sample script ====
<code bash>
[pneyron@chifflet-3 ~](1733271-->...
...
</code>
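The script body is elided above; a plausible sketch, assuming it reports which GPU each batch item sees (the file name ''/tmp/gpu_test.sh'' and the use of ''CUDA_VISIBLE_DEVICES'' are my assumptions):

```shell
# Write a GPU-oriented test script: print the argument, the host, and the
# GPU(s) visible to this job via CUDA_VISIBLE_DEVICES.
cat <<'EOF' > /tmp/gpu_test.sh
#!/bin/bash
echo "arg=$1 host=$(hostname) gpus=${CUDA_VISIBLE_DEVICES:-unset}"
EOF
chmod +x /tmp/gpu_test.sh
CUDA_VISIBLE_DEVICES=0 /tmp/gpu_test.sh 1
```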
==== Run parallel ====
<code bash>
[pneyron@chifflet-3 ~](1733271-->...
...
</code>
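The command is cut off above; by analogy with the cores PoC it presumably looks like this sketch (the ''gpus'' sshloginfile name and the script name are assumptions):

<code bash>
seq 10 | parallel --slf gpus ./gpu_test.sh {}
</code>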