Deploying an OAR cluster on Grid5000

From WikiOAR

Revision as of 15:03, 14 May 2009 by Ygeorgiou

This is a short tutorial on how to deploy an OAR cluster on a reservation of Grid5000 nodes.

I have created an image based on the Debian lenny distribution, kernel 2.6.26, and OAR v2.4.0. The procedure I followed to build it is described here: [1]

The image is named lenny2.6.26-OAR2.4.0 and, for the moment, is available on the nancy, rennes, and grenoble clusters.

Here is the procedure for deploying an OAR cluster on Grid5000:


Connection to a Grid5000 frontend server

I) Connect to the grenoble frontend server by typing:

ssh USERNAME@digitalis.imag.fr

II) Then connect to the nancy frontend server (or rennes, or another site):

ssh nancy.grid5000.fr 

Reservation of nodes

III) Then reserve some nodes in interactive mode:

oarsub -l /nodes=4,walltime=3 -p 'cluster="griffon"' -t deploy -I

The -l option gives the number of nodes and the duration of your job (here 4 nodes with a walltime of 3 hours). The -p option gives the name of the cluster: either "griffon" or "grelon" for nancy, or "paradent" for rennes. The -t deploy option sets the job type to deploy, so that you can deploy an environment on the nodes. Finally, -I makes the job interactive.

Run man oarsub to see all the options.
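As a small illustration of how these options fit together, the sketch below assembles the same oarsub command from three values. The function name build_oarsub_cmd is hypothetical and not part of OAR; it only prints the command, so you can inspect it before running it yourself.

```shell
#!/bin/sh
# build_oarsub_cmd is a hypothetical helper, not an OAR tool.
# It prints (does not run) the reservation command from step III.
#   $1 = number of nodes, $2 = walltime in hours, $3 = cluster name
build_oarsub_cmd() {
    printf "oarsub -l /nodes=%s,walltime=%s -p 'cluster=\"%s\"' -t deploy -I\n" \
        "$1" "$2" "$3"
}

# Same reservation as in step III: 4 nodes for 3 hours on griffon
build_oarsub_cmd 4 3 griffon
```

Changing the arguments (for example 2 1 paradent) gives the matching command for another cluster.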

IV) Once the nodes have been allocated, you can check on your job:

oarstat |grep USERNAME

Preparation for environment deployment

V) For the deployment we will use katapult, an automation tool built on kadeploy, the Grid5000 environment deployment software.

You can copy the katapult executable directly from my account. (You only have to do this step once.)

mkdir -p ~/bin && cp /home/grenoble/ygeorgiou/bin/katapult ~/bin/

VI) You can display a description of the environment you will be using with this command:

kaenvironments -e lenny2.6.26-OAR2.4.0 -l ygeorgiou

Environment Deployment

VII) Then deploy the environment on the 4 nodes you have been allocated with the following command:

~/bin/katapult --min-deployed-nodes 4 --max-deploy-runs 30 --deploy-env lenny2.6.26-OAR2.4.0 --file $OAR_NODEFILE --deploy-user ygeorgiou

If everything goes well, the deployment starts and takes about 3 to 5 minutes. At the end you will see something like this:

Deploy  State
------  -----
11722   terminated
Node    State           Error Description (if any)
----    -----           --------------------------
griffon-9.nancy.grid5000.fr     deployed
griffon-18.nancy.grid5000.fr    deployed
griffon-8.nancy.grid5000.fr     deployed
griffon-17.nancy.grid5000.fr    deployed
Sumary:
first reboot and check: 80
preinstall:149
transfert: 72
last reboot and check: 78
[410] Kadeploy on griffon finished.
[410] All concurrent kadeploys completed, testing nodes.
### Nodes deployed: griffon-17.nancy.grid5000.fr griffon-18.nancy.grid5000.fr griffon-8.nancy.grid5000.fr griffon-9.nancy.grid5000.fr
### Nodes which failed: 
### [411] Had to run 1 kadeploy runs, deployed 4 nodes.
### Good nodes (4) are in $GOOD_NODES (=/tmp/katapult.good.qA8597).
### Bad nodes (0) are in $BAD_NODES (=/tmp/katapult.bad.gt8598).
### [411] Job finished.

This means that you now have 4 nodes with the environment lenny2.6.26-OAR2.4.0 installed.
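If you keep a copy of the katapult output, for example by appending | tee katapult.log to the deployment command (the file name is only an assumption for this sketch), you can extract the list of deployed nodes from the "### Nodes deployed:" line. The helper name deployed_nodes_from_log is hypothetical, and the sketch assumes the log has exactly the format shown in the sample output.

```shell
#!/bin/sh
# deployed_nodes_from_log is a hypothetical helper, not part of katapult.
# It reads a saved katapult log ($1) and prints one deployed node per line,
# assuming a "### Nodes deployed: a b c" line as in the sample output.
deployed_nodes_from_log() {
    sed -n 's/^### Nodes deployed: *//p' "$1" | tr ' ' '\n' | sed '/^$/d'
}

# Example: count the deployed nodes if the (assumed) log file exists
if [ -f katapult.log ]; then
    deployed_nodes_from_log katapult.log | wc -l
fi
```

Comparing this count with the number of nodes you reserved is a quick way to spot a partial deployment.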


OAR cluster Configuration

VIII) At this point you can configure OAR so that these 4 nodes become your own small OAR cluster. To do this, run:

sort -u $OAR_FILE_NODES | tail -3 | ssh -l g5k $(sort -u $OAR_NODEFILE | head -1) "xargs ~g5k/launch_OAR.sh"

When it finishes, the first node will be the cluster server and the remaining 3 nodes will be the computing nodes of your personal cluster.
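The configuration command relies on two things: sort -u collapses the node file to one line per node (the file may list a node several times, typically once per reserved core), and tail -3 matches a reservation of exactly 4 nodes. The sketch below shows the same server/compute split for any number of nodes. The function names are hypothetical, and tail -n +2 (everything after the first line) is swapped in for tail -3.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the server/compute split.
# $1 = path to a node file (for example "$OAR_NODEFILE")

# First node in sorted order: becomes the OAR server
oar_server_node() {
    sort -u "$1" | head -n 1
}

# All remaining nodes: become the computing nodes
oar_compute_nodes() {
    sort -u "$1" | tail -n +2
}

# Only print the split when run inside an OAR job
if [ -n "${OAR_NODEFILE:-}" ]; then
    echo "server:  $(oar_server_node "$OAR_NODEFILE")"
    echo "compute: $(oar_compute_nodes "$OAR_NODEFILE" | tr '\n' ' ')"
fi
```

With the 4 griffon nodes from the sample output, the sorted order puts griffon-17 first, so that node would act as the server.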

IX) Then connect to the first of the 4 nodes you deployed (which is the OAR server):

ssh -l g5k $(sort -u $OAR_FILE_NODES | head -1)

The password is grid5000 and you will be logged in as user g5k. The root user's password is grid5000 as well.

X) If you have reached this point, everything should have worked and you are ready to use your 4-node cluster with OAR.

To verify this, run

oarnodes

to check that all resources are declared as expected, and

oarsub -I 

to launch an interactive job.
