

Several OAR installations are known to have run several million jobs. As the OAR database grows accordingly, problems may arise, depending first of all on the sizing of the database server in use.

The main CIMENT cluster (froggy), for instance, currently runs jobs with ids above 11,000,000, with no known issues. CIMENT uses a PostgreSQL database on a dedicated server (not a VM) with the following hardware specs:

However, some other installations (probably with a less powerful server) are known to have required some maintenance in order to keep OAR fully responsive.

Therefore, this page gathers some know-how on shrinking an OAR database that is becoming too big.

Option 1: reset the database

The solution here is to create a fresh new database for OAR, keeping just the structure of the OAR installation (first of all the definition of the resources), but no job data.

As a result, the counter of job ids will reset to 1.

This solution is fairly easy, but it has the drawback of breaking history (e.g. job dependencies, if any) and of requiring running jobs to be stopped and the queues to be emptied (i.e. breaking the continuity of service).

It involves the following steps:

  • initialize a new database using the oar-database tool
    • typically with the same credentials as the former one
    • typically with a new database name
  • inject structural data from the former database to the new one
    • resource properties: columns of the resources table
    • resource definitions: data (rows) of the resources table
    • admission rules: data (rows) of the admission_rules table
  • make sure the cluster is empty of running jobs (the oarnodesetting --drain command may help)
  • empty the queues (waiting jobs): ask users to wait for the new database to be in service before submitting new jobs (a dedicated admission rule may help here)
  • stop the OAR server
  • change the database in use in oar.conf (on both the server and the frontend)
  • restart the OAR server
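
The "change the database in use" step can be scripted so that the same edit is applied on both the server and the frontend. The following is only a minimal sketch: it assumes the database name is set by a single DB_BASE_NAME line in oar.conf, and the function name set_db_base_name, the sample configuration fragment and the name oar2 are all hypothetical. Check it against your own oar.conf before using it.

```python
import re


def set_db_base_name(conf_text: str, new_name: str) -> str:
    """Return conf_text with its DB_BASE_NAME assignment set to new_name.

    Assumption: the database name is configured by exactly one
    uncommented line of the form DB_BASE_NAME="...".
    """
    pattern = re.compile(r'^([ \t]*DB_BASE_NAME[ \t]*=[ \t]*).*$', re.MULTILINE)
    # A function replacement avoids any escaping issue with new_name.
    new_text, count = pattern.subn(
        lambda m: f'{m.group(1)}"{new_name}"', conf_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one DB_BASE_NAME line, found {count}")
    return new_text


# Hypothetical configuration fragment, for illustration only.
sample = '''DB_TYPE="Pg"
DB_HOSTNAME="localhost"
DB_BASE_NAME="oar"
#DB_BASE_NAME="old"
'''

updated = set_db_base_name(sample, "oar2")
print(updated)
```

Commented-out lines are left untouched, and the function refuses to guess when it finds zero or several active assignments. Writing the result back to the file and restarting the OAR server are deliberately left to the administrator.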

Option 2: drop information regarding old jobs

TBC


Please feel free to contribute to this page by reporting remarks to the oar-users@ mailing list.

wiki/how_to_handle_a_bigger_and_bigger_oar_database.1467202661.txt.gz · Last modified: 2016/06/29 14:17 by neyron