The project

In High Performance Computing (HPC), evaluating a Resource Management System (RMS) is a complex task. A well-known approach is to use workload traces and provide a platform that replays past scenarios under different scheduling strategies, features, or even different RMS software.
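To make the replay idea concrete, here is a minimal sketch of replaying the same small trace under two queue orderings. Everything in it is an assumption for illustration: the `Job` fields, the `replay` function, the two policies (first-come-first-served and shortest-job-first), and the rule that a job at the head of the queue blocks the ones behind it (no backfilling). It does not reflect how OAR or any other RMS schedules jobs.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int
    submit: int   # submission time, in seconds
    runtime: int  # execution time, in seconds
    procs: int    # number of processors needed


def replay(jobs, total_procs, priority):
    """Replay a trace on total_procs processors.

    priority is a key function that orders the waiting queue.
    Returns a dict mapping job_id to wait time.
    """
    for job in jobs:
        if job.procs > total_procs:
            raise ValueError(f"job {job.job_id} can never fit on the cluster")

    pending = sorted(jobs, key=lambda j: j.submit)  # not yet submitted
    queue = []    # submitted jobs waiting for processors
    running = []  # min-heap of (end_time, procs)
    free = total_procs
    waits = {}
    i = 0

    while i < len(pending) or queue or running:
        # Jump to the next event: a submission or a completion.
        events = []
        if i < len(pending):
            events.append(pending[i].submit)
        if running:
            events.append(running[0][0])
        now = min(events)

        # Release processors of jobs that have finished.
        while running and running[0][0] <= now:
            _, procs = heapq.heappop(running)
            free += procs

        # Enqueue jobs submitted up to now.
        while i < len(pending) and pending[i].submit <= now:
            queue.append(pending[i])
            i += 1

        # Start jobs from the head of the queue while they fit.
        queue.sort(key=priority)
        while queue and queue[0].procs <= free:
            job = queue.pop(0)
            free -= job.procs
            waits[job.job_id] = now - job.submit
            heapq.heappush(running, (now + job.runtime, job.procs))

    return waits


def fcfs(job):
    return (job.submit, job.job_id)


def sjf(job):
    return (job.runtime, job.submit, job.job_id)


trace = [Job(1, 0, 10, 4), Job(2, 1, 8, 4), Job(3, 2, 2, 4)]
waits_fcfs = replay(trace, total_procs=4, priority=fcfs)
waits_sjf = replay(trace, total_procs=4, priority=sjf)
```

On this toy trace the two orderings give different wait times for jobs 2 and 3, which is the kind of comparison a replay platform is meant to support at scale.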

For the OAR Resource Management System, the research group has developed a collection of Ruby tools, gathered in the xionee project [1], to simplify the management of workload traces (DB collection, export to known workload formats [2], automatic resubmission, graphical visualization of execution scenarios, etc.). Another tool, based on Xen and LVM, automatically deploys an OAR-enabled virtual cluster for tests.
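As an illustration of what a workload trace looks like, the sketch below reads the first five fields of a trace in the Standard Workload Format (SWF) defined by the Parallel Workloads Archive. It assumes SWF is one of the "known workload formats" meant above; the text does not name the format. The names `SwfJob` and `parse_swf` are hypothetical and are not part of xionee, and the sample records are invented.

```python
from typing import Iterable, Iterator, NamedTuple


class SwfJob(NamedTuple):
    job_id: int   # field 1: job number
    submit: int   # field 2: submit time, seconds from trace start
    wait: int     # field 3: wait time in seconds
    runtime: int  # field 4: run time in seconds
    procs: int    # field 5: number of allocated processors


def parse_swf(lines: Iterable[str]) -> Iterator[SwfJob]:
    """Yield one SwfJob per data line of an SWF trace.

    Lines starting with ';' are header comments. A value of -1
    means the field is unknown and is passed through unchanged.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        if len(fields) < 5:
            raise ValueError(f"malformed SWF line: {line!r}")
        yield SwfJob(*(int(value) for value in fields[:5]))


# Two made-up records with the 18 whitespace-separated SWF fields.
sample = """\
; Version: 2.2
1 0 5 100 4 -1 -1 4 120 -1 1 1 1 -1 1 -1 -1 -1
2 10 0 50 2 -1 -1 2 60 -1 1 2 1 -1 1 -1 -1 -1
"""

swf_jobs = list(parse_swf(sample.splitlines()))
```

Only the fields a simple replay would need are kept here; a fuller reader would also carry the requested processors, requested time, and status fields.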

The goal of this project is to develop a complete simulation/emulation platform, built on the above tools, that will serve as a testing infrastructure for resource management systems.

Difficulty

Medium

Skills

Mentors

References

  1. “Parallel workload archives”: http://www.cs.huji.ac.il/labs/parallel/workload