Sun Grid Engine: set memory requirements per job
On our cluster we use h_vmem to enforce job memory allocation.
The thing you appear to be missing is setting the available amount as a consumable resource. In qconf -mc, or in the qmon complexes dialog, you need to mark the resource as both requestable and consumable.
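To see what you are changing, you can dump the current complex definitions with qconf -sc. On a stock install the consumable column for h_vmem usually reads NO; it must read YES for per-host accounting to work. A sketch (exact column values may differ on your cluster):

```shell
# Show the current h_vmem definition in the complex list.
# For h_vmem to act as a per-host consumable, both the
# "requestable" and "consumable" columns must read YES
# (stock installs typically ship consumable as NO).
qconf -sc | grep h_vmem
# Expected shape, columns:
# name     shortcut  type    relop  requestable  consumable  default  urgency
# h_vmem   h_vmem    MEMORY  <=     YES          YES         0        0
```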
Then, on each host, use qconf -me to set the amount of available memory in complex_values.
For example, we have host definitions that look like:
hostname node004
load_scaling NONE
complex_values h_vmem=30.9G,exclusive=1,version=3.3.0
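Once that is in place, you can check what the scheduler currently tracks for the consumable with qhost (shown here for the example host above):

```shell
# Report the h_vmem resource values the scheduler tracks per host;
# -F limits the report to the named resource.
qhost -F h_vmem
```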
grs

Updated on September 17, 2022

Comments
- grs (almost 2 years): I want to be able to set memory requirements per job. For instance: to run 5 jobs, each of which I know will need 4 GB of memory. I have 16 GB of RAM on the Ubuntu server and 16 GB of swap, and I want to avoid using the swap. Can I do something like:

  qsub -l mem_required_for_my_job=4G job1
  qsub -l mem_required_for_my_job=4G job2
  qsub -l mem_required_for_my_job=4G job3
  qsub -l mem_required_for_my_job=4G job4
  qsub -l mem_required_for_my_job=4G job5

  The jobs will require 4 GB at some point, but not at the beginning. How do I tell SGE what my requirements are? How do I avoid scheduling 5 x 4 GB when only 16 GB is available? I read the user guide and tried s_vmem, h_vmem, mem_free, and mem_used. None of them does what I want. I do not want my jobs to get killed in the middle of processing; I want them not to be scheduled unless the maximum resources needed are available. Can I do this? Thank you all!
- grs (over 13 years): Please check the comment on the other answer.
- grs (over 13 years): I tried h_vmem, virtual_free, and quotas. None of them handles the task the way I want. h_vmem protects the system by killing jobs that exceed their requested amount. Neither quota nor virtual_free did anything to prevent memory exhaustion; the OS killed the job in that case. I want to avoid killing the jobs. I would like them to stay in the queue, waiting for their requested resources to become available. Is this possible?
-
grs (over 13 years): Exactly. I want to be able to set memory requirements per job, but I do not want the job killed if it exceeds them (for now). What I can't understand is how GE allocates memory. If I have 5 jobs needing 4 GB each at their peak, how many would run simultaneously on 16 GB without swapping? If 4 jobs are running and take a total of 10 GB at the start, would the 5th one be let in? How will GE know the expected peak memory usage per job?
-
Rahim (over 13 years): GridEngine will run exactly as many jobs as fit the available complex resources. So if you define a node to have 16 GB of h_vmem available and you submit 5 jobs requesting 4 GB each, GridEngine will only place 4 of them on the node at once. Your job should always request the peak amount it expects to use. If you have swap space on your nodes, you can include that amount in the h_vmem complex value.
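To make that concrete, here is a sketch of the behavior on a node whose complex_values carry h_vmem=16G (the job script name is a placeholder):

```shell
# Assumes a node configured with: complex_values h_vmem=16G
# Submit five jobs, each reserving 4G of the consumable.
for i in 1 2 3 4 5; do
    qsub -l h_vmem=4G job.sh
done
# qstat should now show four jobs running ("r") -- the consumable
# is exhausted at 4 x 4G -- and the fifth waiting in "qw" until
# one of the others finishes and frees its reservation.
qstat
```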
- grs (over 13 years): Great! What arguments should I use to specify my job's requirements? I believe it should be something like qsub -l mem_required=4G job1. What do I have to use after the -l part?
-
Rahim (over 13 years): The arguments use the same name as the complex. So if you have h_vmem configured as a consumable resource and listed under a host's complex_values, your command would look like: qsub -l h_vmem=4G job1.
-
Rahim (over 13 years): Also, here's a quick blog post from gridengine.info regarding memory limits: gridengine.info/2009/12/01/…
- grs (over 13 years): Just to complete the discussion: if I set up s_vmem and h_vmem in my complex resources via qconf -me hostname, then I must pass both of them: qsub -l s_vmem=2G,h_vmem=3G job1. If I pass just one or none, the job quits without any indication of why. So -l <arg1>,<arg2> becomes mandatory for everyone. Thanks!
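Since both limits must always travel together on this setup, a tiny hypothetical wrapper can enforce the convention (submit_with_mem is an illustrative name, not an SGE command):

```shell
#!/bin/sh
# Hypothetical convenience wrapper: always pass both the soft
# (s_vmem) and hard (h_vmem) limits so a submission never fails
# silently for supplying only one of them.
# Usage: submit_with_mem <soft> <hard> <job> [extra qsub args...]
submit_with_mem() {
    soft="$1"
    hard="$2"
    shift 2
    qsub -l "s_vmem=${soft},h_vmem=${hard}" "$@"
}
```

For example, submit_with_mem 2G 3G job1 expands to qsub -l s_vmem=2G,h_vmem=3G job1.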