Batch processing on Linux

Solution 1

I'd set up some kind of queueing service; a quick Google turns up plenty of "ready to use" options.

Depending on your needs you could simply:

  • create a wrapper where users submit jobs,
  • have the wrapper write the job to a socket/file/whatever,
  • create a consumer that runs jobs one by one, waiting for each to finish (see the sketch below),
  • have cron call the consumer regularly (every 5 minutes or so),
    • with some locking mechanism, of course, so that only n jobs run at a time (where n ≥ 1),
  • if there are no more jobs, do nothing,
  • if there are more jobs, grab the next and wait for it to finish.
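
A minimal sketch of such a consumer, assuming jobs are dropped as executable files into a spool directory (the paths and the flock-based lock are illustrative choices, not anything prescribed):

    #!/bin/sh
    # run-queue.sh -- called by cron every 5 minutes; drains the job spool.
    SPOOL=/var/spool/jobs          # illustrative spool directory
    LOCK=/var/run/run-queue.lock   # illustrative lock file

    # flock on fd 9 ensures only one consumer runs at a time (n = 1).
    exec 9>"$LOCK"
    flock -n 9 || exit 0           # another consumer is already running

    for job in "$SPOOL"/*; do
        [ -e "$job" ] || break     # no more jobs: do nothing
        sh "$job"                  # grab the next job and wait for it to finish
        rm -f "$job"               # dequeue
    done

The glob expands in alphabetical order, so timestamp-prefixed filenames give you FIFO behaviour for free.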

Actually there's more to it: you might have requirements that call for a priority queue, which brings up problems like starving jobs and similar, but it's not that bad to get something up and running quite fast.

If lpd, as suggested by womble, does the job, I'd take that. Having such a system maintained by a larger community is of course better than creating your own bugs for problems others have already solved :)

Also, the queueing service has the advantage of decoupling the resources from the actual number crunching. By making the jobs available over a network connection you can simply throw hardware at a (possible) scaling problem and get nearly endless scalability.

Solution 2

A heavyweight solution to your problem is to use something like Sun Grid Engine.

Sun Grid Engine (SGE) is distributed resource management software; it allows the resources within a cluster/machine (CPU time, software, licenses, etc.) to be utilized effectively.

Here is a small tutorial on how to use SGE.
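
For a flavour of what submission looks like, a hedged sketch (the flags are standard SGE, but run_etl.sh and the job name are made up):

    # Submit a job script to SGE; -cwd runs it from the current directory,
    # -N gives it a readable name in the queue.
    qsub -cwd -N etl_job run_etl.sh

    # Watch the queue; jobs wait in state "qw" until a slot frees up.
    qstat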

Solution 3

Two solutions spring to mind:

  1. Use xargs -P to control the maximum parallel processes at one time.
  2. Create a Makefile and spawn with make -j.

They're actually both summarised in more detail in this SO thread.

These may not fit the structure of your scripts, though.
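
By way of illustration, a quick sketch of both approaches (jobs.txt, the Makefile, and the script names are all made up here):

    # 1. xargs: jobs.txt lists one job script per line (no spaces in
    #    paths); -P 1 runs them strictly one at a time, and raising -P
    #    allows that many in parallel.
    xargs -P 1 -n 1 sh < jobs.txt

    # 2. make: declare each job as a target, then let -j cap parallelism.
    #    Makefile:
    #        all: job1 job2
    #        job1: ; ./extract.sh
    #        job2: ; ./load.sh
    make -j1 all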

Solution 4

You could check out some of the batch systems used for scheduling jobs on clusters; they have the option to monitor resource usage and declare a system too loaded to dispatch more work to. You could also easily configure them to run only one job at a time, though for that you may be better off with something less complex than a full-fledged batch scheduler (in the spirit of keeping things simple).

As for freely available batch/scheduling systems, the two that spring to mind are OpenPBS/Torque and SGE.

Edited to add: if you're ever going to add more processing capacity in the future in the form of more boxes, a batch/scheduling system like Torque/OpenPBS/SGE may be a good choice, as such systems are basically built to manage compute resources and distribute workloads across them.
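
If you do go the Torque route, capping a queue at a single running job is a one-liner in qmgr (the queue name "batch" is illustrative):

    # Limit the queue to one concurrently running job.
    qmgr -c "set queue batch max_running = 1"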

Solution 5

You can always use lpd -- yeah, old school, but it's really a generalised batch processing control system masquerading as a print server.
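
The trick, roughly, is a printcap entry whose input filter executes the submitted "print job" as a script; a rough, untested sketch (the paths and the filter script are mine):

    # /etc/printcap -- a fake "printer" whose filter runs submitted jobs
    batch:\
        :sd=/var/spool/batch:\
        :if=/usr/local/bin/run-job:\
        :sh:

    # Submit work to the queue with:
    #     lpr -P batch myjob.sh

lpd serialises the jobs in each queue, which is exactly the one-at-a-time behaviour you're after.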

Comments

  • Andrew Williams
    Andrew Williams almost 2 years

    We're currently setting up a server to do some heavy lifting (ETL) after another process has finished within the business. At the moment we're firing off jobs either via scheduled cron jobs or remote execution (via ssh). Earlier this week we hit an issue with too many jobs running side by side on the system, which brought all the jobs to a snail's pace as they fought for CPU time.

    I've been looking for a batch scheduler: a system where we can insert jobs into a run queue and the system will process them one by one. Can anyone advise on a program/system to do this? Low cost / FOSS would be appreciated due to the shoe-string nature of this project.

    • nik
      nik about 15 years
      There is a somewhat old but interesting article at linuxjournal.com/article/4087
    • Andrew Williams
      Andrew Williams about 15 years
      Yes, a nice article, but limited to scheduling on a time basis. As I mentioned in the question, we have jobs that are time-scheduled and jobs started at the end of a remote job, which could be at any time. We aim to allow only one job to run at a time, with any extras triggered remotely or via cron going into a queue of jobs to be processed.
  • Andrew Williams
    Andrew Williams about 15 years
    Interesting idea; does any documentation exist on using it as a general batch processor?
  • Andrew Williams
    Andrew Williams about 15 years
    We do run jobs via cron, but the issue we have is that we don't know how long jobs will run for; sometimes it'll be 12 million rows and 4 hours, and the next time 100k rows and 15 minutes.
  • pauska
    pauska about 15 years
    Ohh, sorry, I didn't quite understand your scenario then. How about getting the initial process (the one you want to wait for before doing anything else) to write a status file? The application writes "WAIT" into the status file when it starts up, and writes "OK" into the file when it's successfully done. The cron job starts a batch script which exits 0 if the file != OK.
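
    A rough sketch of that guard, with made-up paths and file contents:

        # Upstream ETL process:
        #     echo WAIT > /var/run/etl.status    # at start-up
        #     echo OK   > /var/run/etl.status    # on success
        # Cron-launched batch script bails out unless upstream is done:
        [ "$(cat /var/run/etl.status 2>/dev/null)" = "OK" ] || exit 0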
  • Andrew Williams
    Andrew Williams about 15 years
    sqs is exactly what I was looking for. A simple system for queueing jobs up. Thanks for your help.
  • Andrew Williams
    Andrew Williams about 15 years
    Yes, I looked at batch, but as you mentioned it executes based on load average. This would be an issue due to the initial stages of our scripts, which do a large DB extract; that's a low-CPU but high-network-bandwidth task and it doesn't raise the loadavg above 0.3. Under that criterion another job would be run at the same time.
  • Vatine
    Vatine about 15 years
    I haven't tried it with lpd, but I have tried it with lpsched (the old SysV scheduler). There it's simple, as the "printer backends" are all shell scripts (by default). At a very, very previous job, we had a Rayshade "print queue" that rendered jobs and dumped the resulting images in user home directories.
  • idelvall
    idelvall over 7 years
    I'm the creator, btw. Hope you find it useful