Viable design for multiple batch jobs

I've recently run into several situations where I need to execute multiple different batch jobs at the same time, and almost immediately hit the (somewhat inexplicable) SFDC limit of 5 batch jobs queued at once.

I've worked around this limit in a number of ways, none of which seem ideal to me.

  • returning an Iterable<SObject> from the start method that contains a mix of all the SObjects I need to act upon, and then using dynamic inspection in the execute method to figure out what I'm supposed to do with each row. This seems fine for small jobs, but presumably wouldn't scale to millions of records due to heap size etc.
  • using a scheduled Apex job that runs once and executes a batch; that batch, in its finish method, schedules another Apex job (to run in, say, 1 minute) that executes the next batch in the chain, and so on (see the sketch after this list).
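
For reference, a minimal sketch of that second workaround, with hypothetical class names and a placeholder query (each class would live in its own file in a real org):

  // Sketch of the chaining workaround - not production code.
  global class FirstJobBatch implements Database.Batchable<SObject> {
      global Database.QueryLocator start(Database.BatchableContext bc) {
          // Placeholder query - replace with the records this job should touch
          return Database.getQueryLocator([SELECT Id FROM Contact]);
      }

      global void execute(Database.BatchableContext bc, List<SObject> scope) {
          // Do this job's work on the current scope
      }

      global void finish(Database.BatchableContext bc) {
          // Schedule the next link in the chain to start in ~1 minute
          Datetime runAt = Datetime.now().addMinutes(1);
          String cron = runAt.second() + ' ' + runAt.minute() + ' ' + runAt.hour()
                      + ' ' + runAt.day() + ' ' + runAt.month() + ' ? ' + runAt.year();
          System.schedule('Next batch in chain ' + runAt.getTime(), cron, new NextJobScheduler());
      }
  }

  global class NextJobScheduler implements Schedulable {
      global void execute(SchedulableContext sc) {
          // FirstJobBatch stands in here for whatever the next batch class in the chain is
          Database.executeBatch(new FirstJobBatch());
          // Release the one-off CronTrigger slot so it stops counting against scheduled-job limits
          System.abortJob(sc.getTriggerId());
      }
  }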

Is there a more viable/flexible pattern that I could be using? What I really need is a flexible work queue where on-demand, not-very-time-sensitive batch jobs can be queued for future (serial) execution based on user activity, and that doesn't violate platform limits on batch or scheduled jobs - it really needs to contribute a maximum of one scheduled or batch job at any given point.

Not sure it will help, but the characteristics of the job set I'm envisioning are these:

  • each task executes on a single SObject type and does one job: e.g. setting unset fields, doing fake M-D rollups, etc.
  • some tasks are operating on a small subset of a table, others need to walk over every Contact/Lead/etc. with potentially millions of rows per batch job.
  • while some tasks are known to be needed every N hours / days, most of them are needed only after some user-driven operation: e.g. a user clicks a big red button in an admin console, does something to more than N records, or a new/updated record of a certain type requires its downstream records to be operated upon.

Attribution to: jkraybill

Possible Suggestion/Solution #1

I've run into this problem a couple of times. My approach has been to process a large number of records in batches. For example, you could structure your queries to return groups of states (AL, AR, AZ etc.) or you could assign a sequence number to every record and process the batches by sequence number.
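
For what it's worth, here is a rough sketch of the sequence-number variant, assuming a hypothetical pre-populated number field Sequence__c on Contact - each batch run covers one slice of the table:

  // Sketch only: one batch instance processes one sequence-number slice.
  global class ContactSliceBatch implements Database.Batchable<SObject> {
      private Integer lowBound;
      private Integer highBound;

      global ContactSliceBatch(Integer lowBound, Integer highBound) {
          this.lowBound = lowBound;
          this.highBound = highBound;
      }

      global Database.QueryLocator start(Database.BatchableContext bc) {
          // Sequence__c is a hypothetical number field assigned to every record up front
          return Database.getQueryLocator(
              [SELECT Id FROM Contact
               WHERE Sequence__c >= :lowBound AND Sequence__c < :highBound]);
      }

      global void execute(Database.BatchableContext bc, List<SObject> scope) {
          // Process this slice of Contacts
      }

      global void finish(Database.BatchableContext bc) {}
  }

  // Usage: Database.executeBatch(new ContactSliceBatch(0, 50000));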

The most difficult part of this may be not exceeding your 5 job limit. You'll need to abort cron job entries to keep the queue short. I query on recently created jobs...

  Datetime currentTime = Datetime.now();
  Datetime lastTime = currentTime.addHours(-1);
  // CronTrigger entries created within the last hour
  List<CronTrigger> ct = [SELECT Id, CreatedDate, CronExpression FROM CronTrigger
                          WHERE CreatedDate > :lastTime
                          ORDER BY CreatedDate ASC];
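
One possible follow-up (my own assumption, not part of the original answer) is to abort one-off CronTrigger rows that have already fired, so they stop counting against the scheduled-job limit:

  for (CronTrigger stale : [SELECT Id FROM CronTrigger
                            WHERE TimesTriggered > 0 AND NextFireTime = null]) {
      System.abortJob(stale.Id); // remove the spent one-off schedule
  }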

Attribution to: Vin D'Amico

Possible Suggestion/Solution #2

I'd suggest a quick scheduled job that runs 2-3 times a day, checks the status of currently running batches, and then schedules further batches...

The basic building block is a schedulable batch job for each separate piece of processing you need; see the sketch below.
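
Here is a minimal sketch of what such a building block could look like (my own illustration with a hypothetical Lead query, not code from the answer): a class that is both a batch and schedulable, and only starts itself when the org is safely under the concurrent batch limit.

  global class ProcessingJob implements Database.Batchable<SObject>, Schedulable {

      global Database.QueryLocator start(Database.BatchableContext bc) {
          // Placeholder query for this job's record set
          return Database.getQueryLocator([SELECT Id FROM Lead WHERE IsConverted = false]);
      }

      global void execute(Database.BatchableContext bc, List<SObject> scope) {
          // The actual processing for this job type
      }

      global void finish(Database.BatchableContext bc) {}

      // Schedulable entry point: check how many batches are already active
      // before kicking off another one.
      global void execute(SchedulableContext sc) {
          Integer active = [SELECT COUNT() FROM AsyncApexJob
                            WHERE JobType = 'BatchApex'
                            AND Status IN ('Queued', 'Processing', 'Preparing')];
          if (active < 5) {
              Database.executeBatch(this);
          }
      }
  }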


Attribution to: Vid L

Possible Suggestion/Solution #3

There are some tried and tested patterns for doing this, but if you'd like to save yourself some work and get extra functionality at the same time, you might want to check out Skoodat Relax. Skoodat has built a package on the platform specifically for scheduling batches with as much power as possible, and it sounds like a good match for your needs.

Otherwise, the limit of five jobs is being increased, and you can take the approach you've suggested. Depending on the exact details of the job you're running, you may be able to leverage triggers to do the processing on the other objects and just run the batch on your main record set.
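
As one hedged illustration of the trigger idea (my own sketch; Account_Rating__c is a hypothetical custom field, and this is not code from the answer):

  trigger AccountAfterUpdate on Account (after update) {
      // Push a value down to related Contacts right away so the batch
      // only has to walk the main record set.
      Map<Id, Account> updated = (Map<Id, Account>) Trigger.newMap;
      List<Contact> related = [SELECT Id, AccountId FROM Contact
                               WHERE AccountId IN :updated.keySet()];
      for (Contact c : related) {
          c.Account_Rating__c = updated.get(c.AccountId).Rating; // hypothetical field
      }
      if (!related.isEmpty()) {
          update related;
      }
  }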


Attribution to: Matt Lacey

Possible Suggestion/Solution #4

You could also create your own custom Apex batch queue. I did this myself and made it public as the GitHub project SObject Work Queue. I had the following design goals:

  • Must avoid the max-5-batches-in-parallel limit - work processed through the queue should never run into it.
  • The queue needs to be generic enough that "work" on any type of database object can be enqueued.
  • Any type of modification of database objects needs to be possible, handled transparently by the queue.
  • Provides better error diagnostics than Batch Apex: it knows the last successful Id, keeps the full stack trace and sends email to the developers.
  • Secures data integrity as well as Batch Apex or better. Failures should not leave data in an inconsistent state, or the user of the infrastructure should be able to handle them.
  • Optimistic locking: instead of locking many records, we simply don't process work on Ids that already have other work scheduled (see the sketch after this list).
  • Work that can be run synchronously should not be queued and processed asynchronously.
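
To illustrate the optimistic-locking idea (a hedged sketch, not the actual SObject Work Queue API - SObjectWork__c, Target_Id__c and Status__c are made-up names):

  // Drop Ids that already have pending work instead of locking rows.
  Set<Id> requestedIds = new Set<Id>(); // Ids the new piece of work would touch

  Set<Id> alreadyQueued = new Set<Id>();
  for (SObjectWork__c pending : [SELECT Target_Id__c FROM SObjectWork__c
                                 WHERE Status__c IN ('Queued', 'Processing')
                                 AND Target_Id__c IN :requestedIds]) {
      alreadyQueued.add((Id) pending.Target_Id__c);
  }

  Set<Id> idsToEnqueue = requestedIds.clone();
  idsToEnqueue.removeAll(alreadyQueued);
  // ...enqueue work only for idsToEnqueue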

Maybe you want to check whether it could be used in your case, or collaborate on GitHub to extend it for your purpose.

Here is an overview sequence diagram that shows how work is defined, enqueued and processed in such a custom Apex Queue:

[Sequence diagram: how work is defined, enqueued and processed in the custom Apex queue]


Attribution to: Robert Sösemann
This content is remixed from stackoverflow or stackexchange. Please visit https://salesforce.stackexchange.com/questions/159
