Why use Batch Apex?

 
What is the point of batch apex? I've been trying to do some research but everything I find just tells me how to use it or goes over scenarios where it might be useful - without telling me why I would want to do those things in the batchable context. I'm guessing the limits are different? Does that code run differently in SFDC?

I know there has to be some good reasons but I figured if I was having such a hard time figuring it out, others probably are as well.

Why would I want to run code in a batchable context? What are the advantages/disadvantages?


Attribution to: Ryan Elkins

Possible Suggestion/Solution #1

The elevator pitch for Batch Apex is that it is code which can run asynchronously in the background. It works through all of your data as resources become available, so, like scheduled Apex, it has no guaranteed delivery time.

Most common use case I've heard is for organizations with large amounts of data wanting an automatic data scrubber with some hardcore business logic associated with it. On another project we "weighted" various records, but on insert only weighted them to about 1000 records "above" and "below" the new record. Don't know if they ever added it, but after I left the project the plan was to "smooth" that weight across all records in the system using batch Apex.


Attribution to: joshbirk

Possible Suggestion/Solution #2

A Batch class allows you to define a single job that can be broken up into manageable chunks that will be processed separately.

One example is if you need to make a field update to every Account in your organization. If you have 10,001 Account records in your org, this is impossible in a single transaction, because one transaction can process at most 10,000 DML rows, so you need some way of breaking the work up. In the start() method, you define the query you're going to use in this batch context: 'select Id from Account'. Then the execute() method runs, but only receives a relatively short list of records (default 200). Each execute() runs in its own transactional context, which means almost all of the governor limits apply only to that block. Thus each time execute() runs, you are allowed 200 SOQL queries, 10,000 DML rows, and so on. When that execute() is complete, a new one is started with the next group of 200 Accounts and a brand new set of governor limits. Finally, the finish() method wraps up any loose ends as necessary, like sending a status email.

So your batch that runs against 10,000 Accounts will actually be run in 50 separate execute() transactions, each of which only has to deal with 200 Accounts. Governor limits still apply, but only to each transaction, along with a separate set of limits for the batch as a whole.
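
As a rough illustration of the three methods described above, here is a minimal sketch. The class name AccountDescriptionBatch and the choice of the Description field are assumptions made for the example, not part of the original answer.

```apex
// Hypothetical example class; the name and the field being updated are assumptions.
public class AccountDescriptionBatch implements Database.Batchable<sObject> {

    // start() runs once and defines the full set of records to process.
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    // execute() runs once per chunk of records, each time in its own
    // transaction with a fresh set of governor limits.
    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        List<Account> toUpdate = new List<Account>();
        for (sObject s : scope) {
            Account acct = (Account) s;
            acct.Description = 'Updated by batch';
            toUpdate.add(acct);
        }
        update toUpdate;
    }

    // finish() runs once after the last chunk, e.g. to send a status email.
    public void finish(Database.BatchableContext bc) {
        System.debug('Batch job ' + bc.getJobId() + ' finished.');
    }
}
```

The job could then be queued with Database.executeBatch(new AccountDescriptionBatch(), 200); where the second argument is the chunk size passed to each execute() call.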

Disadvantages of batch processing:

  • It runs asynchronously, which can make it hard to troubleshoot without some coded debugging, logging, and persistent stateful reporting. It also means that it's queued to run, which may cause delays in starting.
  • There's a limit of 5 batches in play at any time, which makes it tricky to start batches from triggers unless you are checking limits.
  • If you need access within execute() to some large part of the full dataset being iterated, this is not available. Each execution only has access to whatever is passed to it, although you can persist instance variables across executions by implementing Database.Stateful.
  • There is still a (fairly large) limit on total Heap size for the entire batch run, which means that some very complex logic may run over, and need to be broken into separate batches.
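
Two of the points above can be sketched in code: keeping state between execute() calls with Database.Stateful, and checking how many batches are in play before starting another. This is only a sketch; the class name CountingAccountBatch and the helper startIfCapacity() are hypothetical, and the threshold of 5 simply mirrors the limit mentioned above. The check is not atomic, so another job could still be queued between the count and the call to executeBatch.

```apex
// Hypothetical example class; names are assumptions made for illustration.
public class CountingAccountBatch
        implements Database.Batchable<sObject>, Database.Stateful {

    // Because the class implements Database.Stateful, this instance field
    // keeps its value across execute() calls instead of being reset.
    private Integer processed = 0;

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        processed += scope.size();
    }

    public void finish(Database.BatchableContext bc) {
        System.debug('Records seen across all chunks: ' + processed);
    }

    // Hypothetical helper: only start the batch if fewer than 5 batch jobs
    // are currently queued or running. Returns null if nothing was started.
    public static Id startIfCapacity() {
        Integer active = [
            SELECT COUNT()
            FROM AsyncApexJob
            WHERE JobType = 'BatchApex'
            AND Status IN ('Queued', 'Preparing', 'Processing')
        ];
        if (active < 5) {
            return Database.executeBatch(new CountingAccountBatch());
        }
        return null;
    }
}
```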

Attribution to: Jeremy Nottingham

Possible Suggestion/Solution #3

Maybe some drawbacks of Apex Batch can be part of an answer to your question...

Apex Batch is something like the last resort for Apex developers to circumvent limitations of the Salesforce Platform when working with "large" data volumes. When using Batch as the asynchronous backbone of a bigger system, you soon find obvious drawbacks:

  • Jobs are put in a queue, but when that queue is full (Max. 5 concurrent batches), the job fails instead of being scheduled for later processing.
  • Jobs of different Batches might work on the same data and produce conflicts. There is no locking mechanism or guaranteed order.
  • Poor support for handling partly failed batch runs. It's really hard to find out where and why a single job failed and to restore data consistency.
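
One way to soften the last point is to record failures yourself. The following is a minimal sketch, not a complete solution: the class name ReportingAccountBatch is hypothetical, and collecting messages in a stateful list is an assumption that only suits a modest number of failures, since the list counts against heap size.

```apex
// Hypothetical example class; names are assumptions made for illustration.
public class ReportingAccountBatch
        implements Database.Batchable<sObject>, Database.Stateful {

    // Failure messages gathered across all chunks (kept via Database.Stateful).
    private List<String> failures = new List<String>();

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        List<Account> accounts = (List<Account>) scope;
        for (Account acct : accounts) {
            acct.Description = 'Updated by batch';
        }
        // allOrNone = false: records that fail do not roll back the others.
        Database.SaveResult[] results = Database.update(accounts, false);
        for (Integer i = 0; i < results.size(); i++) {
            if (!results[i].isSuccess()) {
                for (Database.Error err : results[i].getErrors()) {
                    failures.add(accounts[i].Id + ': ' + err.getMessage());
                }
            }
        }
    }

    public void finish(Database.BatchableContext bc) {
        // AsyncApexJob holds the platform's own summary of the run.
        AsyncApexJob job = [
            SELECT Status, NumberOfErrors, JobItemsProcessed, TotalJobItems
            FROM AsyncApexJob
            WHERE Id = :bc.getJobId()
        ];
        System.debug('Status: ' + job.Status
            + ', chunks processed: ' + job.JobItemsProcessed + '/' + job.TotalJobItems
            + ', chunks with unhandled errors: ' + job.NumberOfErrors);
        System.debug('Record-level failures: ' + failures.size());
    }
}
```

This does not restore data consistency by itself; it only makes it easier to see which records failed and why.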

I am currently trying to create an improved custom Apex queue as part of a GitHub project. I would love to get your feedback, and I invite you to fork it and collaborate on it.


Attribution to: Robert Sösemann
This content is remixed from stackoverflow or stackexchange. Please visit https://salesforce.stackexchange.com/questions/920
