Wednesday, November 3, 2010

Java Batch Updates


  Statements

The batch update facility allows a Statement object to submit a set of heterogeneous update commands together as a single unit, or batch, to the underlying DBMS. In the example below all of the update operations required to insert a new employee into a fictitious company database are submitted as a single batch.

// turn off autocommit
con.setAutoCommit(false);

Statement stmt = con.createStatement();

stmt.addBatch("INSERT INTO employees VALUES (1000, 'Joe Jones')");
stmt.addBatch("INSERT INTO departments VALUES (260, 'Shoe')");
stmt.addBatch("INSERT INTO emp_dept VALUES (1000, 260)");

// submit a batch of update commands for execution
int[] updateCounts = stmt.executeBatch();


In the example, autocommit mode is disabled to prevent the driver from committing the transaction when Statement.executeBatch() is called. Disabling autocommit allows an application to decide whether or not to commit the transaction in the event that an error occurs and some of the commands in a batch cannot be processed successfully. For this reason, autocommit should always be turned off when batch updates are done. The commit behavior of executeBatch is always implementation defined when an error occurs and autocommit is true.

To keep our discussion of batch updates general, we define the term element to refer to an individual member of a batch. As we have seen, an element in a batch is just a simple command when a Statement object is being used. Although we are focusing on using Statement objects to do batch updates in this section, the discussion that follows applies to PreparedStatment and CallableStatement objects, as well.

In the new JDBC API, a Statement object has the ability to keep track of a list of commands -or batch-that can be submitted together for execution. When a Statement object is created, its associated batch is empty-the batch contains no elements. The Statement.addBatch() method adds an element to the calling statement's batch. The method Statement.clearBatch() (not shown above) can be called to reset a batch if the application decides not to submit a batch of commands that has been constructed for a statement.

Successful execution

The Statement.executeBatch() method submits a statement's batch to the underlying data source for execution. Batch elements are executed serially (at least logically) in the order in which they were added to the batch. When all of the elements in a batch execute successfully, executeBatch() returns an integer array containing one entry for each element in the batch. The entries in the array are ordered according to the order in which the elements were processed (which, again, is the same as the order in which the elements were originally added to the batch). An entry in the array may have the following values:

  1. If the value of an array entry is greater than or equal to zero, then the batch element was processed successfully and the value is an update count indicating the number of rows in the database that were effected by the element's execution.
  2. A value of -2 indicates that a element was processed successfully, but that the number of effected rows is unknown.
Calling executeBatch() closes the calling Statement object's current result set if one is open. The statement's internal list of batch elements is reset to empty once executeBatch() returns. The behavior of the executeQuery, executeUpdate, or execute methods is implementation defined when a statement's batch is non-empty.

ExecuteBatch() throws a BatchUpdateException if any of the elements in the batch fail to execute properly, or if an element attempts to return a result set. Only DDL and DML commands that return a simple update count may be executed as part of a batch. When a BatchUpdateException is thrown, the BatchUpdateException.getUpdateCounts() method can be called to obtain an integer array of update counts that describes the outcome of the batch execution.

Handling failures during execution

A JDBC driver may or may not continue processing the remaining elements in a batch once execution of an element in a batch fails. However, a JDBC driver must always provide the same behavior when used with a particular DBMS. For example, a driver cannot continue processing after a failure for one batch, and not continue processing for another batch.

If a driver stops processing after the first failure, the array returned by BatchUpdateException.getUpdateCounts() will always contain fewer entries than there were elements in the batch. Since elements are executed in the order that they are added to the batch, if the array contains N elements, this means that the first N elements in the batch were processed successfully when executeBatch() was called.

When a driver continues processing in the presence of failures, the number of elements, N, in the array returned by BatchUpdateException.getUpdateCounts()is always equal to the number of elements in the batch. The following additional array value is returned when a BatchUpdateException is thrown and the driver continues processing after a failure:

  1. A value of -3 indicates that the command or element failed to execute successfully. This value is also returned for elements that could not be processed for some reason-such elements fail implicitly.
JDBC drivers that do not continue processing after a failure never return -3 in an update count array. Drivers of this type simply return a status array containing an entry for each command that was processed successfully.

A JDBC technology based application can distinguish a JDBC driver that continues processing after a failure from one that does not by examining the size of the array returned by BatchUpdateException.getUpdateCounts(). A JDBC driver that continues processing always returns an array containing one entry for each element in the batch. A JDBC driver that does not continue processing after a failure will always return an array whose number of entries is less than the number of elements in the batch.

  PreparedStatements

An element in a batch consists of a parameterized command and an associated set of parameters when a PreparedStatement is used. The batch update facility is used with a PreparedStatement to associate multiple sets of input parameter values with a single PreparedStatement object. The sets of parameter values together with their associated parameterized update command can then be sent to the underlying DBMS engine for execution as a single unit.

The example below inserts two new employee records into a database as a single batch. The PreparedStatement.setXXX() methods are used to create each parameter set (one for each employee), while the PreparedStatement.addBatch() method adds a set of parameters to the current batch.

// turn off autocommit
con.setAutoCommit(false);

PreparedStatement stmt = con.prepareStatement(
"INSERT INTO employees VALUES (?, ?)");

stmt.setInt(1, 2000);
stmt.setString(2, "Kelly Kaufmann");
stmt.addBatch();

stmt.setInt(1, 3000);
stmt.setString(2, "Bill Barnes");
stmt.addBatch();

// submit the batch for execution
int[] updateCounts = stmt.executeBatch();


Finally, PreparedStatement.executeBatch() is called to submit the updates to the DBMS. Calling PreparedStatement.executeBatch() clears the statement's associated list of batch elements. The array returned by PreparedStatement.executeBatch() contains an element for each set of parameters in the batch, similar to the case for Statement. Each element either contains an update count or the generic `success' indicator (-2).

Error handling in the case of PreparedStatement objects is the same as error handling in the case of Statement objects. Some drivers may stop processing as soon as an error occurs, while others may continue processing the rest of the batch. As for Statement, the number of elements in the array returned by BatchUpdateException.getUpdateCounts() indicates whether or not the driver continues processing after a failure. The same three array element values are possible, as for Statement. The order of the entries in the array is the same order as the order in which elements were added to the batch.

  Callable Statements

The batch update facility works the same with CallableStatement objects as it does with PreparedStatement objects. Multiple sets of input parameter values may be associated with a callable statement and sent to the DBMS together. Stored procedures invoked using the batch update facility with a callable statement must return an update count, and may not have out or inout parameters. The CallableStatement.executeBatch() method should throw an exception if this restriction is violated. Error handling is analogous to PreparedStatement.


No comments:

Post a Comment