High performance Batch operations

While EzQu's standard db.insert(), db.update(), and db.delete() methods (and their 'all' counterpart methods insertAll, updateAll, deleteAll) are excellent for managing complex object graphsand relationships, they execute individual SQL statements for each operation. When you need to process thousands of records efficiently, executing them one by one becomes a performance bottleneck.

For high-performance bulk operations, EzQu provides the DbUtils class. DbUtils leverages native JDBC batching to send multiple operations to the database in a single network trip.

Important Constraints for DbUtils

  1. Relationships are Ignored: Batch operations bypass EzQu's standard relationship management. Annotations like @One2Many or @Many2One are disregarded. The framework only persists the immediate fields of the object. Therefore it is highly recommended to only use simple pojo objects.
  2. Primary Keys are Mandatory: To use batch functions, your POJO must have a defined primary key
  3. Full Field Traversal: Batch INSERT and UPDATE statements will always include all mapped fields of the entity. Batch DELETE statements will rely strictly on the primary key.

You can obtain an instance of DbUtils directly from an active Db session using the db.utils() method.

Quick Batch Inserts (insertBatch)

If you have a collection of objects that need to be inserted, the easiest method is insertBatch(). You define a batchSize (the number of records to group together before flushing to the database) and pass in the array of objects.

sessionFactory.runInSession(db -> {
    DbUtils utils = db.utils();

    Person p1 = new Person(1L, "Alice", "Smith");
    Person p2 = new Person(2L, "Bob", "Jones");
    Person p3 = new Person(3L, "Charlie", "Brown");

    // Execute the batch insert, flushing to the DB in chunks of 50
    int[] updateCounts = utils.insertBatch(50, p1, p2, p3);

    db.commit(); // Don't forget to commit if you are managing transactions manually!
});

Granular Batch Processing

For more control, or if you need to perform bulk updates and deletes, you can construct batches manually using addBatch() and executeBatch().

EzQu automatically creates and caches the underlying PreparedStatement for your entity type and statement type (INSERT, UPDATE, or DELETE) within the current session.

sessionFactory.runInSession(db -> {
    DbUtils utils = db.utils();

    // 1. Add operations to the batch queue
    for (Person p : myLargeListOfPeopleToUpdate) {
        p.setStatus("ARCHIVED");
        // Enqueue an UPDATE operation
        utils.addBatch(p, StatementType.UPDATE);
    }

    // 2. Submit the batch to the database
    // Note: You must specify the class and the StatementType to execute
    int[] updateCounts = utils.executeBatch(Person.class, StatementType.UPDATE);

    db.commit(); // Don't forget to commit if you are managing transactions manually!
});

You can also use utils.executeBatch(Person.class) without specifying the statement type to submit all queued inserts, updates, and deletes for that class at once.

Handling Batch Errors

When a batch operation fails (for instance, due to a constraint violation), EzQu analyzes the underlying JDBC BatchUpdateException and identifies the error rows. It then encapsulates the result in a result array and marks those rows with a special int code. (-100).

Understanding the Result Array: The method returns an int[] where each element corresponds to the command executed. The values indicate:

  • value > 0: The command was processed successfully, and this number of rows was changed in the database.
  • 0: The command was processed successfully, but no changes occurred.
  • -100: Indicates a technical failure on a specific row during the batch iteration.
  • -3: (Use Statement.EXECUTE_FAILED), The command failed to execute.
  • -2: (Use Statement.SUCCESS_NO_INFO), The command was processed successfully, but the exact number of affected rows is unknown.

Here is an example of batch result handling:

// here we have finished our batching without any db issues
// however we need to check the results array to make sure all rows were inserted
for (int rowNum = 0; rowNum < updateCounts.length; rowNum++) {
    // process count
    switch (updateCounts[rowNum]) {
        case Statement.EXECUTE_FAILED, 0, -100:
            // handle failed update with index i
            ...handle errors;
            failed++;
            if (updateCounts[rowNum] == -100)
                logger.error("error report row {} failed on technical issue", rowNum);
            break;
        default:
            // success
            success++;
            break;
    }
}