Google App Engine

Transactions

The App Engine Datastore supports transactions. A transaction is an operation or set of operations that is atomic—either all of the operations in the transaction occur, or none of them occur. An application can perform multiple operations and calculations in a single transaction.

Using Transactions

A transaction is a set of Datastore operations on one or more entities. Each transaction is guaranteed to be atomic, which means that transactions are never partially applied: either all of the operations in the transaction are applied, or none of them are.

An operation may fail when:

  • Too many users try to modify an entity group simultaneously.
  • The application reaches a resource limit.
  • The Datastore encounters an internal error.

In all cases, the Datastore API raises an exception.

Note: If your app receives an exception when submitting a transaction, it does not always mean that the transaction failed. You can receive Timeout, TransactionFailedError, or InternalError exceptions in cases where transactions have been committed and eventually will be applied successfully. Whenever possible, make your Datastore transactions idempotent so that if you repeat a transaction, the end result will be the same.
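One way to make a transaction idempotent is to record a caller-supplied operation ID together with the change, and to skip the change when that ID has already been recorded. The sketch below models the idea in plain Python, with a dict standing in for an entity. It does not use the Datastore API, and the names `apply_credit`, `applied_ops`, and the ID format are hypothetical.

```python
def apply_credit(account, op_id, amount):
    """Adds amount to the balance unless op_id was already applied.

    `account` is a plain dict standing in for a Datastore entity; in a
    real app this body would run inside a transaction.
    """
    if op_id in account["applied_ops"]:
        return False  # Already applied: repeating has no further effect.
    account["balance"] += amount
    account["applied_ops"].add(op_id)
    return True


account = {"balance": 100, "applied_ops": set()}

# Suppose the first attempt ends with an ambiguous exception, so the
# caller repeats it with the same operation ID.
first = apply_credit(account, "order-42-credit", 25)
second = apply_credit(account, "order-42-credit", 25)
```

Because the second call finds the operation ID already recorded, the balance is credited only once, whether or not the first attempt was actually applied.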

Transactions are an optional feature of the Datastore; you're not required to use transactions to perform Datastore operations.

An application can execute a set of statements and Datastore operations in a single transaction, such that if any statement or operation raises an exception, none of the Datastore operations in the set are applied. The application defines the actions to perform in the transaction using a Python function. The application starts the transaction using one of the run_in_transaction methods, depending on whether the transaction accesses entities within a single entity group or is a cross-group (XG) transaction. (For XG transactions, see Using Cross-Group Transactions.) For transactions on entities within a single entity group, an app calls db.run_in_transaction() with the function as an argument:

from google.appengine.ext import db

class Accumulator(db.Model):
    counter = db.IntegerProperty(default=0)

def increment_counter(key, amount):
    obj = db.get(key)
    obj.counter += amount
    obj.put()

q = db.GqlQuery("SELECT * FROM Accumulator")
acc = q.get()

db.run_in_transaction(increment_counter, acc.key(), 5)

db.run_in_transaction() takes the function object, and positional and keyword arguments to pass to the function. If the function returns a value, db.run_in_transaction() returns that value.

If the function returns, the transaction is committed, and all effects of Datastore operations are applied. If the function raises an exception, the transaction is "rolled back," and the effects are not applied. See the note above about exceptions.

Using Cross-Group (XG) Transactions

XG transactions, which operate across multiple entity groups, behave similarly to the single-group transactions described above. The main difference is the use of transaction options, as shown in the following snippet:

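from google.appengine.ext import db

# Illustrative model; the snippet works with any db.Model subclass.
class MyModel(db.Model):
    a = db.IntegerProperty()

xg_on = db.create_transaction_options(xg=True)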

def my_txn():
    x = MyModel(a=3)
    x.put()
    y = MyModel(a=7)
    y.put()

db.run_in_transaction_options(xg_on, my_txn)

As shown above, to perform an XG transaction, create a transaction options object with the xg parameter set to True: xg_on = db.create_transaction_options(xg=True). Then, run your transaction with db.run_in_transaction_options(xg_on, my_txn).

What Can Be Done In a Transaction

The Datastore imposes several restrictions on what can be done inside a single transaction.

All Datastore operations in a transaction must operate on entities in the same entity group if the transaction is a single-group transaction, or on entities in a maximum of five entity groups if the transaction is a cross-group (XG) transaction. This includes querying for entities by ancestor, retrieving entities by key, updating entities, and deleting entities. Notice that each root entity belongs to a separate entity group, so a single transaction cannot create or operate on more than one root entity unless it is an XG transaction.

When two or more transactions simultaneously attempt to modify entities in one or more common entity groups, only one of those transactions can succeed. While an app is applying changes to entities in one or more entity groups, all other attempts to update any entity in the group or groups fail at commit time. Because of this design, using entity groups limits the number of concurrent writes you can do on any entity in the groups. When a transaction starts, App Engine uses optimistic concurrency control by checking the last update time for the entity groups used in the transaction. Upon committing the transaction, App Engine again checks the last update time for those entity groups. If it has changed since the initial check, App Engine raises an exception. For an explanation of entity groups, see Keys and Entity Groups.
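The check-at-start, check-at-commit behavior described above can be illustrated with a small in-memory model. This is only a sketch of optimistic concurrency control in plain Python, not how the Datastore is implemented: the classes `EntityGroup`, `Txn`, and `ConcurrentModification` are hypothetical, and an integer version number stands in for the entity group's last update time.

```python
class ConcurrentModification(Exception):
    """Hypothetical stand-in for the exception raised at commit time."""


class EntityGroup(object):
    def __init__(self):
        self.version = 0  # Stands in for the group's last update time.
        self.data = {}


class Txn(object):
    def __init__(self, group):
        self.group = group
        self.start_version = group.version  # First check, at start.
        self.writes = {}

    def put(self, key, value):
        self.writes[key] = value  # Buffered until commit.

    def commit(self):
        # Second check, at commit: fail if the group changed meanwhile.
        if self.group.version != self.start_version:
            raise ConcurrentModification("entity group changed")
        self.group.data.update(self.writes)
        self.group.version += 1


group = EntityGroup()
txn_a = Txn(group)
txn_b = Txn(group)  # Starts before txn_a commits.
txn_a.put("counter", 1)
txn_b.put("counter", 2)

txn_a.commit()  # Succeeds: nothing changed since txn_a started.
try:
    txn_b.commit()  # Fails: txn_a updated the group in the meantime.
    b_committed = True
except ConcurrentModification:
    b_committed = False
```

In this model only the first committer wins. The second transaction has to start again from the new state, which is why heavy write contention on one entity group causes many failed commits.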

An app can perform a query during a transaction, but only if it includes an ancestor filter. (You can actually perform a query without an ancestor filter, but the results won't reflect any particular transactionally consistent state.) An app can also get Datastore entities by key during a transaction. You can prepare keys prior to the transaction, or you can build keys inside the transaction with key names or IDs.

Isolation and Consistency

The Datastore's isolation level outside of transactions is closest to READ_COMMITTED. Inside transactions, on the other hand, the isolation level is SERIALIZABLE, specifically a form of snapshot isolation. See the Transaction Isolation article for more information on isolation levels.

In a transaction, all reads reflect the current, consistent state of the Datastore at the time the transaction started. This does not include previous puts and deletes inside the transaction. Queries and gets inside a transaction are guaranteed to see a single, consistent snapshot of the Datastore as of the beginning of the transaction. Entities and index rows in the transaction's entity group are fully updated so that queries return the complete, correct set of result entities, without the false positives or false negatives described in Transaction Isolation that can occur in queries outside of transactions.

This consistent snapshot view also extends to reads after writes inside transactions. Unlike with most databases, queries and gets inside a Datastore transaction do not see the results of previous writes inside that transaction. Specifically, if an entity is modified or deleted within a transaction, a query or get returns the original version of the entity as of the beginning of the transaction, or nothing if the entity did not exist then.
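The following plain-Python sketch models the read-after-write behavior just described. It is a simplified illustration under stated assumptions, not the Datastore API: `SnapshotTxn` is a hypothetical class, and a dict stands in for the stored entities.

```python
_DELETED = object()  # Marks a key as deleted in the pending writes.


class SnapshotTxn(object):
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store)  # State as of the transaction start.
        self.pending = {}

    def get(self, key):
        # Reads always come from the snapshot, never from pending writes.
        return self.snapshot.get(key)

    def put(self, key, value):
        self.pending[key] = value

    def delete(self, key):
        self.pending[key] = _DELETED

    def commit(self):
        # Pending writes become visible only once the commit is applied.
        for key, value in self.pending.items():
            if value is _DELETED:
                self.store.pop(key, None)
            else:
                self.store[key] = value


store = {"greeting": "hello"}
txn = SnapshotTxn(store)
txn.put("greeting", "goodbye")
txn.put("new_key", "created")

seen_existing = txn.get("greeting")  # Original value, not "goodbye".
seen_new = txn.get("new_key")        # None: did not exist at start.
txn.commit()
```

A practical consequence is that code inside a transaction should keep its own reference to an entity it has modified, rather than reading the entity again and expecting to see the change.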

Uses for Transactions

This example demonstrates one use of transactions: updating an entity with a new property value relative to its current value.

def increment_counter(key, amount):
    obj = db.get(key)
    obj.counter += amount
    obj.put()

Warning! The above sample depicts transactionally incrementing a counter only for the sake of simplicity. If your app has counters that are updated frequently, you should not increment them transactionally, or even within a single entity. A best practice for working with counters is to use a technique known as counter-sharding.

This requires a transaction because the value may be updated by another user after this code fetches the object, but before it saves the modified object. Without a transaction, the user's request uses the value of counter prior to the other user's update, and the save overwrites the new value. With a transaction, the application is told about the other user's update. If the entity is updated during the transaction, then the transaction is retried until all steps are completed without interruption.

Another common use for transactions is to update an entity with a named key, or create it if it doesn't yet exist:

class SalesAccount(db.Model):
    address = db.PostalAddressProperty()
    phone_number = db.PhoneNumberProperty()

def create_or_update(parent_key, account_id, address, phone_number):
    obj = db.get(db.Key.from_path("SalesAccount", account_id, parent=parent_key))
    if not obj:
        obj = SalesAccount(key_name=account_id,
                           parent=parent_key,
                           address=address,
                           phone_number=phone_number)
    else:
        obj.address = address
        obj.phone_number = phone_number

    obj.put()

As before, a transaction is necessary to handle the case where another user is attempting to create or update an entity with the same string ID. Without a transaction, if the entity does not exist and two users attempt to create it, the second overwrites the first without knowing that it happened. With a transaction, the second attempt retries, notices that the entity now exists, and updates the entity instead.

When a transaction fails, you can have your app retry the transaction until it succeeds, or you can let your users deal with the error by propagating it to your app's user interface level. You do not have to create a retry loop around every transaction.
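If you do choose to retry at the application level, a bounded loop is usually enough. The sketch below is plain Python with hypothetical names: `TransactionFailed` stands in for whichever failure exception your app catches, and `run_with_retries` and `flaky_txn` exist only for illustration.

```python
class TransactionFailed(Exception):
    """Hypothetical stand-in for a transaction failure exception."""


def run_with_retries(txn_func, max_attempts=3):
    """Calls txn_func until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_func()
        except TransactionFailed:
            if attempt == max_attempts:
                raise  # Give up and let the caller surface the error.


calls = []


def flaky_txn():
    """Simulates a transaction that fails twice before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise TransactionFailed("contention")
    return "committed"


result = run_with_retries(flaky_txn, max_attempts=3)
```

Because a failure exception does not always mean that the transaction was not applied, a function retried this way should be idempotent.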

Create-or-update is so useful that there is a built-in method for it: Model.get_or_insert() takes a key name, an optional parent, and arguments to pass to the model constructor if an entity of that name and path does not exist. The get attempt and the create happen in one transaction, so (if the transaction is successful) the method always returns a model instance that represents an actual entity.

Tip: A transaction should happen as quickly as possible to reduce the likelihood that the entities used by the transaction will change, causing the transaction to fail. As much as possible, prepare data outside of the transaction, then execute the transaction to perform Datastore operations that depend on a consistent state. In particular, the application should prepare the keys for the objects it needs before starting the transaction, then fetch the entities inside the transaction.

Finally, you can use a transaction to read a consistent snapshot of the Datastore. This can be useful when multiple reads are needed to render a page or export data that must be consistent. These kinds of transactions are often called read-only transactions, since they perform no writes. Read-only single-group transactions never fail due to concurrent modifications, so you don't have to implement retries upon failure. However, XG transactions can fail due to concurrent modifications, so these should have retries. Committing and rolling back a read-only transaction are both no-ops.

from google.appengine.api import users
from google.appengine.ext import db

class Customer(db.Model):
    user = db.UserProperty()

class Account(db.Model):
    """An Account has a Customer as its parent."""
    address = db.PostalAddressProperty()
    balance = db.FloatProperty()

def get_all_accounts():
    """Returns a consistent view of the current user's accounts."""
    accounts = []
    for customer in Customer.all().filter('user =', users.get_current_user()):
        accounts.extend(Account.all().ancestor(customer))
    return accounts

Transactional Task Enqueuing

You can enqueue a task as part of a Datastore transaction, so that the task is enqueued if and only if the transaction is committed successfully. Once enqueued, the task is not guaranteed to execute immediately, so the task is not atomic with the transaction. Still, once enqueued, the task will retry until it succeeds. This applies to any task enqueued during a run_in_transaction() function.

Transactional tasks are useful because they allow you to combine a transaction with non-Datastore actions that depend on the transaction succeeding (such as sending an email to confirm a purchase). You can also tie Datastore actions to the transaction, such as committing changes to entity groups outside of the transaction if and only if the transaction succeeds.

An application cannot insert more than five transactional tasks into task queues during a single transaction. Transactional tasks must not have user-specified names.

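from google.appengine.api import taskqueue
from google.appengine.ext import db

def do_something_in_transaction(...):
    # Enqueued only if the surrounding transaction commits.
    taskqueue.add(url='/path/to/my/worker', transactional=True)
    ...

db.run_in_transaction(do_something_in_transaction, ...)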