SQLAlchemy 0.3 Documentation

Version: 0.3.8 Last Updated: 06/02/07 17:01:58

A database engine is a subclass of sqlalchemy.sql.Engine, and is the starting point for where SQLAlchemy provides a layer of abstraction on top of the various DBAPI2 database modules. For each database supported by SA, there is a specific "implementation" module in the sqlalchemy.databases package that provides all the objects an Engine needs in order to perform its job. A typical user of SQLAlchemy never needs to deal with these modules directly. For many purposes, the only knowledge that's needed is how to create an Engine for a particular connection URL. When dealing with direct execution of SQL statements, one would also be aware of Result, Connection, and Transaction objects; together with the Engine itself, these are the primary public-facing objects.

Underneath the public-facing API of the Engine, each database implementation provides several components that supply the full behavior.

Supported Databases

Engines exist for SQLite, Postgres, MySQL, MS-SQL, Firebird, Informix, and Oracle. For each engine, the appropriate DBAPI drivers must be installed separately. A distinct Python module exists in the sqlalchemy.databases package for each type of database as well, which provides implementations of some of the objects mentioned in the previous section.

The DBAPI modules typically used for each database at the time of this writing are pysqlite (SQLite), psycopg / psycopg2 (Postgres), MySQLdb (MySQL), cx_Oracle (Oracle), pymssql (MS-SQL), kinterbasdb (Firebird), and informixdb (Informix); each must be downloaded from its respective project.

The SQLAlchemy Wiki contains a "Database Notes" page describing whatever quirks and behaviors have been observed; it's a good place to check for issues with specific databases.


Establishing a Database Engine

SQLAlchemy indicates the source of an Engine strictly via RFC-1738 style URLs, combined with optional keyword arguments to specify options for the Engine. The form of the URL is:

driver://username:password@host:port/database

Available driver names are sqlite, mysql, postgres, oracle, mssql, and firebird. For sqlite, the database name is the filename to connect to, or the special name ":memory:" which indicates an in-memory database. The URL is typically sent as a string to the create_engine() function:

# postgres
pg_db = create_engine('postgres://scott:tiger@localhost:5432/mydatabase')

# sqlite (note the four slashes for an absolute path)
sqlite_db = create_engine('sqlite:////absolute/path/to/database.txt')
sqlite_db = create_engine('sqlite:///relative/path/to/database.txt')
sqlite_db = create_engine('sqlite://')  # in-memory database

# mysql
mysql_db = create_engine('mysql://localhost/foo')

# oracle via TNS name
oracle_db = create_engine('oracle://scott:tiger@dsn')

# oracle will feed host/port/SID into cx_oracle.makedsn
oracle_db = create_engine('oracle://scott:tiger@127.0.0.1:1521/sidname')

The Engine will ask the connection pool for a connection when a SQL statement is executed. The default connection pool, QueuePool, as well as the default SQLite connection pool, SingletonThreadPool, will open connections to the database on an as-needed basis. As concurrent statements are executed, QueuePool will grow its pool of connections to a default size of five, and will allow a default "overflow" of ten. Since the Engine is essentially "home base" for the connection pool, you should keep a single Engine per database established within an application, rather than creating a new one for each connection.
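
The pool's sizing can be tuned when the Engine is created; a minimal sketch, assuming the pool_size and max_overflow keyword arguments accepted by create_engine():

# raise the pool size beyond the default of five, and the "overflow" beyond the default of ten
db = create_engine('postgres://scott:tiger@localhost/mydatabase',
                   pool_size=10, max_overflow=20)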

Custom DBAPI keyword arguments

Custom arguments can be passed to the underlying DBAPI in three ways. String-based arguments can be passed directly from the URL string as query arguments:

db = create_engine('postgres://scott:tiger@localhost/test?argument1=foo&argument2=bar')

If SQLAlchemy's database connector is aware of a particular query argument, it may convert the value from a string to its proper type.

create_engine also takes a connect_args argument, which is an additional dictionary that will be passed to the DBAPI's connect() function. This can be used when arguments of a type other than string are required, and SQLAlchemy's database connector has no type conversion logic present for that parameter:

db = create_engine('postgres://scott:tiger@localhost/test', connect_args={'argument1': 17, 'argument2': 'bar'})

The most customizable connection method of all is to pass a creator argument, which specifies a callable that returns a DBAPI connection:

import psycopg

def connect():
    return psycopg.connect(user='scott', host='localhost')

db = create_engine('postgres://', creator=connect)

Database Engine Options

Keyword options can also be specified to create_engine(), following the string URL as follows:

db = create_engine('postgres://...', encoding='latin1', echo=True)

A number of standard options are accepted, as well as several that are used only by particular database dialects.
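
As a minimal sketch, several of the more common options can be combined in a single call; echo, encoding, convert_unicode and pool_recycle are assumed here as representative keyword arguments, and the values are purely illustrative:

db = create_engine('mysql://localhost/test',
                   echo=False,             # don't log SQL statements
                   encoding='utf-8',       # encoding used for Unicode conversions
                   convert_unicode=False,  # don't force all strings through Unicode conversion
                   pool_recycle=3600)      # recycle pooled connections after one hour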


Configuring Logging

As of the 0.3 series of SQLAlchemy, Python's standard logging module is used to implement informational and debug log output. This allows SQLAlchemy's logging to integrate in a standard way with other applications and libraries. The echo and echo_pool flags that are present on create_engine(), as well as the echo_uow flag used on Session, all interact with regular loggers.

This section assumes familiarity with the standard logging module. All logging performed by SQLAlchemy exists underneath the sqlalchemy namespace, as used by logging.getLogger('sqlalchemy'). When logging has been configured (e.g. via logging.basicConfig()), the SA loggers that can be turned on include sqlalchemy.engine (SQL statement execution), sqlalchemy.pool (connection pool activity), and sqlalchemy.orm together with its subcategories such as sqlalchemy.orm.unitofwork (ORM flush operations).

For example, to log SQL queries as well as unit of work debugging:

import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
logging.getLogger('sqlalchemy.orm.unitofwork').setLevel(logging.DEBUG)

By default, the log level is set to logging.ERROR within the entire sqlalchemy namespace so that no log operations occur, even within an application that has logging enabled otherwise.

The echo flags present as keyword arguments to create_engine() and elsewhere, as well as the echo property on Engine, when set to True, will first attempt to ensure that logging is enabled. Unfortunately, the logging module provides no way of determining if output has already been configured (note we are referring to whether a logging configuration has been set up, not just whether the logging level is set). For this reason, any echo=True flag will result in a call to logging.basicConfig() using sys.stdout as the destination, and will also set up a default format using the level name, timestamp, and logger name. Note that this configuration is applied in addition to any existing logger configuration. Therefore, when using Python logging directly, ensure all echo flags are set to False at all times, to avoid getting duplicate log lines.
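
For example, a minimal sketch of relying on the logging module directly while keeping echo disabled:

import logging
import sys

# configure logging once, then leave echo off so that basicConfig() is not called a second time
logging.basicConfig(stream=sys.stdout)
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

db = create_engine('sqlite://', echo=False)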


Using Connections

In this section we describe the SQL execution interface available from an Engine instance. Note that when using the Object Relational Mapper (ORM), as well as when dealing with "bound" metadata objects, SQLAlchemy manages the Engine and Connections for you and you generally don't need to know much about them; in those cases, you can skip this section and go to Database Meta Data. "Bound" metadata is described in Binding MetaData to an Engine.

The Engine provides a connect() method which returns a Connection object. Connection is a proxy object which maintains a reference to a DBAPI connection instance. This object provides methods by which literal SQL text as well as SQL clause constructs can be compiled and executed.

engine = create_engine('sqlite:///:memory:')
connection = engine.connect()
result = connection.execute("select * from mytable where col1=:col1", col1=5)
for row in result:
    print row['col1'], row['col2']
connection.close()

The close method on Connection does not actually close the underlying connection to the database, but rather indicates that the underlying resources can be returned to the connection pool. When using the connect() method, the DBAPI connection referenced by the Connection object is not referenced anywhere else.

In both execution styles above, the Connection object will also automatically return its resources to the connection pool when it is garbage collected, i.e. when its __del__() method is called. When using the standard C implementation of Python, this happens as soon as the object is dereferenced. With other Python implementations such as Jython, this is not guaranteed.
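
Rather than relying on garbage collection, the Connection can be returned to the pool explicitly; a minimal sketch using try/finally (mytable is a hypothetical table):

connection = engine.connect()
try:
    result = connection.execute("select * from mytable")
    for row in result:
        print row['col1'], row['col2']
finally:
    # always return the underlying DBAPI connection to the pool
    connection.close()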

The execute method on Engine and Connection can also receive SQL clause constructs, which are described in Constructing SQL Queries via Python Expressions:

connection = engine.connect()
result = connection.execute(select([table1], table1.c.col1==5))
for row in result:
    print row['col1'], row['col2']
connection.close()

Both Connection and Engine fulfill an interface known as Connectable which specifies common functionality between the two objects, such as getting a Connection and executing queries. Therefore, most SQLAlchemy functions which take an Engine as a parameter with which to execute SQL will also accept a Connection (the argument is typically named connectable):

Specify Engine or Connection
from sqlalchemy import create_engine, MetaData, Table, Column, Integer

engine = create_engine('sqlite:///:memory:')

# specify some Table metadata
metadata = MetaData()
table = Table('sometable', metadata, Column('col1', Integer))

# create the table with the Engine
table.create(connectable=engine)

# drop the table with a Connection off the Engine
connection = engine.connect()
table.drop(connectable=connection)

Connection facts:


Transactions

The Connection object provides a begin() method which returns a Transaction object. This object is usually used within a try/except clause so that it is guaranteed to rollback() or commit():

trans = connection.begin()
try:
    r1 = connection.execute(table1.select())
    connection.execute(table1.insert(), col1=7, col2='this is some data')
    trans.commit()
except:
    trans.rollback()
    raise

The Transaction object also handles "nested" behavior by keeping track of the outermost begin/commit pair. In this example, two functions both issue a transaction on a Connection, but only the outermost Transaction object actually takes effect when it is committed.

# method_a starts a transaction and calls method_b
def method_a(connection):
    trans = connection.begin() # open a transaction
    try:
        method_b(connection)
        trans.commit()  # transaction is committed here
    except:
        trans.rollback() # this rolls back the transaction unconditionally
        raise

# method_b also starts a transaction
def method_b(connection):
    trans = connection.begin() # open a transaction - this runs in the context of method_a's transaction
    try:
        connection.execute("insert into mytable values ('bat', 'lala')")
        connection.execute(mytable.insert(), col1='bat', col2='lala')
        trans.commit()  # transaction is not committed yet
    except:
        trans.rollback() # this rolls back the transaction unconditionally
        raise

# open a Connection and call method_a
conn = engine.connect()                
method_a(conn)
conn.close()

Above, method_a is called first, which calls connection.begin(). Then it calls method_b. When method_b calls connection.begin(), it just increments a counter that is decremented when it calls commit(). If either method_a or method_b calls rollback(), the whole transaction is rolled back. The transaction is not committed until method_a calls the commit() method.

Note that SQLAlchemy's Object Relational Mapper also provides a way to control transaction scope at a higher level; this is described in SessionTransaction.

Transaction Facts:


Implicit Execution

Implicit execution refers to the execution of SQL without the explicit usage of a Connection object. This occurs when you call the execute() method off of an Engine object or off of a SQL expression or table that is associated with "bound" metadata.

Implicit Execution Using Engine
engine = create_engine('sqlite:///:memory:')
result = engine.execute("select * from mytable where col1=:col1", col1=5)
for row in result:
    print row['col1'], row['col2']
result.close()

Using "bound" metadata:

Implicit Execution Using Engine-Bound SQL Construct
engine = create_engine('sqlite:///:memory:')
meta = BoundMetaData(engine)
table = Table('mytable', meta, Column('col1', Integer), Column('col2', String(20)))
r = table.insert().execute(col1=5, col2='some record')

Notice in the above two examples, no connect() method is ever called nor do we ever see a Connection anywhere; the Connection is created for you automatically via the execute() method, and a handle to the execution's cursor remains open in the returned result set. When the result set is closed via the close() method, or if the result set object falls out of scope and is garbage collected, the underlying cursor is closed, the Connection is discarded and the underlying DBAPI connection is returned to the connection pool.
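
A minimal sketch of closing the implicit result explicitly rather than waiting for garbage collection (mytable is a hypothetical table):

result = engine.execute("select * from mytable")
try:
    for row in result:
        print row['col1'], row['col2']
finally:
    # closing the result set returns the implicit Connection to the pool
    result.close()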

The purpose of the "implicit" connection is strictly one of convenience; while in SQLAlchemy 0.1 it was the only style of operation, it is now optional.

Implicit Execution Strategies

The internal behavior of the Engine during implicit execution can be affected by the strategy keyword argument to create_engine(). Generally this setting can be left at its default value of plain. For the advanced user, however, the threadlocal option manages connections against the current thread from which they were pulled from the connection pool, so that the same underlying DBAPI connection, as well as a single database-level transaction, can be shared by many operations without explicitly passing a Connection or Transaction object around. It may also reduce the number of connections checked out from the connection pool at a given time.

Note that this setting does not affect the fact that Connection and Transaction objects are not threadsafe. The "threadlocal" strategy affects the selection of DBAPI connections which are pulled from the connection pool when a Connection object is created, but does not synchronize method access to the Connection or Transaction instances themselves, which are only proxy objects. It is instead intended that many Connection instances would share access to a single "connection" object that is referenced in relation to the current thread.

When strategy is set to plain, each implicit execution requests a unique connection from the connection pool, which is returned to the pool when the resulting ResultProxy falls out of scope (i.e. __del__() is called) or its close() method is called. If a second implicit execution occurs while the ResultProxy from the previous execution is still open, then a second connection is pulled from the pool.

When strategy is set to threadlocal, the Engine still checks out a connection which is closeable in the same manner via the ResultProxy, except the connection it checks out will be the same connection as one which is already checked out, assuming the operation is in the same thread. When all ResultProxy objects are closed in a particular thread, the connection is returned to the pool normally.

An additional feature of the threadlocal strategy is that Transaction objects can be managed implicitly as well, by calling the begin(), commit() and rollback() methods off of the Engine, or by using Transaction objects from the thread-local connection.
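
A minimal sketch of this engine-level transaction management (the URL and table are illustrative only):

db = create_engine('mysql://localhost/test', strategy='threadlocal')

db.begin()        # begin a transaction on the thread-local connection
try:
    db.execute("insert into mytable values ('bat', 'lala')")
    db.commit()   # commit the thread-local transaction
except:
    db.rollback() # roll back on any error
    raise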

It is crucial to note that the plain and threadlocal strategies do not impact the connect() method on the Engine; connect() always returns a unique connection. Implicit connections use a different method on the Engine, called contextual_connect().

By default, every call to execute pulls a dedicated DBAPI connection from the connection pool:

Plain Strategy
db = create_engine('mysql://localhost/test', strategy='plain')

# execute one statement and receive results.  r1 now references a DBAPI connection resource.
r1 = db.execute("select * from table1")

# execute a second statement and receive results.  r2 now references a *second* DBAPI connection resource.
r2 = db.execute("select * from table2")
for row in r1:
    ...
for row in r2:
    ...
# release connection 1
r1.close()

# release connection 2
r2.close()

Using the "threadlocal" strategy, all calls to execute within the same thread will be guaranteed to use the same underlying DBAPI connection, which is only returned to the connection pool when all ResultProxy instances have been closed.

Threadlocal Strategy
db = create_engine('mysql://localhost/test', strategy='threadlocal')

# execute one statement and receive results.  r1 now references a DBAPI connection resource.
r1 = db.execute("select * from table1")

# execute a second statement and receive results.  r2 now references the *same* resource as r1
r2 = db.execute("select * from table2")

for row in r1:
    ...
for row in r2:
    ...
# dereference r1.  the connection is still held by r2.
r1 = None

# dereference r2.  with no more references to the underlying connection resources, they
# are returned to the pool.
r2 = None

To get at the actual Connection object which is used by implicit executions, call the contextual_connect() method on Engine:

Contextual Connection
# threadlocal strategy
db = create_engine('mysql://localhost/test', strategy='threadlocal')

conn1 = db.contextual_connect()
conn2 = db.contextual_connect()

>>> conn1.connection is conn2.connection
True

When the plain strategy is used, the contextual_connect() method is synonymous with the connect() method; both return a distinct connection from the pool.
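
A minimal sketch of the contrast under the plain strategy:

# plain strategy
db = create_engine('mysql://localhost/test', strategy='plain')

conn1 = db.contextual_connect()
conn2 = db.contextual_connect()

>>> conn1.connection is conn2.connection
False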

One programming pattern that the threadlocal strategy supports is transparent connection and transaction sharing.

threadlocal connection sharing
db = create_engine('mysql://localhost/test', strategy='threadlocal')

def dosomethingimplicit():
    table1.execute("some sql")
    table1.execute("some other sql")

def dosomethingelse():
    table2.execute("some sql")
    conn = db.contextual_connect()
    # do stuff with conn
    conn.execute("some other sql")
    conn.close()

def dosomethingtransactional():
    conn = db.contextual_connect()
    trans = conn.begin()
    # do stuff
    trans.commit()

db.create_transaction()
try:
    dosomethingimplicit()
    dosomethingelse()
    dosomethingtransactional()
    db.commit()
except:
    db.rollback()

In the above example, the program calls three functions: dosomethingimplicit(), dosomethingelse() and dosomethingtransactional(). In all three functions, either implicit execution is used, or an explicit Connection is obtained via the contextual_connect() method. This means that they all share the same underlying DBAPI connection as well as the same parent Transaction instance, which is created in the main body of the program via the call to db.create_transaction(). So while there are several calls that return "new" Transaction or Connection objects, in reality only one "real" connection is ever used, and there is only one transaction (i.e. one begin/commit pair) executed.
