Google Code offered in: English - Español - 日本語 - 한국어 - Português - Pусский - 中文(简体) - 中文(繁體)
A query retrieves entities from the datastore that meet a specified set of conditions. The query specifies an entity kind, zero or more filters based on entity property values, and zero or more sort order descriptions. When the query is executed, it retrieves all entities of the given kind that satisfy all of the given filters, sorted in the specified order.
The Master/Slave Datastore and the High Replication Datastore have different guarantees when it comes to query consistency. By default:
For more information, see Setting the Read Policy and Datastore Call Deadline.
The Python datastore API provides two interfaces for preparing and executing queries: the Query interface, which uses methods to prepare the query, and the GqlQuery interface, which uses a SQL-like query language called GQL to prepare the query from a query string:
class Person(db.Model): first_name = db.StringProperty() last_name = db.StringProperty() city = db.StringProperty() birth_year = db.IntegerProperty() height = db.IntegerProperty() # The Query interface constructs a query using instance methods. q = Person.all() q.filter("last_name =", "Smith") q.filter("height <", 72) q.order("-height") # The GqlQuery interface constructs a query using a GQL query string. q = db.GqlQuery("SELECT * FROM Person " + "WHERE last_name = :1 AND height < :2 " + "ORDER BY height DESC", "Smith", 72) # The query is not executed until results are accessed. results = q.fetch(5) for p in results: print "%s %s, %d inches tall" % (p.first_name, p.last_name, p.height)
A filter includes a property name, a comparison operator, and a value. An entity satisfies the filter if it has a property of the given name and the property value compares to the value specified in the filter in the manner described by the comparison operator. The entity is a result for a query if it satisfies all of the query's filters.
The comparison operator can be any of the following:
Operator | Meaning |
---|---|
==
|
Equal to |
<
|
Less than |
<=
|
Less than or equal to |
>
|
Greater than |
>=
|
Greater than or equal to |
!=
|
Not equal to |
IN
|
Member of (equal to any of the values in a specified list) |
The not-equal (!=
) operator actually performs two queries: one in which all other filters are the same and the not-equal filter is replaced with a less-than filter, and one where it is replaced with a greater-than filter. The results are merged, in order. As described below under Restrictions on Queries, a query can have no more than one not-equal filter, and a query that has one cannot have any other inequality filters.
The IN
operator also performs multiple queries, one for each item in the specified list, with all other filters the same and the IN
filter replaced with an equality (==
) filter. The results are merged, in the order of the items in the list. If a query has more than one IN
filter, it is performed as multiple queries, one for each possible combination of values in the IN
lists.
A single query containing not-equal or IN
operators is limited to no more than 30 sub-queries.
A query can also return just the keys of the result entities instead of the entities themselves:
query = db.Query(Person, keys_only = True)
Queries with no kind can be used to find all entities in your datastore. This includes entities created and managed by other App Engine features, including statistics entities (which exist for all apps) and Blobstore metadata entities (if any). When you include a key filter, kindless queries return all entities in a specific key range. Kindless queries cannot include filters on properties.
You can filter your datastore queries to a specified ancestor, so that results contain only entities containing that ancestor. In other words, all of the results will have the ancestor as their parent, or parent's parent, or etc. Passing None as a parameter does not query for entities without ancestors and will return errors.
q = db.Query() q.ancestor(ancestor_key)
Using GQL or the Python query interface, you can perform queries for entities with a given ancestor regardless of kind. Such queries can also include filters, equality or inequality, on __key__
. Kindless queries cannot include sort orders (and thus the results are unordered) or filters on properties.
To perform a kindless ancestor query using the Query
class, call the constructor without a kind class:
q = db.Query() q.ancestor(ancestor_key) q.filter('__key__ >', last_seen_key)
To perform a kindless ancestor query using GQL (either in the Administrator Console or using the GqlQuery
class), omit the FROM Kind
clause:
q = db.GqlQuery('SELECT * WHERE ANCESTOR IS :1 AND __key__ > :2', ancestor_key, last_seen_key)
Kindless ancestor queries do not require custom indexes. You can use query_descendants()
to return a query for all descendants of the model instance.
The nature of the index query mechanism imposes a few restrictions on what a query can do. These restrictions are described in this section.
If a property has a query filter condition or sort order, the query returns only those datastore entities that have a value (including null) for that property.
Entities of a kind need not have the same properties. A filter on a property can only match an entity with a value for that property. If an entity has no value for a property used in a filter or sort order, that entity is omitted from the index built for the query.
It is not possible to query for entities that are missing a given property. One alternative is to create a fixed (modeled) property with a default value of None
, then create a filter for entities with None
as the property value.
A query may only use inequality filters (<
, <=
, >=
, >
, !=
) on one property across all of its filters.
For example, this query is allowed:
SELECT * FROM Person WHERE birth_year >= :min AND birth_year <= :max
However, the following query is not allowed, because it uses inequality filters on two different properties in the same query:
SELECT * FROM Person WHERE birth_year >= :min_year AND height >= :min_height # ERROR
Filters can combine equal (=
) comparisons for different properties in the same query, including queries with one or more inequality conditions on a property. This is allowed:
SELECT * FROM Person WHERE last_name = :last_name AND city = :city AND birth_year >= :min_year
The query mechanism relies on all results for a query to be adjacent to one another in the index table, to avoid having to scan the entire table for results. A single index table cannot represent multiple inequality filters on multiple properties while maintaining that all results are consecutive in the table.
If a query has both a filter with an inequality comparison and one or more sort orders, the query must include a sort order for the property used in the inequality, and the sort order must appear before sort orders on other properties.
This query is not valid, because it uses an inequality filter and does not order by the filtered property:
SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY last_name # ERROR
Similarly, the following query is not valid because it does not order by the filtered property before ordering by other properties:
SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY last_name, birth_year # ERROR
This query is valid:
SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY birth_year, last_name
To get all results that match an inequality filter, a query scans the index table for the first matching row, then returns all consecutive results until it finds a row that doesn't match. For the consecutive rows to represent the complete result set, the rows must be ordered by the inequality filter before other sort orders.
Due to the way they're indexed, properties with multiple values interact with query filters and sort orders in specific and sometimes surprising ways.
First, if a query has multiple inequality filters on a given property, an entity matches the query only if it has an individual value for that property that matches all of the inequality filters. For example, if an entity has the values [1, 2]
for property x
, it will not match the query WHERE x > 1 AND x < 2
. Each filter does match one of x
's values, but no single value matches both filters.
(Note that this does not apply to =
filters. For example, the
query WHERE x = 1 AND x = 2
will return the above entity.)
The !=
operator works as "value is other than" test. So, for example, the filter x != 1
matches an entity with value greater than or less than 1.
Similarly, the sort order for multiply valued properties is unusual:
This ordering takes place because multivalued properties appear in an index once for each unique value they contain, but the datastore removes duplicate values from the index, so the first value seen in the index determines its sort order.
This sort order has the unusual consequence that [1, 9]
comes before [4, 5, 6, 7]
in both ascending and descending order.
Queries can't find property values that aren't indexed. (You also cannot sort on unindexed properties.) For a detailed discussion, see Unindexed Properties.
One important caveat is queries with both an equality filter and a sort order on a multi-valued property. In those queries, the sort order is disregarded. For single-valued properties, this is a simple optimization. Every result would have the same value for the property, so the results do not need to be sorted further.
However, multi-valued properties may have additional values. Since the sort order is disregarded, the query results may be returned in a different order than if the sort order were applied. (Restoring the dropped sort order would be expensive and require extra indices, and this use case is rare, so the query planner leaves it off.)
When a query does not specify an order, the datastore returns the results in the order of retrieval. As we make changes to the datastore implementation, the order of results in an unordered query may change as well. Therefore, if you require a specific sort ordering for your query results, be sure to specify it in the query. If you don't specify a sort order, the ordering of results is undefined and may change over time.
Queries are only supported inside transactions if they include an ancestor filter. The query's ancestor must be in the same entity group as the other operations in the transaction. This preserves the restriction that a transaction can only operate on entities in a single entity group.
After constructing your query, you can specify a number of fetch options to control which results get returned for your query.
If you want to return only a single entity matching your query, you can use the
query method get()
.
This will return the first result that matches the query.
Key-only queries return just the keys to the entities that match your query from the datastore. Key-only queries run faster than queries that return complete entities. To return only the keys,
set keys_only=True
when constructing the query object.
You can specify a limit and offset with your query to control the number and range of results returned in one batch. Specifying an integer limit returns up to that number of results that match the query. Using an integer offset will skip that number of results and return the rest, up to the limit specified. For example, to fetch the tallest five people from your datastore, you would construct your query as:
# Prepare a query. q = Person.all() q.order("-height") # The query is not executed until results are accessed. results = q.fetch(5) for p in results: print "%s %s, %d inches tall" % (p.first_name, p.last_name, p.height)
If you wanted the 6th through 10th tallest people, you would instead use:
# Prepare a query. # The query is not executed until results are accessed. results = q.fetch(limit=5, offset=5)
When iterating through the results of your query, the datastore fetches the results in batches. By default, each batch contains 20 results, and you cannot change this value. You can continue iterating through query results until all are returned or the request times out.
Note: For an in-depth discussion of indexes and querying, see the article Index Selection and Advanced Search.
Every datastore query uses an index, a table containing the results for the query in the desired order. An App Engine application defines its indexes in a configuration file named index.yaml
. The development web server automatically adds suggestions to this file as it encounters queries that do not yet have indexes configured. You can tune indexes manually by editing the file before uploading the application.
Note: You can get the list of indexes for your application at runtime by calling the methods get_indexes() or get_indexes_async() .
The index-based query mechanism supports most common kinds of queries, but it does not support some queries common in other database technologies. Restrictions on queries, and their explanations, are described below.
The datastore maintains an index for every query an application intends to make. As the application makes changes to datastore entities, the datastore updates the indexes with the correct results. When the application executes a query, the datastore fetches the results directly from the corresponding index.
The datastore executes a query using the following steps:
An index table contains columns for every property used in a filter or sort order. The rows are sorted by the following aspects, in order:
This puts all results for every possible query that uses this index in consecutive rows in the table.
An index contains an entity only if the index refers to every property in the entity. If the index does not reference a property of the entity, that entity will not appear in the index, and will never be a result for the query that uses the index.
Note that the App Engine datastore makes a distinction between an entity that does not possess a property and an entity that possesses a property with a value of None
. If you want every entity of a kind to be a potential result for a query, you can use a data model that assigns a default value (such as None
) to properties used by query filters.
In some cases, you will never have to filter or sort on a particular property. When this happens, you can make the property an unindexed property. This type of property includes those that are explicitly marked as unindexed, as well as those with values of the long text value type ( Text) or the long binary value type ( Blob ). Using unindexed properties allows you to reduce the cost of your application by decreasing the number of datastore writes needed, because the datastore will not need to maintain index entries for the property.
A query with a filter or sort order on a property will never match an entity with that property unindexed. Properties with such values behave as if the property is not set with regard to query filters and sort orders.
To specify that a property not be indexed, you modify the property definition by setting indexed=False
in the Property constructor.
class Person(db.Model): name = db.StringProperty() age = db.IntegerProperty(indexed=False) # Raises PropertyError because 'age' is not indexed. db.GqlQuery("SELECT * FROM Person WHERE age > :1", 18)
If you have existing records created with an unindexed property, that property continues to be unindexed for those records even after you change the entity (class) definition to make the property indexed again. Consequently, those records will not be returned by queries filtering on that property. To change this, you must rewrite those records, that is, do a fetch then a put on each one. The act of rewriting creates new index rows for those formerly unindexed properties. You can use the Mapreduce API to simplify this work.
To make a formerly unindexed property be indexed
indexed=True
in the Property constructor:
class Person(db.Model): name = db.StringProperty() age = db.IntegerProperty(indexed=True)
If you change a property from indexed to unindexed, the index records for all existing entities with that property will continue to exist until you update (or delete) the entity associated with those index records. Any new entities are created without the index records for the now unindexed property. So you need to purge your code of all queries that filter or sort by the property that is now unindexed.
When two entities have properties of the same name but of different value types, an index of the property sorts the entities first by value type, then by an order appropriate to the type. For example, if two entities each have a property named "age," one with an integer value and one with a string value, the entity with the integer value always appears before the entity with the string value when sorted by the "Age" property, regardless of the values themselves.
This is especially worth noting in the case of integers and floating point numbers, which are treated as separate types by the datastore. A property with the integer value 38
is sorted before a property with the floating point value 37.5
, because all integers are sorted before floats.
App Engine builds indexes for several simple queries by default. For other queries, the application must specify the indexes it needs in a configuration file named index.yaml
. If the application running under App Engine tries to perform a query for which there is no corresponding index (either provided by default or described in index.yaml
), the query will fail with a NeedIndexError exception.
App Engine suggests automatic indexes for the following forms of queries:
Other forms of queries require their indexes to be specified in index.yaml
, including:
Note: The App Engine SDK suggests indexes that are appropriate for most applications. Depending on your application's use of the datastore and the size and shape of your application's data, manual adjustments to your indexes may be warranted. For example, writing entities with multiple property values may result in an exploding index with high write costs. For more information, see Big Entities and Exploding Indexes.
The development web server makes managing index configuration easy: Instead of failing to execute a query that does not have an index and requires it, the development web server can generate configuration for an index that would allow the query to succeed. If your local testing of an application calls every possible query the application will make, using every combination of filter and sort order, the generated entries will represent a complete set of indexes. If your testing does not exercise every possible query form, you can review and adjust the index configuration before uploading the application.
Note: One way to avoid big entities and exploding indexes is described in detail in the article Index Selection and Advanced Search.
As described above, every property (that doesn't have a Text
or Blob
value) of every entity is added to at least one index table, including a simple index provided by default, and any indexes described in the application's index.yaml
file that refer to the property. For an entity that has one value for each property, App Engine stores a property value once in its simple index, and once for each time the property is referred to in a custom index. Each of these index entries must be updated every time the value of the property changes.
The datastore limits the number of index entries that a single entity can have. The index entry limit is large, and most applications are not affected. However, there are some circumstances where you might encounter the limit. For example, an entity with a large number of single-value properties can exceed the index entry limit.
Properties with multiple values store each value as a separate entry in an index. An entity with a single property with a large number of values can exceed the index entry limit.
Custom indexes that refer to multiple properties with multiple values can get very large with only a few values. To completely record such properties, the index table must include a row for every permutation of the values of every property for the index.
For example, consider the following query:
SELECT * FROM MyModel WHERE x=1 AND y=2 ORDER BY date
This query causes the SDK to suggest the following index:
indexes: - kind: MyModel properties: - name: x - name: y - name: date
The following code creates an entity with four values for the property x
, three values for the property y
, and sets the date
to the current date:
class MyModel(db.Expando): pass e2 = MyModel() e2.x = [1, 2, 3, 4] e2.y = ['red', 'green', 'blue'] e2.date = datetime.datetime.now() e2.put()
To accurately represent multiple properties with multiple values, the index must store a value for each permutation of values contained in the properties x
, y
, and date
, as modeled by the following equation:
index_value_count = |x| * |y| * |date|
These indexes are called "exploding indexes" because they can become very large with just a few values. Exploding indexes cause entity writes to cost more because you're writing more data. They can also very easily cause an entity to exceed the per-entity index value limit.
For example, the code snippet above generates the following values for x
, y
, and date
:
x = [1, 2, 3, 4] y = ['red', 'green', 'blue'] date = <now>
This index stores the following permutations:
x value |
y value |
date value |
---|---|---|
1 |
'blue' |
<now> |
1 |
'green' |
<now> |
1 |
'red' |
<now> |
2 |
|
<now> |
2 |
'green' |
<now> |
2 |
'red' |
<now> |
3 |
|
<now> |
3 |
'green' |
<now> |
3 |
'red' |
<now> |
4 |
|
<now> |
4 |
'green' |
<now> |
4 |
'red' |
<now> |
Total: 12 values saved to the datastore. |
You can avoid exploding indexes by defining these indexes instead in
index.yaml
:
indexes: - kind: MyModel properties: - name: x - name: date - kind: MyModel properties: - name: y - name: date
With the index above, the number of values stored for the same query is additive:
index_value_count = |x| * |date| + |y| * |date|
x value |
date value |
y value |
date value |
---|---|---|---|
1 |
<now> |
'red' |
<now> |
2 |
<now> |
'green' |
<now> |
3 |
<now> |
'blue' |
<now> |
4 |
<now> |
||
Total: 7 values saved to the datastore. |
The App Engine SDK can detect exploding indexes only when the same property is repeated multiple times. In this case, the SDK suggests an alternative index. However, in all other circumstances (such as indexes generated for queries like the one in this example), the SDK generates an exploding index. In this circumstance, you can manually configure an index in index.yaml
to circumvent the exploding index.
If a put()
would result in a number of index entries that exceeds the limit, the call fails with an exception. If you create a new index that would contain a number of index entries that exceeds the limit for any entity when built, queries against the index fail, and the index appears in the "Error" state in the Admin Console.
To handle "Error" indexes, first remove them from your index.yaml
file and run appcfg.py vacuum_indexes
. Then, either reformulate the index definition and corresponding queries or remove the entities that are causing the index to "explode." Finally, add the index back to index.yaml
and run appcfg.py update_indexes
.
You can avoid exploding indexes by avoiding queries that would require a custom index using a list property. As described above, this includes queries with multiple sort orders or queries with a mix of equality and inequality filters.
An application has an index for each combination of kind, filter property, and operator, as well as sort order used in a query. For example, see the following query, stated in GQL:
SELECT * FROM Person WHERE last_name = "Smith" AND height < 72 ORDER BY height DESC
The index for this query is a table of keys for entities of the kind Person
, with columns for the values of the height
and last_name
properties. The index is sorted by height
in descending order.
Two queries of the same form but with different filter values use the same index. For example, the following query uses the same index as the query above:
SELECT * FROM Person WHERE last_name = "Jones" AND height < 63 ORDER BY height DESC
The following two queries use the same index as well, despite their different forms:
SELECT * FROM Person WHERE last_name = "Friedkin" AND first_name = "Damian" ORDER BY height DESC
SELECT * FROM Person WHERE last_name = "Blair" ORDER BY first_name, height DESC
Keys are ordered first by parent path, then by kind, then by key name or ID. Kinds and key names are strings and are ordered by byte value. IDs are integers and are ordered numerically. If entities of the same parent and kind use a mix of key name strings and numeric IDs, entities with numeric IDs are considered to be less than entities with key name strings. Elements of the parent path are compared similarly: by kind (string), then by key name (string), or ID (number).
Queries involving keys use indexes just like queries involving properties. Queries on keys require custom indexes in the same cases as with properties, with a couple of exceptions: inequality filters or an ascending sort order on __key__
do not require a custom index, but a descending sort order on __key__
does. As with all queries, the development server creates appropriate configuration entries in this file when a query that needs a custom index is tested.
Query cursors allow an app to perform a query and retrieve a batch of results, then fetch additional results for the same query in a subsequent web request without the overhead of a query offset. After the app fetches some results for a query, it can ask for an encoded string that represents the location in the result set after the last result fetched (the "cursor"). The app can use the cursor to fetch additional results starting from that point at a later time.
A cursor is an opaque base64-encoded string that represents the next starting position of a query after a fetch operation. The app can embed the cursor in web pages as HTTP GET or POST parameters, or it can store the cursor in the datastore or memcache, or in a task queue task payload. A future request handler can perform the same query and include the cursor with the query to tell the datastore to start returning results from the location represented by the cursor. A cursor can only be used by the app that performed the original query, and can only be used to continue the same query.
Tip: It is generally safe to pass a datastore cursor to a client, such as in a web form, and accept a cursor value from a client. A client cannot change the cursor value to access results outside of the original query. However, the base64-encoded value can be decoded to expose information about result entities, such as the key (app ID, kind, key name or ID, and all ancestor keys) and properties used in the query (including filters and sort orders). If you don't want users to have access to that information, you can encrypt the cursor, or store it and provide the user with an opaque key.
The cursor's position is defined as the location in the result list after the last result returned. A cursor is not a relative position in the list (it's not an offset); it's a marker to which the datastore can jump when starting an index scan for results. If the results for a query change between uses of a cursor, the query notices only changes that occur in results after the cursor. If a new result appears before the cursor's position for the query, it will not be returned when the results after the cursor are fetched. Similarly, if an entity is no longer a result for a query but had appeared before the cursor, the results that appear after the cursor do not change. If the last result returned is removed from the result set, the cursor still knows how to locate the next result.
An interesting application of cursors is to monitor entities for unseen changes. If the app sets a timestamp property with the current date and time every time an entity changes, the app can use a query sorted by the timestamp property, ascending, with a datastore cursor to check when entities are moved to the end of the result list. If an entity's timestamp is updated, the query with the cursor returns the updated entity. If no entities were updated since the last time the query was performed, no results are returned, and the cursor does not move.
When retrieving query results, you can use both a start cursor and an end cursor to return a continuous group of results from the datastore. When using a start and end cursor to retrieve the results, you are not guaranteed that the size of the results will be the same as when you generated the cursors. Entities may be added or deleted from the datastore between the time the cursors are generated and when they are used in a query.
In Python, the app gets the cursor after fetching results by calling the cursor()
method of the Query
object. To fetch additional results, the app prepares a similar query (with the same filters, sort orders and ancestors), and calls the with_cursor()
method with the cursor before executing the query.
from google.appengine.api import memcache from google.appengine.ext import db # class Person(db.Model): ... # Start a query for all Person entities. people = Person.all() # If the app stored cursors during a previous request, use them. start_cursor = memcache.get('person_start_cursor') end_cursor = memcache.get('person_end_cursor') if start_cursor: people.with_cursor(start_cursor=start_cursor) if end_cursor: people.with_cursor(end_cursor=end_cursor) # Iterate over the results. for person in people: # Do something
Note that because of how the iterator interface fetches results in batches, getting a cursor may result in an additional call to the datastore to position the cursor where the iterator left off. If using only a start cursor, and if you know how many results you need ahead of time, it's faster to use fetch()
.
A few things to know about cursors:
IN
or !=
filter operators.datastore_errors.BadRequestError
.In order to increase data availability, you can set the datastore read policy so that all reads and queries are eventually consistent. While the API also allows you to explicitly set a strong consistency policy, in the High Replication Datastore non-ancestor queries are always eventually consistent. Setting the read policy to strong consistency for a non-ancestor query in the High Replication Datastore will have no effect.
When you select eventual consistency for a datastore query, the indexes used by the query to gather results are also accessed with eventual consistency. Eventual consistency queries occasionally return entities that don't match the query criteria, while strong consistency queries are always transactionally consistent. (You can use transactions to ensure a consistent result set if the query uses an ancestor filter.) See Transaction Isolation in App Engine for more information on how entities and indexes are updated.
You can set the read policy (strong consistency vs. eventual consistency) and the datastore call deadline when you execute a query. To do this, create a configuration object with these options set, then pass the object to the method that performs the query.
To create the configuration object, you call the function create_config() in the google.appengine.ext.db
module. The read_policy
argument specifies the read policy, as either db.EVENTUAL_CONSISTENCY
or db.STRONG_CONSISTENCY
. (The default is strong consistency.) The deadline
argument specifies the datastore call deadline, as a number of seconds.
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY)
To use this configuration with the fetch() method, pass the object to the method as the config
argument:
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY) results = Employee.all().fetch(10, config=config)
To use this configuration with the iterator interface, call the run() method with this object as the config
argument to get the iterable:
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY) for result in Kind.all().run(config=config): # ...
The get() and count() methods of Query GqlQuery also support the config
argument in this way.
A configuration object can be used any number of times.