NDB is an experimental, innovative, and rapidly changing new feature for App Engine. Unfortunately, being on the bleeding edge means that we may make backwards-incompatible changes to NDB. We will inform the community when this feature is no longer experimental.
An application can use queries to search the Datastore for entities that match specific search criteria called filters.
An application can use queries to search the Datastore for entities that match specific search criteria called filters. For example, an application that keeps track of several guestbooks could use a query to retrieve messages from one guestbook, ordered by date:
from google.appengine.ext import ndb class Greeting(ndb.Model): """Models an individual Guestbook entry with content and date.""" content = ndb.StringProperty() date = ndb.DateTimeProperty(auto_now_add=True) @classmethod def query_book(cls, ancestor_key): return cls.query(ancestor=ancestor_key).order(-cls.date) class MainPage(webapp.RequestHandler): def get(self): guestbook_name = self.request.get('guestbook_name') ancestor_key = ndb.Key("Book", guestbook_name or "*notitle*") greetings = Greeting.query_book(ancestor_key).fetch(20) self.response.out.write('<html><body>') for greeting in greetings: self.response.out.write('<blockquote>%s</blockquote>' % cgi.escape(greeting.content))
The App Engine Datastore natively supports filters for
exact matches (the == operator) and
comparisons (the <, <=, > and >= operators).
It supports combining multiple filters using a Boolean
AND
operator, with some limitations (see below).
In addition to the native operators,
the API supports the !=
operator,
combining groups of filters using the Boolean OR
operator,
and the IN
operator,
which test for equality to one of a list of possible values
(like Python's 'in' operator).
These operations don't map 1:1 to the Datastore's
native operators; thus they are
a little quirky and slow, relatively.
They are implemented using in-memory merging of
result streams. Note that p != v
is
implemented as "p < v OR p > v".
(This matters for
repeated properties.)
Limitations: The Datastore enforces some restrictions on queries. Violating these will cause it to raise exceptions. For example, combining too many filters, using inequalities for multiple properties, or combining an inequality with a sort order on a different property are all currently disallowed. Also filters referencing multiple properties sometimes require secondary indexes to be configured.
Unsupported: The Datastore does not directly support substring matches, case-insensitive matches, or so-called full-text search. There are ways to implement case-insensitive matches and even full-text search using computed properties.
Usually you don't want to retrieve all entities of a given kind; you want only those with a specific value or range of values for some property.
Property objects overload some operators to return filter expressions that can be used to control a query: for example, to find all Account entities whose userid property has the exact value 42, you can use the expression
qry = Account.query(Account.userid == 42)
NDB supports these operators:
property == value
property < value
property <= value
property > value
property >= value
property != value
property.IN([value1, value2])
To filter for an inequality, you can use syntax like the following:
qry = Account.query(Account.userid >= 40)
This finds all Account entities whose userid
property is
greater than or equal to 40.
Two of these operators, != and IN, are implemented as combinations of the others, and are a little quirky as described in != and IN.
You can specify multiple filters:
qry = Account.query(Account.userid >= 40, Account.userid < 50)
This combines the specified filter arguments, returning all Account entities whose userid value is greater than or equal to 40 and less than 50.
Note: As mentioned earlier, the Datastore rejects queries using inequality filtering on more than one property.
Instead of specifying an entire query filter in a single expression, you may find it more convenient to build it up in steps: for example:
qry1 = Account.query() # Retrieve all Account entitites qry2 = qry1.filter(Account.userid >= 40) # Filter on userid >= 40 qry3 = qry2.filter(Account.userid < 50) # Filter on userid < 50 as well
qry3
is equivalent to the qry
variable from the
previous example. Note that query objects are immutable, so the
construction of qry2
does not affect qry1
and the construction of qry3
does not affect
qry1
or qry2
.
The !=
(not-equal) and IN
(membership)
operators are implemented by combining other filters using the
OR
operator. The first of these,
property != value
is implemented as
(property < value) OR (property > value)
For example,
qry = Article.query(Article.tags != 'perl')
is equivalent to
qry = Article.query(query.OR(Article.tags < 'perl', Article.tags > 'perl'))
Note:
Perhaps surprisingly, this query does not search for
Article
entities that don't include 'perl' as a tag!
Rather, it finds all entities with at least one tag unequal to 'perl'.
For example, the following entity would be included in the results,
even though it has 'perl' as one of its tags:
Article(title='Perl + Python = Parrot',
However, this one would not be included:
stars=5,
tags=['python', 'perl'])
Article(title='Introduction to Perl',
stars=3,
tags=['perl'])
There is no way to query for entities that do not
include a tag equal to 'perl'.
Similarly, the IN operator
property IN [value1, value2, ...]
which tests for membership in a list of possible values, is implemented as
(property == value1) OR (property == value2) OR ...
For example,
qry = Article.query(Article.tags.IN(['python', 'ruby', 'php']))
is equivalent to
qry = Article.query(query.OR(Article.tags == 'python', Article.tags == 'ruby', Article.tags == 'php'))
Note:
Queries using OR
de-duplicate their results: the result stream doesn't
include entity more than once, even if an entity matches two or more subqueries.
The Article
class defined in the preceding section also
serves as an example of querying for repeated properties. Notably, a filter
like
Article.tags == 'python'
uses a single value, even though Article.tags
is a
repeated property. You cannot compare repeated properties to list
objects (the Datastore won't understand it), and a filter like
Article.tags.IN(['python', 'ruby', 'php'])
does something completely different from searching for
Article
entities whose tags value is the list
['python', 'ruby', 'php']
:
it searches for entities whose tags
value (regarded as a list)
contains at least one of those values.
Querying for a value of None
on a repeated property has
undefined behavior; don't do that.
You can nest AND
and OR
operators arbitrarily:
for example,
qry = Article.query(query.AND(Article.tags == 'python', query.OR(Article.tags.IN(['ruby', 'jruby']), query.AND(Article.tags == 'php', Article.tags != 'perl'))))
Due to OR
's implementation, a query of this form that is
too complex might fail with an exception.
These filters are normalized so that there is (at most)
a single OR
operator at the top of the expression tree and a
single
level of AND operators right below that. This expansion uses the standard
rules for obtaining a disjunctive normal form for a Boolean expression,
along with the expansions already given for !=
and
IN
. Altogether, the normalized form of the example above is
(using an informal notation):
OR(AND(tags == 'python', tags == 'ruby'), AND(tags == 'python', tags == 'jruby'), AND(tags == 'python', tags == 'php', tags < 'perl'), AND(tags == 'python', tags == 'php', tags > 'perl'))
Caution:
For some filters, this normalization can cause a combinatorial
explosion. Consider the AND
of 3 OR
clauses with 2 basic clauses each.
When normalized, this becomes an OR
of 8 AND
clauses with 3 basic clauses each: that is, 6 terms become 24.
You can use the order()
method to specify the order in
which a query returns its results. This method takes a list of
arguments, each of which is either a property object (to be sorted in
ascending order) or its negation (denoting descending order). For example:
qry = Greeting.query().order(Greeting.message, -Greeting.userid)
This retrieves all Greeting
entities, sorted by
ascending value of their message
property.
Runs of consecutive entities with the same message property will be
sorted by descending value of their userid
property.
You can use multiple order()
calls to the same effect:
qry = Greeting.query().order(Greeting.message).order(-Greeting.userid)
Note: When combining filters
with order()
, the Datastore rejects certain combinations.
In particular, when you use an inequality filter, the first sort order
(if any) must specify the same property as the filter.
Also, you sometimes need to configure a secondary index.
Ancestor queries constrain query results to entities descended from a specific ancestor, identified by its key:
key = ndb.Key(BlogPost, 12345) qry = Comment.query(ancestor=key)
You can combine this with filters and sort orders:
qry = Comment.query(Comment.tags == 'python', ancestor=key).order(Comment.date)
This is particularly useful inside transactions, which are limited to operating on entities descended from a common ancestor (or, for cross-group transactions, a few common ancestors).
Query objects have the following read-only data attributes:
Attribute | Type | Default | Description |
---|---|---|---|
kind | str | None | Kind name (usually the class name) |
ancestor | Key | None | Ancestor specified to query |
filters | FilterNode | None | Filter expression |
orders | Order | None | Sort orders |
Printing a query object (or calling str()
or
repr()
on it)
produces a nicely-formatted string representation:
print Employee.query() Query(kind='Employee') print Employee.query(ancestor=Key(Manager, 1)) Query(kind='Employee', ancestor=Key('Manager', 1))
A query can filter directly for the field values of structured properties.
For example, a query for all contacts with an address whose city is
'Amsterdam'
would look like
Contact.query(Contact.address.city == 'Amsterdam')
If you combine multiple such filters, the filters may match
different Address
sub-entities within
the same Contact entity.
For example:
Contact.query(Contact.address.city == 'Amsterdam', # Beware! Contact.address.street == 'Spear St')
may find contacts with an address whose city is
'Amsterdam'
and another (different) address whose street is
'Spear St'
. However, at least for equality filters, you can
create a query that returns only results with multiple values in a
single sub-entity:
Contact.query(Contact.address == Address(city='San Francisco', street='Spear St'))
If a sub-entity has any property values equal to None
,
they are ignored. Thus, it doesn't make sense to filter for
a sub-entity property value of None
.
While a query is in progress, its state is held in an
iterator object. (Most applications won't use them directly; it's
normally more straightforward to call fetch(20)
than
to manipulate the iterator object.)
There are two basic ways to get such an object:
iter()
function on a
Query
object
Query
object's iter()
method
The first supports the use of a Python for
loop
(which implicitly calls the iter()
function)
to loop over a query.
for greeting in greetings: self.response.out.write('<blockquote>%s</blockquote>' % cgi.escape(greeting.content))
The second way, using the Query
object's
iter()
method, allows you to pass options to the
iterator to affect its behavior. For example, to use a
keys-only query in a for
loop, you can write this:
for key in qry.iter(keys_only=True): print key
Query iterators have other useful methods:
Method | Description |
---|---|
__iter__()
| Part of Python's iterator protocol. |
next()
| Returns the next result
or raises the exception StopIteration if there is none. |
has_next()
| Returns True if a subsequent next()
call will return a result, False if it will raise
StopIteration .Blocks until the answer to this question is known and buffers the result (if any) until you retrieve it with next() .
|
probably_has_next()
| Like has_next() , but uses a faster
(and sometimes inaccurate) shortcut.May return a false positive ( True when
next() would actually raise StopIteration ),
but never a false negative (False when
next() would actually return a result).
|
cursor_before()
| Returns a query cursor representing a point just before the
last result returned. Raises an exception if no cursor is available (in particular, if the produce_cursors query option was not passed).
|
cursor_after()
| Returns a query cursor representing a point just after the last
result returned. Raises an exception if no cursor is available (in particular, if the produce_cursors query option was not passed).
|
A query cursor is a small opaque data structure representing a resumption point in a query. Using cursors takes several steps:
produce_cursors
set.
start_cursor
or end_cursor
option, to resume the query from or restrict it to this point.
Here is an example:
from google.appengine.datastore.datastore_query import Cursor q = Person.query() it = q.iter(produce_cursors=True) p1 = it.next() print p1.key p2 = it.next() print p2.key p3 = it.next() print p3.key c = it.cursor_after() # The position right after p3 cstr = c.to_websafe_string() print cstr # Start a new query that continues where the first one stopped c2 = Cursor.from_websafe_string(cstr) it2 = q.iter(start_cursor=c2) p4 = it2.next() print p4.key p5 = it2.next() print p5.key
Note the use of to_websafe_string()
and
from_websafe_string()
to serialize and deserialize the cursor.
This allows you to pass a cursor to a client on the web in the
response to one request, and receive it back from the client in a later
request.
Occasionally, calling cursor_after()
or
cursor_before()
may make a blocking Datastore call, rerunning
part of the query in order to extract a cursor that points to the middle of
a batch.
The Query
method fetch_page()
simplifies the process by doing most of the work for you.
It works somewhat like fetch()
, but it returns a triple
(results, cursor, more)
.
The returned more
flag indicates whether there are more
results; a UI can use this, for example, to suppress a
"Next Page" button or link.
To request subsequent pages, pass the cursor returned by one
fetch_page()
call into the next.
The example above simplifies to:
from google.appengine.datastore.datastore_query import Cursor q = Person.query() res, c, more = q.fetch_page(3) print res