Google App Engine

Entities, Properties, and Keys

Overview

Objects in the App Engine Datastore are known as entities. An entity has one or more named properties, each of which can have one or more values. Entities of the same kind need not have the same properties, and an entity's values for a given property need not all be of the same data type. (If appropriate, an application can establish and enforce such restrictions in its own data model.)

The Datastore supports a variety of data types for property values. These include, among others:

  • Integers
  • Floating-point numbers
  • Strings
  • Dates
  • Binary data

Each entity in the Datastore has a key that uniquely identifies it. The key consists of the following components:

  • The kind of the entity, which categorizes it for the purpose of Datastore queries
  • An identifier for the individual entity, which can be either
    • a key name string
    • an integer numeric ID
  • An optional ancestor path locating the entity within the Datastore hierarchy

An application has access only to entities it has created itself; it can't access data belonging to other applications. It can fetch an individual entity from the Datastore using the entity's key, or it can retrieve one or more entities by issuing a query based on the entities' keys or property values.

The Java App Engine SDK includes a simple Java API, provided in the package com.google.appengine.api.datastore, that supports the features of the Datastore directly. All of the examples in this document are based on this low-level Java API; you can choose to use it either directly in your application or as a basis on which to build your own data management layer. The Datastore itself does not enforce any restrictions on the structure of entities, such as whether a given property has a value of a particular type; this task is left to the application.

Note: In addition to the low-level Java API, the SDK also supports two standard Java interfaces for data storage, Java Data Objects (JDO) and the Java Persistence API (JPA), which you can use to manage and enforce the structure of entities. These interfaces let you model your data objects as Java classes, making it easier to port your application between the App Engine Datastore and other data storage solutions.

Kinds and Identifiers

Each Datastore entity is of a particular kind, which categorizes the entity for the purpose of queries: for instance, a human resources application might represent each employee at a company with an entity of kind Employee. In the Java Datastore API, you specify an entity's kind when you create it, as an argument to the Entity() constructor. The following example creates an entity of kind Employee, populates its property values, and saves it to the Datastore:

import java.util.Date;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;

// Obtain a handle to the Datastore service.
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

// Create an entity of kind Employee.
Entity employee = new Entity("Employee");

// Populate the entity's properties.
employee.setProperty("firstName", "Antonio");
employee.setProperty("lastName", "Salieri");

Date hireDate = new Date();
employee.setProperty("hireDate", hireDate);

employee.setProperty("attendedHrTraining", true);

// Save the entity to the Datastore.
datastore.put(employee);

In addition to a kind, each entity has an identifier, assigned when the entity is created. Because the identifier is part of the entity's key, it is associated permanently with the entity and cannot be changed. It can be assigned in either of two ways:

  • Your application can specify its own identifier string for the entity (called the key name).
  • You can have the Datastore automatically assign the entity an integer numeric ID.

Note: Instead of using key name strings or generating numeric IDs automatically, advanced applications may sometimes wish to assign their own numeric IDs manually to the entities they create. Be aware, however, that there is nothing to prevent the Datastore from assigning one of your manual numeric IDs to another entity. The only way to avoid such conflicts is to have your application obtain a block of IDs with the methods DatastoreService.allocateIds() or AsyncDatastoreService.allocateIds(). The Datastore's automatic ID generator will keep track of IDs that have been allocated with these methods and will avoid reusing them for another entity, so you can safely use such IDs without conflict.

To assign an entity a key name, provide the name as the second argument to the constructor when you create the entity:

Entity employee = new Entity("Employee", "asalieri");

To have the Datastore assign a numeric ID automatically, omit this argument:

Entity employee = new Entity("Employee");

Ancestor Paths

Entities in the Datastore form a hierarchically structured space similar to the directory structure of a file system. When you create an entity, you can optionally designate another entity as its parent; the new entity is a child of the parent entity. This association between an entity and its parent is permanent, and cannot be changed once the entity is created. An entity without a parent is a root entity. The Datastore will never assign the same numeric ID to two entities with the same parent, or to two root entities (those without a parent).

To designate an entity's parent, provide the parent entity's key as an argument to the Entity() constructor when creating the child entity. You can get the key by calling the parent entity's getKey() method:

Entity employee = new Entity("Employee");
datastore.put(employee);

Entity address = new Entity("Address", employee.getKey());
datastore.put(address);

If the new entity also has a key name, provide the key name as the second argument to the Entity() constructor and the key of the parent entity as the third argument:

Entity address = new Entity("Address", "addr1", employee.getKey());

An entity's parent, parent's parent, and so on recursively, are its ancestors; its children, children's children, and so on, are its descendants. The sequence of entities beginning with a root entity and proceeding from parent to child, leading to a given entity, constitutes that entity's ancestor path. The complete key identifying the entity consists of a sequence of kind-identifier pairs specifying its ancestor path and terminating with those of the entity itself:

Person:GreatGrandpa / Person:Grandpa / Person:Dad / Person:Me

For a root entity, the ancestor path is empty and the key consists solely of the entity's own kind and identifier:

Person:GreatGrandpa
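
The structure of a complete key can be sketched in plain Java. The class below is only an illustrative model, not part of the Datastore API: KeyPathSketch, format, and ancestorCount are hypothetical names, and a key is represented simply as alternating kind and identifier strings.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; the real API represents keys with the Key class.
class KeyPathSketch {
    // Rejects argument lists that are empty or not made of whole pairs.
    private static void checkPairs(String[] kindsAndIds) {
        if (kindsAndIds.length == 0 || kindsAndIds.length % 2 != 0) {
            throw new IllegalArgumentException("expected one or more kind/identifier pairs");
        }
    }

    // Joins the pairs into the "Kind:id / Kind:id / ..." notation used above.
    static String format(String... kindsAndIds) {
        checkPairs(kindsAndIds);
        List<String> elements = new ArrayList<>();
        for (int i = 0; i < kindsAndIds.length; i += 2) {
            elements.add(kindsAndIds[i] + ":" + kindsAndIds[i + 1]);
        }
        return String.join(" / ", elements);
    }

    // The last pair is the entity itself; every pair before it is an ancestor.
    static int ancestorCount(String... kindsAndIds) {
        checkPairs(kindsAndIds);
        return kindsAndIds.length / 2 - 1;
    }

    public static void main(String[] args) {
        System.out.println(format("Person", "GreatGrandpa", "Person", "Grandpa",
                                  "Person", "Dad", "Person", "Me"));
        // A root entity has an empty ancestor path.
        System.out.println(ancestorCount("Person", "GreatGrandpa"));
    }
}
```

In this model, a root entity is simply a key with a single pair, so its ancestor count is zero.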

Transactions and Entity Groups

Every attempt to create, update, or delete an entity takes place in the context of a transaction. A single transaction can include any number of such operations. To maintain the consistency of the data, the transaction ensures that all of the operations it contains are applied to the Datastore as a unit or, if any of the operations fails, that none of them are applied.

Note: If your application receives an exception when submitting a transaction, it does not necessarily mean that the transaction has failed. It is possible to receive a DatastoreTimeoutException, ConcurrentModificationException, or DatastoreFailureException even when a transaction has been committed and will eventually be applied successfully. Whenever possible, structure your Datastore transactions so that the end result will be unaffected if the same transaction is applied more than once.
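
To illustrate why repeat-safe updates matter, here is a minimal plain-Java sketch. It does not use the Datastore API: a Map stands in for an entity's properties, and the property name vacationDays and both method names are hypothetical. It contrasts an update whose result depends on how many times it is applied with one that gives the same result however often it is retried.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; a Map stands in for an entity's properties.
class IdempotentUpdateSketch {
    // Not repeat-safe: every application adds another day.
    static void addVacationDay(Map<String, Long> employee) {
        employee.merge("vacationDays", 1L, Long::sum);
    }

    // Repeat-safe: applying it once or several times leaves the same value.
    static void setVacationDays(Map<String, Long> employee, long days) {
        employee.put("vacationDays", days);
    }

    public static void main(String[] args) {
        Map<String, Long> employee = new HashMap<>();
        employee.put("vacationDays", 10L);

        // Simulate a retry after an ambiguous exception: the update runs twice.
        addVacationDay(employee);
        addVacationDay(employee);
        System.out.println(employee.get("vacationDays")); // 12, not the intended 11

        setVacationDays(employee, 11L);
        setVacationDays(employee, 11L);
        System.out.println(employee.get("vacationDays")); // 11 either way
    }
}
```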

A single transaction can apply to multiple entities, so long as the entities are descended from a common ancestor. Such entities are said to belong to the same entity group. In designing your data model, you should determine which entities you need to be able to process in the same transaction. Then, when you create those entities, place them in the same entity group by declaring them with a common ancestor. This tells App Engine that the entities will be updated together, so it can store them in a way that supports transactions.
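
As a rough plain-Java model of this rule (not the Datastore API), the sketch below treats a key as a list of "Kind:identifier" path elements. It assumes that an entity group is identified by the root element of the path, so two entities share a group when their paths start at the same root. The class and method names, and the key name wmozart, are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; key paths are lists of "Kind:identifier" strings.
class EntityGroupSketch {
    // The root element of a key path identifies the entity group in this model.
    static String root(List<String> keyPath) {
        if (keyPath.isEmpty()) {
            throw new IllegalArgumentException("key path must not be empty");
        }
        return keyPath.get(0);
    }

    // Two entities can take part in one transaction if they share a root.
    static boolean sameEntityGroup(List<String> a, List<String> b) {
        return root(a).equals(root(b));
    }

    public static void main(String[] args) {
        List<String> employee = Arrays.asList("Employee:asalieri");
        List<String> address = Arrays.asList("Employee:asalieri", "Address:addr1");
        List<String> other = Arrays.asList("Employee:wmozart");

        System.out.println(sameEntityGroup(employee, address)); // true
        System.out.println(sameEntityGroup(address, other));    // false
    }
}
```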

Properties and Value Types

The data values associated with an entity consist of one or more properties. Each property has a name and one or more values. A property can have values of more than one type, and two entities can have values of different types for the same property.

Tip: Properties with multiple values can be useful, for instance, when performing queries with equality filters: an entity satisfies the query if any of its values for a property matches the value specified in the filter. For more details on multiple-valued properties, including issues you should be aware of, see the Queries and Indexes page.
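
The matching rule in the tip can be modeled in a few lines of plain Java. This is only a sketch of the semantics, not of how the Datastore evaluates queries; EqualityFilterSketch and matches are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustrative model only: an equality filter on a multiple-valued property.
class EqualityFilterSketch {
    // The entity matches if any one of its values equals the filter value.
    static boolean matches(List<?> propertyValues, Object filterValue) {
        for (Object value : propertyValues) {
            if (Objects.equals(value, filterValue)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("this", "that", "theOther");
        System.out.println(matches(values, "that"));  // true
        System.out.println(matches(values, "other")); // false
    }
}
```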

The following value types are supported:

Value type | Java type(s) | Sort order | Notes
--- | --- | --- | ---
Integer | short, int, long, java.lang.Short, java.lang.Integer, java.lang.Long | Numeric | Stored as long integer, then converted to the field type. Out-of-range values overflow.
Floating-point number | float, double, java.lang.Float, java.lang.Double | Numeric | 64-bit double precision, IEEE 754
Boolean | boolean, java.lang.Boolean | false < true |
Text string (short) | java.lang.String | Unicode | Up to 500 Unicode characters. Values longer than 500 characters throw IllegalArgumentException.
Text string (long) | com.google.appengine.api.datastore.Text | None | Up to 1 megabyte. Not indexed.
Byte string (short) | com.google.appengine.api.datastore.ShortBlob | Byte order | Up to 500 bytes. Values longer than 500 bytes throw IllegalArgumentException.
Byte string (long) | com.google.appengine.api.datastore.Blob | None | Up to 1 megabyte. Not indexed.
Date and time | java.util.Date | Chronological |
Geographical point | com.google.appengine.api.datastore.GeoPt | By latitude, then longitude |
Postal address | com.google.appengine.api.datastore.PostalAddress | Unicode |
Telephone number | com.google.appengine.api.datastore.PhoneNumber | Unicode |
Email address | com.google.appengine.api.datastore.Email | Unicode |
Google Accounts user | com.google.appengine.api.users.User | Email address in Unicode order |
Instant messaging handle | com.google.appengine.api.datastore.IMHandle | Unicode |
Link | com.google.appengine.api.datastore.Link | Unicode |
Category | com.google.appengine.api.datastore.Category | Unicode |
Rating | com.google.appengine.api.datastore.Rating | Numeric |
Datastore key | com.google.appengine.api.datastore.Key, or the referenced object (as a child) | By path elements (kind, identifier, kind, identifier...) |
Blobstore key | com.google.appengine.api.blobstore.BlobKey | Byte order |
Null | null | None |

For text strings and unencoded binary data (byte strings), the Datastore supports two value types:

  • Short strings (up to 500 Unicode characters or bytes) are indexed and can be used in query filter conditions and sort orders.
  • Long strings (up to 1 megabyte) are not indexed and cannot be used in query filters and sort orders.

Note: The long byte string type is named Blob in the Datastore API. This type is unrelated to blobs as used in the Blobstore API.

For values of mixed types, the Datastore uses a deterministic ordering based on the internal representations:

  1. Null values
  2. Fixed-point numbers
    • Integers
    • Dates and times
    • Ratings
  3. Boolean values
  4. Byte strings (short)
  5. Unicode strings
    • Text strings (short)
    • Postal addresses
    • Telephone numbers
    • Email addresses
    • IM handles
    • Links
    • Categories
  6. Floating-point numbers
  7. Geographical points
  8. Google Accounts users
  9. Datastore keys
  10. Blobstore keys

Note: Integers and floating-point numbers are considered separate types in the Datastore. If an entity uses a mix of integers and floats for the same property, all integers will be sorted before all floats: for example,

  7 < 3.2

Because long text strings and long byte strings are not indexed, they have no ordering defined.
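
The mixed-type ordering can be sketched as a comparator that first compares a type rank and only then compares values of the same type. The plain-Java sketch below covers just a subset of the list (nulls, integers, booleans, short text strings, and floating-point numbers) and is a model rather than the Datastore's implementation. The class and method names are hypothetical, and String.compareTo is used as an approximation of Unicode ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative model only; covers a subset of the value types listed above.
class MixedTypeOrderSketch {
    // Rank of each supported type, taken from its position in the list above.
    static int typeRank(Object v) {
        if (v == null) return 1;
        if (v instanceof Long || v instanceof Integer || v instanceof Short) return 2;
        if (v instanceof Boolean) return 3;
        if (v instanceof String) return 5;
        if (v instanceof Double || v instanceof Float) return 6;
        throw new IllegalArgumentException("type not covered by this sketch: " + v.getClass());
    }

    // Compare by type rank first; compare values only within the same type.
    static final Comparator<Object> ORDER = (a, b) -> {
        int ra = typeRank(a);
        int rb = typeRank(b);
        if (ra != rb) return Integer.compare(ra, rb);
        switch (ra) {
            case 1: return 0;
            case 2: return Long.compare(((Number) a).longValue(), ((Number) b).longValue());
            case 3: return Boolean.compare((Boolean) a, (Boolean) b);
            case 5: return ((String) a).compareTo((String) b);
            default: return Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue());
        }
    };

    // Returns a sorted copy, leaving the input list untouched.
    static List<Object> sorted(List<Object> values) {
        List<Object> copy = new ArrayList<>(values);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // The integer 7 sorts before the float 3.2 because of its type rank.
        System.out.println(sorted(Arrays.asList(3.2, 7L, "a", true, null)));
    }
}
```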

Working with Entities

Applications can use the Datastore API to create, retrieve, update, and delete entities. If the application knows the complete key for an entity (or can derive it from its parent key, kind, and identifier), it can use the key to operate directly on the entity. An application can also obtain an entity's key as a result of a Datastore query; see the Queries and Indexes page for more information.

The Java Datastore API uses methods of the DatastoreService interface to operate on entities. You obtain a DatastoreService object by calling the static method DatastoreServiceFactory.getDatastoreService():

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;

// ...
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

Creating an Entity

In Java, you create a new entity by constructing an instance of class Entity, supplying the entity's kind as an argument to the Entity() constructor. After populating the entity's properties if necessary, you save it to the Datastore by passing it as an argument to the DatastoreService.put() method. You can specify the entity's key name by passing it as the second argument to the constructor:

Entity employee = new Entity("Employee", "asalieri");
// ... set properties ...

datastore.put(employee);

If you don't provide a key name, the Datastore will automatically generate a numeric ID for the entity's key:

Entity employee = new Entity("Employee");
// ... set properties ...

datastore.put(employee);

Retrieving an Entity

To retrieve an entity identified by a given key, pass the Key object to the DatastoreService.get() method:

// Key employeeKey = ...;
// get() throws the checked EntityNotFoundException if no entity has this key.
Entity employee = datastore.get(employeeKey);

Updating an Entity

To update an existing entity, modify the properties of the Entity object, then pass it to the DatastoreService.put() method. The object data overwrites the existing entity. The entire object is sent to the Datastore with every call to put().

Note: The Datastore API does not distinguish between creating a new entity and updating an existing one. If the object's key represents an entity that already exists, the put() method overwrites the existing entity. You can use a transaction to test whether an entity with a given key exists before creating one.

Deleting an Entity

Given an entity's key, you can delete the entity with the DatastoreService.delete() method:

// Key employeeKey = ...;
datastore.delete(employeeKey);

Batch Operations

The DatastoreService methods put(), get(), and delete() (and their AsyncDatastoreService counterparts) have batch versions that accept an iterable object (of class Entity for put(), Key for get() and delete()) and use it to operate on multiple entities in a single Datastore call:

import java.util.Arrays;
import java.util.List;

// ...
Entity employee1 = new Entity("Employee");
Entity employee2 = new Entity("Employee");
Entity employee3 = new Entity("Employee");
// ...

List<Entity> employees = Arrays.asList(employee1, employee2, employee3);
datastore.put(employees);

These batch operations group the entities or keys by entity group and then perform the requested operation on each entity group in parallel on the server side. Such batch calls are faster than making separate calls for each individual entity, because they incur the overhead for only one service call.

Note: A batch put() or delete() call may succeed for some entities but not others. If it is important that the call succeed completely or fail completely, use a transaction with all affected entities in the same entity group. Attempting a batch operation inside a transaction with entities or keys belonging to multiple entity groups will result in an IllegalArgumentException.

Generating Keys

Applications can use the class KeyFactory to create a Key object for an entity from known components, such as the entity's kind and identifier. For an entity with no parent, pass the kind and identifier (either a key name string or a numeric ID) to the static method KeyFactory.createKey() to create the key. The following examples create a key for an entity of kind Person with key name "GreatGrandpa" or numeric ID 74219:

// Key identified by a key name:
Key k = KeyFactory.createKey("Person", "GreatGrandpa");

// Key identified by a numeric ID:
Key k = KeyFactory.createKey("Person", 74219);

If the key includes a path component, you can use the helper class KeyFactory.Builder to build the path. This class's addChild method adds a single entity to the path and returns the builder itself, so you can chain together a series of calls, beginning with the root entity, to build up the path one entity at a time. After building the complete path, call getKey to retrieve the resulting key:

Key k = new KeyFactory.Builder("Person", "GreatGrandpa")
                     .addChild("Person", "Grandpa")
                     .addChild("Person", "Dad")
                     .addChild("Person", "Me")
                     .getKey();

Class KeyFactory also includes the static methods keyToString and stringToKey for converting between keys and their string representations:

String personKeyStr = KeyFactory.keyToString(personKey);
// ...

Key    personKey = KeyFactory.stringToKey(personKeyStr);
Entity person    = datastore.get(personKey);

The string representation of a key is "web-safe": it does not contain characters considered special in HTML or in URLs.

Note: The KeyFactory.keyToString method is different from Key.toString, which returns a human-readable string suitable for use in debugging and logging. If you need a string value that can be converted to a usable key, use KeyFactory.keyToString.

Note also that a key's string representation is not encrypted: a user can decode the key string to extract its components, including the kinds and identifiers of the entity and its ancestors. If it is important to conceal this information from the user, you must encrypt the key string yourself before sending it to the user.

Understanding Write Costs

When your application executes a Datastore put() operation, the Datastore must perform a number of writes to store the entity. Your application is charged for each of these writes. You can see how many writes will be required to store an entity by looking at the data viewer in the SDK Development Console. This section explains how App Engine calculates these values.

Every entity requires a minimum of two writes to store: one for the entity itself and another for the built-in EntitiesByKind index, which is used by the query planner to service a variety of queries. In addition, the Datastore maintains two other built-in indexes, EntitiesByProperty and EntitiesByPropertyDesc, which provide efficient scans of entities by single property values in ascending and descending order, respectively. Each of an entity's indexed property values must be written to each of these indexes.

As an example, consider an entity with properties A, B, and C:

Key: 'Foo:1' (kind = 'Foo', id = 1, no parent)
A: 1, 2
B: null
C: 'this', 'that', 'theOther'

Assuming there are no composite indexes (see below) for entities of this kind, this entity requires 14 writes to store:

  • 1 for the entity itself
  • 1 for the EntitiesByKind index
  • 4 for property A (2 for each of two values)
  • 2 for property B (a null value still needs to be written)
  • 6 for property C (2 for each of three values)
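
The arithmetic above can be checked with a small plain-Java sketch. It assumes no composite indexes and takes the number of indexed values of each property as input; WriteCostSketch and writesWithoutCompositeIndexes are hypothetical names, not part of the SDK.

```java
// Illustrative calculation only; not part of the Datastore API.
class WriteCostSketch {
    // valueCounts holds the number of indexed values for each property.
    static int writesWithoutCompositeIndexes(int... valueCounts) {
        int writes = 2; // one for the entity, one for the EntitiesByKind index
        for (int count : valueCounts) {
            // Each value is written to EntitiesByProperty and EntitiesByPropertyDesc.
            writes += 2 * count;
        }
        return writes;
    }

    public static void main(String[] args) {
        // A has 2 values, B has 1 (null), C has 3.
        System.out.println(writesWithoutCompositeIndexes(2, 1, 3)); // 14
    }
}
```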

Composite indexes (those referring to multiple properties) require additional writes to maintain. Suppose you define the following composite index:

Kind: 'Foo'
A ▲, B ▼

where the triangles indicate the sort order for the specified properties: ascending for property A and descending for property B. Storing the entity defined above now takes an additional write to the composite index for every combination of A and B values:

  • (1, null)
  • (2, null)

This adds 2 writes for the composite index, for a total of 1 + 1 + 4 + 2 + 6 + 2 = 16. Now add property C to the index:

Kind: 'Foo'
A ▲, B ▼, C ▼

Storing the same entity now requires a write to the composite index for each possible combination of A, B, and C values:

  • (1, null, 'this')
  • (1, null, 'that')
  • (1, null, 'theOther')

  • (2, null, 'this')
  • (2, null, 'that')
  • (2, null, 'theOther')

This brings the total number of writes to 1 + 1 + 4 + 2 + 6 + 6 = 20.
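
The number of composite index entries in these examples is the product of the value counts of the indexed properties. The plain-Java sketch below reproduces both totals; it is only a calculation aid with hypothetical names, and it takes the base cost of 14 writes from the example above.

```java
// Illustrative calculation only; not part of the Datastore API.
class CompositeIndexSketch {
    // One index entry per combination of values, i.e. the product of the counts.
    static int entries(int... valueCounts) {
        int product = 1;
        for (int count : valueCounts) {
            product *= count;
        }
        return product;
    }

    public static void main(String[] args) {
        int baseWrites = 14; // entity, EntitiesByKind, and per-property index writes

        // Index on A (2 values) and B (1 value): 2 entries.
        System.out.println(baseWrites + entries(2, 1));    // 16

        // Index on A, B, and C (3 values): 6 entries.
        System.out.println(baseWrites + entries(2, 1, 3)); // 20
    }
}
```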

If an entity has many multiple-valued properties, or if a single such property is referenced many times in an index, the number of writes required to maintain the index can explode combinatorially. Such exploding indexes can be very expensive to maintain. For example, consider a composite index that includes ancestors:

Kind: 'Foo'
A ▲, B ▼, C ▼
Ancestor: True 

Storing a simple entity with this index present takes the same number of writes as before. However, if the entity has ancestors, it requires a write for each possible combination of property values and ancestors, in addition to those for the entity itself. Thus an entity defined as

Key: 'GreatGrandpa:1/Grandpa:1/Dad:1/Foo:1' (kind = 'Foo', id = 1, parent = 'GreatGrandpa:1/Grandpa:1/Dad:1')
A: 1, 2
B: null
C: 'this', 'that', 'theOther'

would require a write to the composite index for each of the following combinations of properties and ancestors:

  • (1, null, 'this', 'GreatGrandpa')
  • (1, null, 'this', 'Grandpa')
  • (1, null, 'this', 'Dad')
  • (1, null, 'this', 'Foo')

  • (1, null, 'that', 'GreatGrandpa')
  • (1, null, 'that', 'Grandpa')
  • (1, null, 'that', 'Dad')
  • (1, null, 'that', 'Foo')

  • (1, null, 'theOther', 'GreatGrandpa')
  • (1, null, 'theOther', 'Grandpa')
  • (1, null, 'theOther', 'Dad')
  • (1, null, 'theOther', 'Foo')

  • (2, null, 'this', 'GreatGrandpa')
  • (2, null, 'this', 'Grandpa')
  • (2, null, 'this', 'Dad')
  • (2, null, 'this', 'Foo')

  • (2, null, 'that', 'GreatGrandpa')
  • (2, null, 'that', 'Grandpa')
  • (2, null, 'that', 'Dad')
  • (2, null, 'that', 'Foo')

  • (2, null, 'theOther', 'GreatGrandpa')
  • (2, null, 'theOther', 'Grandpa')
  • (2, null, 'theOther', 'Dad')
  • (2, null, 'theOther', 'Foo')

Storing this entity in the Datastore now requires 1 + 1 + 4 + 2 + 6 + 24 = 38 writes.
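
The 24 combinations listed above can be generated mechanically, which makes the combinatorial growth easy to see. The following plain-Java sketch enumerates them for the example entity; it is a model of the counting rule only, with hypothetical class and method names, and it represents each value as the string shown in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative enumeration only; not part of the Datastore API.
class AncestorIndexSketch {
    // One entry per combination of A, B, and C values and path element.
    static List<String> indexEntries(List<String> a, List<String> b,
                                     List<String> c, List<String> pathKinds) {
        List<String> entries = new ArrayList<>();
        for (String av : a) {
            for (String bv : b) {
                for (String cv : c) {
                    for (String kind : pathKinds) {
                        entries.add("(" + av + ", " + bv + ", " + cv + ", '" + kind + "')");
                    }
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> entries = indexEntries(
            Arrays.asList("1", "2"),
            Arrays.asList("null"),
            Arrays.asList("'this'", "'that'", "'theOther'"),
            Arrays.asList("GreatGrandpa", "Grandpa", "Dad", "Foo"));

        for (String entry : entries) {
            System.out.println(entry);
        }
        // 14 base writes plus one write per composite index entry.
        System.out.println(14 + entries.size()); // 38
    }
}
```

Adding one more level of ancestry or one more value to any property multiplies the entry count, which is why such indexes grow so quickly.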