Google App Engine

Using the Datastore

Storing data in a scalable web application can be tricky. A user could be interacting with any of dozens of web servers at a given time, and the user's next request could go to a different web server than the one that handled the previous request. All web servers need to interact with data that is itself spread out across dozens of machines, possibly in different locations around the world.

With Google App Engine, you don't have to worry about any of that. App Engine's infrastructure takes care of all of the distribution, replication and load balancing of data behind a simple API—and you get a powerful query engine and transactions as well.

The default datastore for an application is now the High Replication datastore. This datastore uses the Paxos algorithm to replicate data across datacenters. The High Replication datastore is extremely resilient in the face of catastrophic failure.

One consequence of this design is that the datastore's consistency guarantees may differ from what you are familiar with. They also differ slightly from those of the Master/Slave datastore, the other datastore option that App Engine offers. In the example code comments, we highlight some ways this might affect the design of your app. For more detailed information, see Using the High Replication Datastore.

The datastore writes data in objects known as entities, and each entity has a key that identifies the entity. Entities can belong to the same entity group, which allows you to perform a single transaction with multiple entities. Entity groups have a parent key that identifies the entire entity group.

In the High Replication Datastore, entity groups are also a unit of consistency. Queries over multiple entity groups may return stale, eventually consistent results. Queries over a single entity group return up-to-date, strongly consistent results. Queries over a single entity group are called ancestor queries, because they are given the entity group's parent key rather than a specific entity's key.
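To make the relationship between keys, parents, and entity groups concrete, here is a minimal sketch in plain Java. ToyKey and its methods are hypothetical stand-ins written for this illustration; they are not part of the App Engine API and only model the idea that a key's root ancestor identifies its entity group.

```java
import java.util.Objects;

final class EntityGroupSketch {
    /** Toy stand-in for a datastore key: a kind, a name, and an optional parent. */
    static final class ToyKey {
        final String kind;
        final String name;
        final ToyKey parent; // null for a root key

        ToyKey(String kind, String name, ToyKey parent) {
            this.kind = kind;
            this.name = name;
            this.parent = parent;
        }

        /** Walks up the parent chain to the root key of the entity group. */
        ToyKey root() {
            ToyKey k = this;
            while (k.parent != null) {
                k = k.parent;
            }
            return k;
        }

        /** True if the given key is this key or appears in its parent chain. */
        boolean hasAncestor(ToyKey ancestor) {
            for (ToyKey k = this; k != null; k = k.parent) {
                if (k.equals(ancestor)) {
                    return true;
                }
            }
            return false;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToyKey)) {
                return false;
            }
            ToyKey other = (ToyKey) o;
            return kind.equals(other.kind)
                    && name.equals(other.name)
                    && Objects.equals(parent, other.parent);
        }

        @Override
        public int hashCode() {
            return Objects.hash(kind, name, parent);
        }
    }

    /** Two keys are in the same entity group when they share a root key. */
    static boolean sameEntityGroup(ToyKey a, ToyKey b) {
        return a.root().equals(b.root());
    }

    public static void main(String[] args) {
        ToyKey guestbook = new ToyKey("Guestbook", "default", null);
        ToyKey greeting = new ToyKey("Greeting", "g1", guestbook);
        System.out.println(sameEntityGroup(greeting, guestbook));
        System.out.println(greeting.hasAncestor(guestbook));
    }
}
```

In this model, an ancestor query over a guestbook corresponds to keeping only the entities whose key satisfies hasAncestor(guestbookKey).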

The code samples in this guide organize like entities into entity groups, and use ancestor queries on those entity groups to return strongly consistent results.

Note: If you built your application using an earlier version of this Getting Started Guide, please note that the sample application has changed. You can still find the sample code for the original Guestbook application, which does not use ancestor queries, in the demos directory of the SDK.

The App Engine datastore is one of several services provided by App Engine with two APIs: a standard API and a low-level API. By using the standard APIs, you make it easier to port your application to other hosting environments and other database technologies, if you ever need to. Standard APIs "decouple" your application from the App Engine services. App Engine services also provide low-level APIs that expose the service capabilities directly. You can use the low-level APIs to implement new adapter interfaces, or just use the APIs directly in your app.

App Engine includes support for two different API standards for the datastore: Java Data Objects (JDO) and Java Persistence API (JPA). These interfaces are provided by DataNucleus Access Platform, an open source implementation of several Java persistence standards, with an adapter for the App Engine datastore.

To keep things clear while getting started, we'll use the low-level API to store and retrieve messages left by users.

Updating Our Servlet to Store Data

Here is an updated version of src/SignGuestbookServlet.java that stores greetings in the datastore. We discuss the changes below.

package guestbook;

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.users.User;
import com.google.appengine.api.users.UserService;
import com.google.appengine.api.users.UserServiceFactory;

import java.io.IOException;
import java.util.Date;
import java.util.logging.Logger;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SignGuestbookServlet extends HttpServlet {
    private static final Logger log =
            Logger.getLogger(SignGuestbookServlet.class.getName());

    public void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        UserService userService = UserServiceFactory.getUserService();
        User user = userService.getCurrentUser();

        // We have one entity group per Guestbook with all Greetings residing
        // in the same entity group as the Guestbook to which they belong.
        // This lets us run an ancestor query to retrieve all Greetings for a
        // given Guestbook. However, the write rate to each Guestbook should be
        // limited to ~1/second.
        String guestbookName = req.getParameter("guestbookName");
        Key guestbookKey = KeyFactory.createKey("Guestbook", guestbookName);
        String content = req.getParameter("content");
        Date date = new Date();
        Entity greeting = new Entity("Greeting", guestbookKey);
        greeting.setProperty("user", user);
        greeting.setProperty("date", date);
        greeting.setProperty("content", content);

        DatastoreService datastore =
                DatastoreServiceFactory.getDatastoreService();
        datastore.put(greeting);

        resp.sendRedirect("/guestbook.jsp?guestbookName="
                + guestbookName);
    }
}
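The redirect at the end of doPost concatenates guestbookName into the URL as-is, so a name containing spaces, '&', or '=' could produce a malformed query string. A minimal sketch of one way to build the redirect target safely, using only the standard library, follows. The class and method names (GuestbookRedirect, redirectUrl) are hypothetical and not part of the sample application.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class GuestbookRedirect {
    /** Builds the redirect target with the guestbook name encoded for use in a query string. */
    static String redirectUrl(String guestbookName) {
        try {
            return "/guestbook.jsp?guestbookName="
                    + URLEncoder.encode(guestbookName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "/guestbook.jsp?guestbookName=my+book%26x%3D1"
        System.out.println(redirectUrl("my book&x=1"));
    }
}
```

In the servlet, the result would be passed to resp.sendRedirect(...) in place of the string concatenation.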

Storing the Submitted Greetings

The low-level Datastore API for Java provides a schemaless interface for creating and storing entities. The low-level API does not require entities of the same kind to have the same properties, nor for a given property to have the same type for different entities. The following code snippet constructs the Greeting entity in the same entity group as the guestbook to which it belongs:

        Entity greeting = new Entity("Greeting", guestbookKey);
        greeting.setProperty("user", user);
        greeting.setProperty("date", date);
        greeting.setProperty("content", content);

In our example, each Greeting stores the posted content, the user who posted it, and the date on which the post was submitted. When initializing the entity, we supply the entity kind, Greeting, as well as a guestbookKey argument that sets the parent of the entity we are storing. Objects in the datastore that share a common ancestor belong to the same entity group.

After we construct the entity, we instantiate the datastore service, and put the entity in the datastore:

        DatastoreService datastore =
                DatastoreServiceFactory.getDatastoreService();
        datastore.put(greeting);

Because queries in the High Replication datastore are strongly consistent only within an entity group, we place all Greetings for a guestbook in the same entity group by giving each Greeting the same parent. This means a user always sees a Greeting immediately after it is written. However, writes to a single entity group are limited to about one per second, so keep this limit in mind when you design a real application. Services such as Memcache can reduce the chance that a user sees stale results when querying across entity groups immediately after a write.

Updating the JSP

We also need to modify the JSP we wrote earlier to display Greetings from the datastore, and also include a form for submitting Greetings. Here is our updated guestbook.jsp:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<%@ page import="java.util.List" %>
<%@ page import="com.google.appengine.api.users.User" %>
<%@ page import="com.google.appengine.api.users.UserService" %>
<%@ page import="com.google.appengine.api.users.UserServiceFactory" %>
<%@ page import="com.google.appengine.api.datastore.DatastoreServiceFactory" %>
<%@ page import="com.google.appengine.api.datastore.DatastoreService" %>
<%@ page import="com.google.appengine.api.datastore.Query" %>
<%@ page import="com.google.appengine.api.datastore.Entity" %>
<%@ page import="com.google.appengine.api.datastore.FetchOptions" %>
<%@ page import="com.google.appengine.api.datastore.Key" %>
<%@ page import="com.google.appengine.api.datastore.KeyFactory" %>

<html>
  <head>
    <link type="text/css" rel="stylesheet" href="/stylesheets/main.css" />
  </head>

  <body>

<%
    String guestbookName = request.getParameter("guestbookName");
    if (guestbookName == null) {
        guestbookName = "default";
    }
    UserService userService = UserServiceFactory.getUserService();
    User user = userService.getCurrentUser();
    if (user != null) {
%>
<p>Hello, <%= user.getNickname() %>! (You can
<a href="<%= userService.createLogoutURL(request.getRequestURI()) %>">sign out</a>.)</p>
<%
    } else {
%>
<p>Hello!
<a href="<%= userService.createLoginURL(request.getRequestURI()) %>">Sign in</a>
to include your name with greetings you post.</p>
<%
    }
%>

<%
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
    Key guestbookKey = KeyFactory.createKey("Guestbook", guestbookName);
    // Run an ancestor query to ensure we see the most up-to-date
    // view of the Greetings belonging to the selected Guestbook.
    Query query = new Query("Greeting", guestbookKey).addSort("date", Query.SortDirection.DESCENDING);
    List<Entity> greetings = datastore.prepare(query).asList(FetchOptions.Builder.withLimit(5));
    if (greetings.isEmpty()) {
        %>
        <p>Guestbook '<%= guestbookName %>' has no messages.</p>
        <%
    } else {
        %>
        <p>Messages in Guestbook '<%= guestbookName %>'.</p>
        <%
        for (Entity greeting : greetings) {
            if (greeting.getProperty("user") == null) {
                %>
                <p>An anonymous person wrote:</p>
                <%
            } else {
                %>
                <p><b><%= ((User) greeting.getProperty("user")).getNickname() %></b> wrote:</p>
                <%
            }
            %>
            <blockquote><%= greeting.getProperty("content") %></blockquote>
            <%
        }
    }
%>

    <form action="/sign" method="post">
      <div><textarea name="content" rows="3" cols="60"></textarea></div>
      <div><input type="submit" value="Post Greeting" /></div>
      <input type="hidden" name="guestbookName" value="<%= guestbookName %>"/>
    </form>

  </body>
</html>
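The JSP above writes each greeting's content into the page exactly as it was submitted, so a greeting containing HTML markup would be interpreted by the browser. As a hedged sketch, assuming you want such content displayed as literal text, a small escaping helper could look like the following. HtmlEscape and escapeHtml are hypothetical names for this illustration, not part of the sample application or the App Engine SDK.

```java
final class HtmlEscape {
    /** Replaces characters that are significant in HTML with character references. */
    static String escapeHtml(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<b>hi</b> & bye"));
    }
}
```

The helper would be applied to the value returned by greeting.getProperty("content"), and to guestbookName, before either is written into the page.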

Retrieving the Stored Greetings

The low-level Java API provides a Query class for constructing queries and a PreparedQuery class for fetching and returning the entities that match the query from the datastore. The code that fetches the data is here:

    Query query = new Query("Greeting", guestbookKey).addSort("date", Query.SortDirection.DESCENDING);
    List<Entity> greetings = datastore.prepare(query).asList(FetchOptions.Builder.withLimit(5));

This code creates a new query on the Greeting kind, and sets guestbookKey as the required ancestor of every entity the query returns. We also sort on the date property, returning the newest Greeting first.

After you construct the query, you prepare it and fetch the results as a list of Entity objects, here limited to the five most recent. For a description of the Query and PreparedQuery interfaces, see the Datastore reference.
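To clarify what the sort order and the fetch limit ask for, here is an in-memory equivalent in plain Java. It is only a sketch of the query's meaning, not of how the datastore executes it. ToyGreeting and newest are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

final class NewestFirstSketch {
    /** Toy stand-in for a Greeting entity with only the fields used here. */
    static final class ToyGreeting {
        final String content;
        final Date date;

        ToyGreeting(String content, Date date) {
            this.content = content;
            this.date = date;
        }
    }

    /** Returns at most 'limit' greetings, ordered by date, newest first. */
    static List<ToyGreeting> newest(List<ToyGreeting> all, int limit) {
        List<ToyGreeting> sorted = new ArrayList<>(all); // leave the input untouched
        sorted.sort(Comparator.comparing((ToyGreeting g) -> g.date).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) {
        List<ToyGreeting> all = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            all.add(new ToyGreeting("greeting " + i, new Date(1000L * i)));
        }
        for (ToyGreeting g : newest(all, 5)) {
            System.out.println(g.content);
        }
    }
}
```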

A Word About Datastore Indexes

Every query in the App Engine datastore is computed from one or more indexes. Indexes are tables that map ordered property values to entity keys. This is how App Engine is able to serve results quickly regardless of the size of your application's datastore. Many queries can be computed from the built-in indexes, but the datastore requires you to specify a custom index for some more complex queries. Without a custom index, the datastore can't execute these queries efficiently.
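The idea of an index as a table of ordered property values mapped to entity keys can be sketched with a sorted map. This is a deliberately simplified model, not the datastore's actual index layout: ToyIndex is a hypothetical class, and it assumes at most one entity per date value. It shows why a "newest first, limit N" query only needs to read the first N rows of an index that is already sorted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class ToyIndex {
    // Index rows kept in sorted order: date value (millis) -> entity key.
    private final TreeMap<Long, String> rows = new TreeMap<>();

    /** Adds or updates the index row for an entity when it is written. */
    void put(long dateMillis, String entityKey) {
        rows.put(dateMillis, entityKey);
    }

    /** Reads the newest 'limit' keys by scanning the sorted rows from the end. */
    List<String> newestKeys(int limit) {
        List<String> keys = new ArrayList<>();
        for (String key : rows.descendingMap().values()) {
            if (keys.size() == limit) {
                break; // stop early: no need to touch the remaining rows
            }
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.put(3000L, "Greeting-c");
        index.put(1000L, "Greeting-a");
        index.put(2000L, "Greeting-b");
        System.out.println(index.newestKeys(2));
    }
}
```

The sorting cost is paid when each row is written, so the work done at query time depends on the number of results requested rather than on the total number of entities.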

Our guestbook example above filters results by ancestor and orders them by date, so it combines an ancestor query with a sort order. This query requires a custom index to be specified in your application's datastore-indexes.xml file. When you run your application in the SDK, the SDK automatically adds an entry to this file. When you upload your application, the custom index definition is uploaded automatically, too. The entry for this query looks like this:

<?xml version="1.0" encoding="utf-8"?>
<datastore-indexes
  autoGenerate="true">
    <datastore-index kind="Greeting" ancestor="true">
        <property name="date" direction="desc" />
    </datastore-index>
</datastore-indexes>

You can read all about datastore indexes in the Introduction to Indexes section.

Clearing the Datastore

The development web server stores a local version of the datastore in local files so that you can test your application. The data persists as long as these files exist, and the web server does not reset them unless you ask it to.

The datastore file is named local_db.bin, and it is created in the WEB-INF/appengine-generated/ directory of your application's WAR directory. To clear the datastore, delete this file.
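If you clear the local datastore often, the deletion can be scripted. The following sketch uses only the standard library. The default WAR directory name "war" is an assumption, so pass your project's actual path as the first argument; ClearLocalDatastore is a hypothetical helper and not part of the SDK. It is probably safest to stop the development server before running it, though that too is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ClearLocalDatastore {
    /** Deletes WEB-INF/appengine-generated/local_db.bin under warDir; returns true if a file was removed. */
    static boolean clear(Path warDir) throws IOException {
        Path dbFile = warDir.resolve("WEB-INF")
                .resolve("appengine-generated")
                .resolve("local_db.bin");
        return Files.deleteIfExists(dbFile);
    }

    public static void main(String[] args) throws IOException {
        // "war" is an assumed default; pass the real WAR directory as the first argument.
        Path warDir = Paths.get(args.length > 0 ? args[0] : "war");
        System.out.println(clear(warDir)
                ? "Deleted local datastore file."
                : "No local datastore file found.");
    }
}
```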

Next...

Every web application returns dynamically generated HTML from the application code, via templates or some other mechanism. Most web applications also need to serve static content, such as images, CSS stylesheets, or JavaScript files. For efficiency, App Engine treats static files differently from application source and data files. Let's create a CSS stylesheet for this application, as a static file.

Continue to Using Static Files.