Google App Engine

Task Queue Go API Overview

With the Task Queue API, applications can perform work initiated by a user request, outside of that request. If an app needs to execute some background work, it may use the Task Queue API to organize that work into small, discrete units, called tasks. The app then inserts these tasks into one or more queues. App Engine automatically detects new tasks and executes them when system resources permit.

Using Task Queues in Go

A Go app sets up queues using a configuration file named queue.yaml. See Task Queue Configuration. If an app does not have a queue.yaml file, it has a queue named default with some default settings.

To create a task, set up a Task value. The Task consists of data for a request, including a URL path, parameters, HTTP headers, and an HTTP payload. It can also include the earliest time to execute the task (the default is as soon as possible) and a name for the task.

To enqueue a task, call the taskqueue.Add function. Once the task is added to a queue, the Task Queue service performs it as the queue is processed.

This example defines a task handler (worker) that increments a counter in the datastore, mapped to the URL /worker. It also defines a user-accessible request handler (handler) that displays the current value of the counter for a GET request, and for a POST request enqueues a task and returns. Note that the task in this example should run at a rate no greater than once per second.

package counter

import (
    "appengine"
    "appengine/datastore"
    "appengine/taskqueue"
    "http"
    "template"
)

func init() {
    http.HandleFunc("/", handler)
    http.HandleFunc("/worker", worker)
}

type Counter struct {
    Name  string
    Count int
}

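// handler serves "/". If the request carries a "name" form value, it
// enqueues a task for /worker, then renders the list of counters.
func handler(w http.ResponseWriter, r *http.Request) {
    c := appengine.NewContext(r)
    if name := r.FormValue("name"); name != "" {
        // Bundle the counter name as a POST parameter and add the task
        // to the default queue (the empty queue name).
        t := taskqueue.NewPOSTTask("/worker", map[string][]string{"name": {name}})
        if _, err := taskqueue.Add(c, t, ""); err != nil {
            http.Error(w, err.String(), http.StatusInternalServerError)
            return
        }
    }
    // Load every Counter entity and render it with the template.
    q := datastore.NewQuery("Counter")
    var counters []Counter
    if _, err := q.GetAll(c, &counters); err != nil {
        http.Error(w, err.String(), http.StatusInternalServerError)
        return
    }
    if err := handlerTemplate.Execute(w, counters); err != nil {
        http.Error(w, err.String(), http.StatusInternalServerError)
        return
    }
}

// worker is the task's web hook at /worker. It increments the counter
// named by the "name" parameter, creating the counter if necessary.
func worker(w http.ResponseWriter, r *http.Request) {
    c := appengine.NewContext(r)
    name := r.FormValue("name")
    key := datastore.NewKey(c, "Counter", name, 0, nil)
    var counter Counter
    if err := datastore.Get(c, key, &counter); err == datastore.ErrNoSuchEntity {
        // First increment for this name: start from a zero-valued counter.
        counter.Name = name
    } else if err != nil {
        // Log and return. No error status is written, so the task is not retried.
        c.Errorf("%v", err)
        return
    }
    counter.Count++
    if _, err := datastore.Put(c, key, &counter); err != nil {
        c.Errorf("%v", err)
    }
}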

var handlerTemplate = template.MustParse(handlerHTML, nil)

const handlerHTML = `
{.repeated section @}
<p>{Name}: {Count}</p>
{.end}
<p>Start a new counter:</p>
<form action="/" method="POST">
<input type="text" name="name">
<input type="submit" value="Add">
</form>
`

Note that this example is not idempotent. It is possible for the task queue to execute a task more than once. In this case, the counter is incremented each time the task is run, possibly skewing the results.
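One way to tolerate repeated execution is to record which tasks have already been applied and ignore repeats. The following is a minimal sketch of that idea, not part of the Task Queue API: it uses only the Go standard library, keeps its state in memory, and assumes each task carries a unique name. The counterStore type and its Apply and Count methods are hypothetical; a real app would keep the same record in the datastore.

```go
package main

import (
	"fmt"
	"sync"
)

// counterStore is a hypothetical in-memory stand-in for the datastore.
type counterStore struct {
	mu      sync.Mutex
	counts  map[string]int
	applied map[string]bool // task names whose increment has been applied
}

func newCounterStore() *counterStore {
	return &counterStore{counts: map[string]int{}, applied: map[string]bool{}}
}

// Apply increments the named counter at most once per task name and
// reports whether this call performed the increment.
func (s *counterStore) Apply(taskName, counter string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.applied[taskName] {
		return false // repeated execution of the same task: do nothing
	}
	s.applied[taskName] = true
	s.counts[counter]++
	return true
}

// Count returns the current value of the named counter.
func (s *counterStore) Count(counter string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[counter]
}

func main() {
	s := newCounterStore()
	s.Apply("task-1", "visits")
	s.Apply("task-1", "visits") // the same task delivered a second time
	s.Apply("task-2", "visits")
	fmt.Println(s.Count("visits")) // prints 2
}
```

The check and the increment happen under one lock so that two concurrent deliveries of the same task cannot both pass the check.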

The delay package provides a convenient way to use task queues from Go applications.

Task Concepts

In App Engine background processing, a task is a complete description of a small unit of work. This description consists of two parts:

  • A data payload which parameterizes the task.
  • Code which implements the task.

As an example, consider a calendaring application which needs to notify an invitee, via email, that an event has been updated. The particular 'email notification task' for this might be defined as:

  • Task - Email Notification
    • data: the email address and name of the invitee, along with a description of the event
    • code: function which substitutes the relevant strings into an email template and then sends the mail.

Perhaps there are multiple invitees who need to be updated for a given event. The developer may choose to create a new notification task for each attendee individually:

  • Task 1
    • data: invitee1's email address
    • code: email_function (prepares email contents and sends)
  • Task 2
    • data: invitee2's email address
    • code: email_function (prepares email contents and sends)
  • Task 3
    • data: invitee3's email address
    • code: email_function (prepares email contents and sends)
  • ...

As this example shows, multiple tasks may share the same code and differ only in their data payload. Similarly, multiple tasks may share the same data payload but reference different code, as in this example for an ecommerce site:

  • Task 1 - Send Receipt to buyer
    • data: an order_description object
    • code: email_buyer_receipt function
  • Task 2 - Initiate Transaction
    • data: an order_description object (same as Task 1)
    • code: charge_order function

More examples of tasks include:

  • Feed Aggregator. A feed reader application needs to fetch, automatically and periodically, the contents of various news feeds from across the Internet. A single task might consist of:
    • data: the URL of a feed and the timestamp when it was last checked
    • code: a function which performs a URL Fetch to retrieve a feed, parse its contents, and insert new items into the database

Tasks as Offline Web Hooks

For App Engine to support concrete task instances, two mechanisms are needed:

  • Data Storage: a language-agnostic container for arbitrary data
  • Code Reference: a language-agnostic mechanism for referencing arbitrary code, along with a means to pass in parameters.

Fortunately, the Internet provides such a solution already, in the form of an HTTP request and its response. The data payload is the content of the HTTP request, such as web form variables, XML, JSON, or encoded binary data. The code reference is the URL itself; the actual code is whatever logic the server executes in preparing the response.

Revisiting the calendar app example above, the tasks can be revised as:

  • Task 1
    • data: an HTTP POST message containing a form variable of email_address = attendee1's email
    • code: a URL which, when requested with the HTTP POST, executes code on the server side that sends the mail

Using this model, App Engine's Task Queue API allows you to specify tasks as HTTP Requests (both the contents of the request as its data, and the target URL of the request as its code reference). Programmatically referring to a bundled HTTP request in this fashion is sometimes called a "web hook."
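The web hook model can be illustrated without App Engine at all. The sketch below bundles a task as a URL path plus form parameters, turns it into an HTTP POST, and runs it against a handler in-process using net/http/httptest. The task type, the newTask, invoke, and newMux helpers, and the /notify URL are all hypothetical names chosen for illustration; they are not the Task Queue API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// task is a hypothetical bundle of a code reference (Path) and a data
// payload (Params).
type task struct {
	Path   string
	Params url.Values
}

// newTask builds a task with a single form parameter.
func newTask(path, key, value string) task {
	return task{Path: path, Params: url.Values{key: {value}}}
}

// invoke turns the task into an HTTP POST, runs it against h, and
// returns the response status code.
func invoke(h http.Handler, t task) int {
	body := strings.NewReader(t.Params.Encode())
	req := httptest.NewRequest(http.MethodPost, t.Path, body)
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

// newMux registers a worker at /notify that records each address it is
// asked to notify (standing in for actually sending mail).
func newMux(sent *[]string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/notify", func(w http.ResponseWriter, r *http.Request) {
		*sent = append(*sent, r.FormValue("email_address"))
	})
	return mux
}

func main() {
	var sent []string
	mux := newMux(&sent)
	for _, addr := range []string{"invitee1@example.com", "invitee2@example.com"} {
		status := invoke(mux, newTask("/notify", "email_address", addr))
		fmt.Println(addr, status)
	}
	fmt.Println(len(sent), "notifications handled")
}
```

Both tasks share the same code reference (/notify) and differ only in their payload, which mirrors the calendar example.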

Importantly, the offline nature of the Task Queue API allows you to specify web hooks ahead of time, without waiting for their actual execution. Thus, an application might create many web hooks at once and then hand them off to App Engine; the system will then process them asynchronously in the background (by 'invoking' the HTTP request). This web hook model enables efficient parallel processing. App Engine may invoke multiple tasks, or web hooks, simultaneously.

To summarize, the Task Queue API allows a developer to execute work in the background, asynchronously, by chunking that work into offline web hooks. The system invokes those web hooks on the application's behalf, scheduling for optimal performance by possibly executing multiple web hooks in parallel. This model of granular units of work, based on the HTTP standard, allows App Engine to efficiently perform background processing in a way that works with any programming language or web application framework.

Worker URLs and Task Names

As mentioned above, a task references its implementation via URL. For example, a task which fetches and parses an RSS feed might use a worker URL called /app_worker/fetch_feed. With the App Engine Task Queue API, you may use any URL as the worker for a task, so long as it is within your application; all task worker URLs must be specified as relative URLs.

t := taskqueue.NewPOSTTask("/path/to/worker", nil)
if _, err := taskqueue.Add(c, t, ""); err != nil {
	// handle error
}

In addition to a Task's contents (its data payload and worker URL), it is also possible to specify a Task's name. Once a Task with name N is written, any subsequent attempts to insert a Task named N will fail. While this is generally true, Task names do not provide an absolute guarantee of once-only semantics. In rare cases, multiple calls to create a task of the same name may succeed. It's also possible in exceptional cases for a task to run more than once even if it was only created once. If you need a guarantee of once-only semantics, use the datastore.
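The usual behavior of task names can be modeled with a small in-memory queue that rejects a name it has already seen. This is a simplified sketch using only the standard library: the namedQueue type and errNameTaken value are hypothetical, and the model deliberately ignores the rare cases described above in which a duplicate name is accepted.

```go
package main

import (
	"errors"
	"fmt"
)

// errNameTaken is a hypothetical error for a reused task name.
var errNameTaken = errors.New("task name already used")

// namedQueue is a hypothetical in-memory queue that remembers task names.
type namedQueue struct {
	seen  map[string]bool
	tasks []string
}

// Add appends a task with the given name, or fails if the name was
// already used.
func (q *namedQueue) Add(name string) error {
	if q.seen == nil {
		q.seen = map[string]bool{}
	}
	if q.seen[name] {
		return errNameTaken
	}
	q.seen[name] = true
	q.tasks = append(q.tasks, name)
	return nil
}

func main() {
	var q namedQueue
	fmt.Println(q.Add("send-receipt-42")) // first insert succeeds
	fmt.Println(q.Add("send-receipt-42")) // second insert is rejected
	fmt.Println(len(q.tasks))
}
```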

If a task is created successfully, it will eventually be deleted (at least seven days after the task successfully executes). Once deleted, its name can be reused.

Task Request Headers

Requests from the Task Queue service contain the following HTTP headers:

  • X-AppEngine-QueueName, the name of the queue (possibly default)
  • X-AppEngine-TaskName, the name of the task, or a system-generated unique ID if no name was specified
  • X-AppEngine-TaskRetryCount, the number of times this task has been retried; for the first attempt, this value is 0
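A worker can read these headers from the incoming request like any other HTTP header. The sketch below uses only net/http and strconv; the taskInfo type and the taskInfoFromRequest and newTaskRequest helpers are hypothetical names, and the request is built locally with httptest rather than sent by the Task Queue service.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// taskInfo holds the values of the three task headers.
type taskInfo struct {
	Queue      string
	Name       string
	RetryCount int
}

// taskInfoFromRequest extracts the task headers from r. It returns an
// error if the retry count header is missing or not a number.
func taskInfoFromRequest(r *http.Request) (taskInfo, error) {
	retries, err := strconv.Atoi(r.Header.Get("X-AppEngine-TaskRetryCount"))
	if err != nil {
		return taskInfo{}, fmt.Errorf("missing or invalid retry count: %v", err)
	}
	return taskInfo{
		Queue:      r.Header.Get("X-AppEngine-QueueName"),
		Name:       r.Header.Get("X-AppEngine-TaskName"),
		RetryCount: retries,
	}, nil
}

// newTaskRequest builds a local request that carries the task headers,
// for trying out the parser.
func newTaskRequest(queue, name, retries string) *http.Request {
	r := httptest.NewRequest(http.MethodPost, "/worker", nil)
	r.Header.Set("X-AppEngine-QueueName", queue)
	r.Header.Set("X-AppEngine-TaskName", name)
	r.Header.Set("X-AppEngine-TaskRetryCount", retries)
	return r
}

func main() {
	info, err := taskInfoFromRequest(newTaskRequest("default", "task-1", "0"))
	fmt.Println(info.Queue, info.Name, info.RetryCount, err)
}
```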

URLs for Tasks

You can prevent users from accessing URLs of tasks by restricting access to administrator accounts. Task queues can access admin-only URLs. You can restrict a URL by adding login: admin to the handler configuration in app.yaml.

An example might look like this in app.yaml:

application: hello-tasks
version: 1
runtime: go
api_version: 3

handlers:
- url: /tasks/process
  script: _go_app
  login: admin

Note: While task queues can use URL paths restricted with login: admin, they cannot use URL paths restricted with login: required.

For more information see Application Configuration: Requiring Login or Administrator Status.

To test a task web hook, sign in as an administrator and visit the URL of the handler in your browser.

Task Execution

Once you have created a task and inserted it into a queue for processing, App Engine executes it as soon as possible, unless you specify scheduling criteria. The lifetime of a single task's execution is limited to 10 minutes. If your task's execution nears the 10 minute limit, App Engine raises an exception, which you may catch and then save your work or log progress.

When the developer inserts a new task into a queue, the order in which that task executes (relative to other tasks) is governed by the contents and properties of that queue. However, it is possible to specify certain properties (such as an ETA) which request special scheduling on a per-task basis.

If a task fails to execute (by returning any HTTP status code outside of the range 200-299), App Engine retries until it succeeds. The system backs off gradually to avoid flooding your application with too many requests, but schedules retry attempts for failed tasks to recur at a maximum of once per hour.
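The success rule and a capped backoff can be sketched as follows. Only the 200-299 success range and the one-hour ceiling come from the text above; the one-second starting delay and the doubling schedule are assumptions made for illustration, since the actual schedule is not specified here. The sketch also reads "a maximum of once per hour" as a one-hour cap on the interval between retries. Both function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// succeeded reports whether an HTTP status counts as task success.
func succeeded(status int) bool {
	return status >= 200 && status <= 299
}

// retryDelay returns an assumed delay before retry number n (0-based):
// it starts at one second, doubles on each retry, and is capped at one hour.
func retryDelay(n int) time.Duration {
	d := time.Second
	for i := 0; i < n; i++ {
		d *= 2
		if d >= time.Hour {
			return time.Hour
		}
	}
	return d
}

func main() {
	fmt.Println(succeeded(204), succeeded(404))
	for _, n := range []int{0, 1, 5, 12, 50} {
		fmt.Println(n, retryDelay(n))
	}
}
```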

You can also configure your own scheme for task retries.

When implementing the code for tasks (as worker URLs within your app), it is important to consider whether the task is idempotent. App Engine's Task Queue API is designed to invoke a given task only once; however, in exceptional circumstances a task may execute multiple times (for example, in the unlikely case of major system failure). Thus, your code must ensure that repeated execution has no harmful side effects.

If a task performs sensitive operations (such as modifying important data), you may wish to secure the worker URL to prevent a malicious external user from calling it directly.

Deleting Tasks Programmatically

You can delete individual tasks from the Admin Console, or you can programmatically purge entire queues using the taskqueue.Purge function:

taskqueue.Purge(c, "foo")

It takes the backends about 20 seconds to notice that the queue has been purged. Tasks continue to execute during this time.

Queue Concepts

Thus far, this document has explained how tasks can be used to encapsulate small chunks of work for asynchronous execution. When used in large numbers, tasks provide an efficient and powerful tool for background processing. However, you might need to manage the execution of these tasks. In particular, you might need to control the rate at which tasks are invoked, so as not to exhaust resources.

The Task Queue API provides queues as a container for tasks. All new tasks must be inserted into a queue; you can influence task execution by modifying properties of the queue. As an example, you might want to ensure that your system sends no more than ten emails per second. You can accomplish this by using a queue called email-throttle, which you configure with a rate of ten invocations per second. Within your app's code, you make it so that all tasks which send email are inserted into this email-throttle queue. The App Engine system respects your configuration on the email-throttle queue and invokes its tasks at a rate of less than or equal to ten per second, even if thousands of tasks are inserted at a single instant.
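To see what a rate limit means for a burst of tasks, the sketch below computes the earliest start time of each task under a simplified model in which invocations are spaced evenly at the configured rate. This is an illustration only, using the standard library: the dispatchOffsets function is hypothetical, and the model does not claim to reproduce how App Engine actually schedules tasks.

```go
package main

import (
	"fmt"
	"time"
)

// dispatchOffsets returns, for each of n tasks enqueued at the same
// instant, the earliest offset at which it may start when the queue
// invokes at most ratePerSec tasks per second, evenly spaced.
func dispatchOffsets(n, ratePerSec int) []time.Duration {
	if n <= 0 || ratePerSec <= 0 {
		return nil
	}
	interval := time.Second / time.Duration(ratePerSec)
	offsets := make([]time.Duration, n)
	for i := range offsets {
		offsets[i] = time.Duration(i) * interval
	}
	return offsets
}

func main() {
	// 1000 email tasks inserted at once into a queue limited to 10 per second.
	offsets := dispatchOffsets(1000, 10)
	fmt.Println("first task starts after", offsets[0])
	fmt.Println("last task starts after", offsets[len(offsets)-1])
}
```

Under this model, a thousand tasks inserted at a single instant take roughly 100 seconds to drain at ten per second.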

Beyond influencing the rate of task execution, queues also determine the order in which tasks are consumed by the system (notwithstanding any special task scheduling parameters). Fundamentally, queues deliver a best-effort FIFO order (first in, first out). A developer creates new tasks and inserts them at the tail of the queue, and the system removes tasks from the head for execution. However, the system attempts to deliver the lowest latency possible for any given task via specially optimized notifications to the scheduler. Thus, when a queue has a large backlog of tasks, the system's scheduling may "jump" new tasks to the head of the queue.

Although a queue defines a general FIFO ordering, tasks are not executed entirely serially. Multiple tasks from a single queue may be executed simultaneously by the scheduler, so the usual locking and transaction semantics need to be observed for any work performed by a task.
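The need for locking can be shown with a small in-process analogy: many goroutines stand in for tasks that run at the same time and update one shared counter. This is a sketch using only the standard library, with hypothetical names (safeCounter, runTasks). In an App Engine app the shared state typically lives in the datastore, where a transaction plays the role that the mutex plays here.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCounter is a counter protected by a mutex.
type safeCounter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter while holding the lock.
func (c *safeCounter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count.
func (c *safeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// runTasks runs the given number of "tasks" concurrently, each of which
// increments c once, and waits for all of them to finish.
func runTasks(c *safeCounter, tasks int) {
	var wg sync.WaitGroup
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
}

func main() {
	c := &safeCounter{}
	runTasks(c, 100)
	fmt.Println(c.Value()) // prints 100
}
```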

It is important to understand that queues are a mechanism for manipulating tasks in aggregate; queues do not dictate the contents of any given task. It is possible for a single queue to contain many different types of tasks, which have varying data payloads and worker URLs.

The Default Queue

For convenience, App Engine provides a default queue to all applications. You can use this queue immediately without any additional configuration. This queue automatically has a throughput rate of five task invocations per second; however, you may configure its properties in the same fashion as any user-defined queue by adding configuration for a queue named default. Code may always insert new tasks into the default queue, but if you wish to disable execution of these tasks, you may do so by adding the queue to your configuration and lowering its rate to zero.

Queue Default URLs

You can specify the worker URL for a task when you create the Task value, for example as the first argument to taskqueue.NewPOSTTask. If you do not specify a worker URL, the task uses a default worker URL named after the queue:

/_ah/queue/queue_name

As an example, for a queue named email-worker-queue, you may implement a default request handler at /_ah/queue/email-worker-queue (within your application). Any new task inserted into the email-worker-queue which does not have a worker URL of its own (in other words, it's purely data with no code reference) is invoked using the URL of /_ah/queue/email-worker-queue.

A queue's default URL is used if, and only if, a task does not have a worker URL of its own. If a task does have its own worker URL, then it is only invoked at that worker URL, never another. (Once inserted into a queue, a task is immutable.)
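The fallback rule is small enough to state as a function. The sketch below is illustrative only; effectiveURL is a hypothetical helper, not part of the Task Queue API, and an empty string stands for "no worker URL specified".

```go
package main

import "fmt"

// effectiveURL returns the URL at which a task is invoked: its own
// worker URL if it has one, otherwise the queue's default URL.
func effectiveURL(queueName, workerURL string) string {
	if workerURL != "" {
		return workerURL
	}
	return "/_ah/queue/" + queueName
}

func main() {
	fmt.Println(effectiveURL("email-worker-queue", ""))
	fmt.Println(effectiveURL("email-worker-queue", "/tasks/send_mail"))
}
```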

Please note that if a task does not have a worker URL, then the task is invoked against the queue's default URL even if there is currently no handler defined for the queue's default URL! In this case, the system attempts to invoke the task with a nonexistent URL which fails with a 404 (this 404, along with the exact URL that was tried, will be available in your application's logs). Due to the failure state of this 404, the system saves the task and retries it until it is eventually successful. You can clear (or 'purge') tasks that can't complete successfully using the Administration Console.

Tasks and App Versions

All versions of an application share the same task queues. The version of the app used to perform a task depends on how the task was enqueued.

If the version of the app that enqueues a task is the default version (http://your_app_id.appspot.com or a Google Apps domain), the queue uses the default version of the app to perform the task, even if the default version has changed since the task was enqueued. In other words, if the app enqueues a task and the default version then changes, the queue uses the task's URL path with the new default version of the app to perform the task.

If the version of the app that enqueues a task is not the default version when the task is enqueued (such as http://3.latest.your_app_id.appspot.com/), the queue will use that version of the app to perform the task, regardless of which version is the default version.
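The version rules above reduce to one comparison, sketched below. The executingVersion function and the version identifiers "3" and "4" are hypothetical and exist only to make the two cases concrete.

```go
package main

import "fmt"

// executingVersion returns the app version that performs a task, given
// the version that enqueued it, the default version at enqueue time,
// and the default version at execution time.
func executingVersion(enqueuedBy, defaultAtEnqueue, defaultNow string) string {
	if enqueuedBy == defaultAtEnqueue {
		// Enqueued by the default version: follow whatever is default now.
		return defaultNow
	}
	// Enqueued by a non-default version: stay pinned to that version.
	return enqueuedBy
}

func main() {
	// Version 3 was the default when it enqueued the task; 4 is default now.
	fmt.Println(executingVersion("3", "3", "4"))
	// Version 3 was not the default when it enqueued the task.
	fmt.Println(executingVersion("3", "2", "4"))
}
```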

Task Queues in the Admin Console

You can manage task queues for an application using the Administration Console. You can use the Console to:

  • List queues and their configuration.
  • Inspect the tasks currently waiting to be executed.
  • Pause the execution of a queue.
  • Run individual tasks immediately. If you click the "Run Now" button next to a task, App Engine executes the task even if the queue is paused. Running tasks immediately is useful for diagnosing application errors.
  • Manually delete individual tasks or every task in a queue. This is useful if a task cannot be completed successfully and is stuck waiting to be retried.

To manage queues, sign in to the Administration Console and select "Task Queues." Note that the default queue only appears in the Console after the app has enqueued a task to it for the first time.

Task Queues and the Development Server

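When your app is running in the development server, you can examine its task queues from the development console:

http://localhost:8080/_ah/admin/taskqueue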

To execute tasks, select the queue by clicking on its name, then select the tasks to execute. To clear a queue without executing any tasks, click the "Purge Queue" button.

Quotas and Limits

Enqueuing a task counts toward the following quotas:

  • Task Queue Stored Task Count
  • Task Queue Stored Task Bytes

The Task Queue Stored Task Bytes quota is configurable. This quota counts toward your Stored Data (billable) quota.

Execution of a task counts toward the following quotas:

  • Requests
  • CPU Time
  • Incoming Bandwidth
  • Outgoing Bandwidth

The act of executing a task consumes bandwidth-related quotas for the request and response data, just as if the request handler were called by a remote client. When the task queue processes a task, the response data is discarded.

Once a task has been executed or deleted, the storage used by that task is reclaimed. The reclaiming of storage quota for tasks happens at regular intervals, and this may not be reflected in the storage quota immediately after the task is deleted.

For more information on quotas, see Quotas, and the "Quota Details" section of the Admin Console.

In addition to quotas, the following limits apply to the use of Task Queues:

Limit                                                        Amount
task object size                                             10 kilobytes
number of active queues (not including the default queue)    10 for free apps, 100 for billed apps
queue execution rate                                         50 task invocations per second per queue
maximum countdown/ETA for a task                             30 days from the current date and time
maximum number of tasks that can be added in a batch         100 tasks
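Because at most 100 tasks can be added in one batch, an app that creates many tasks at once needs to split them into groups first. The sketch below shows one way to do the splitting with the standard library only; the batches function is a hypothetical helper, tasks are represented by plain strings, and the step of actually adding each group to a queue is left out.

```go
package main

import "fmt"

// maxBatch is the batch limit from the table above.
const maxBatch = 100

// batches splits tasks into consecutive groups of at most size elements.
func batches(tasks []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for len(tasks) > 0 {
		n := size
		if len(tasks) < n {
			n = len(tasks)
		}
		out = append(out, tasks[:n])
		tasks = tasks[n:]
	}
	return out
}

func main() {
	tasks := make([]string, 250)
	for i := range tasks {
		tasks[i] = fmt.Sprintf("task-%d", i)
	}
	for i, b := range batches(tasks, maxBatch) {
		fmt.Println("batch", i, "has", len(b), "tasks")
	}
}
```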