Alon Levi
August 2011, updated November 2011
As part of App Engine's new pricing changes, we've updated the set of resources that are included in an application's usage report. We've eliminated CPU-Hours and moved to a system that accounts for the number of Instance Hours (Frontend and Backend) used and the number of API calls made, in addition to storage and bandwidth. For more information, please see our FAQ.
Before the new pricing went live, we released side-by-side bills that let you see how the new model would affect your bill. Now that the new pricing is live, the current usage reports reflect your usage under the new model. You can use the information in your usage report to identify which resource types your application might optimize. After you make any changes, you will see them reflected in subsequent usage reports.
In this article, we'll show you how to read the new usage report, and explain some of the strategies you can employ to manage your resources and how they might impact your application's performance.
Daily usage reports for App Engine can be found in the Admin Console on the Billing History page, located at https://appengine.google.com/billing/history?&app_id=$APP_ID. Clicking the [+] sign beside one of the reports expands that day's details so you can see both the old and new resources. Here's what a usage report might look like for your application:
Now we'll go through the line items in the new bill and explain what they mean, suggest some strategies you can use to manage resources, and explain what these strategies could mean for your application's performance.
The first two line items on the new bill deal with application instance usage. You can read about instances in our documentation. You can also see the number of instances serving your application in the Admin Console at https://appengine.google.com/instances?&app_id=$APP_ID, or by selecting the "Instances" graph from the dropdown on your application's Dashboard at https://appengine.google.com/dashboard?&app_id=$APP_ID.
App Engine uses a scheduling algorithm to decide how many instances are necessary to serve your application's current traffic. For each request your application receives, we decide whether to serve it with an available instance (either one that is idle or one that accepts concurrent requests), put the request in a pending request queue, or start a new instance for it. We make this decision by looking at your available instances, how quickly your application has been serving requests recently (its latency), and how long a new instance of your application takes to initialize and start serving requests. In most cases, when we think a new instance can serve a request more quickly than an existing one, we start up a new instance to serve the incoming request.
Of course, an application's traffic is never steady, so the scheduler also keeps track of the number of idle instances for your app. These idle instances can be useful for serving spikes in traffic without user-visible latency. If the scheduler determines that your application has too many idle instances, it will reclaim resources by tearing down some of the unused instances.
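The decision described above can be condensed into a small sketch. This is a highly simplified, hypothetical illustration of the trade-off, not App Engine's actual scheduling algorithm; the function name and parameters are our own invention:

```python
# Hypothetical sketch of the scheduling decision -- NOT App Engine's
# real algorithm. The scheduler weighs the cost of waiting for a busy
# instance against the cost of cold-starting a new one.
def route_request(available_instances, expected_wait, startup_time):
    """Pick how to serve a request: reuse, start fresh, or queue.

    expected_wait: estimated seconds until a busy instance frees up.
    startup_time: estimated seconds to initialize a new instance.
    """
    if available_instances:
        # An idle (or concurrent-capable) instance can take it now.
        return 'use_existing'
    if startup_time < expected_wait:
        # A fresh instance would likely answer sooner than waiting.
        return 'start_new'
    return 'enqueue_pending'
```

Note how a slow-to-start application (large `startup_time`) pushes the decision toward queueing, while a slow-to-respond one (large `expected_wait`) pushes it toward spinning up new instances, which is exactly why application latency drives instance count.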
Since the latency of your application has a huge impact on how many instances are required to handle your traffic, decreasing application latency can have a large effect on how many instances we use to serve your application. Here are a few things you can do to decrease application latency:
For example, if your Java application uses sessions, you can enable asynchronous session persistence by adding `<async-session-persistence enabled="true"/>` to your appengine-web.xml. Session data is always written synchronously to memcache, and if a request tries to read the session data when memcache is not available, it falls back to the datastore, which may not yet have the most recent update. This means there is a small risk your application will see stale session data, but for most applications the latency benefit far outweighs the risk.

In the Admin Console's Application Settings page, two sliders let you set some of the variables that the scheduler uses to manage your application's instances, giving you more control over the trade-off between your application's performance and its resource usage.
In our 1.4.3 release, we introduced the ability for your application's instances to serve multiple requests concurrently in Java. Enabling this setting will decrease the number of instances needed to serve traffic for your application, but your application must be threadsafe in order for this to work correctly. Read about how to enable concurrent requests in our Java documentation.
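Both of the Java settings mentioned in this section live in appengine-web.xml. Here is a minimal sketch of such a file; the application id and version are placeholders:

```xml
<!-- appengine-web.xml: enable concurrent requests and asynchronous
     session persistence. Your code must actually be thread-safe for
     threadsafe=true to be safe to enable. -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <application>your-app-id</application>
  <version>1</version>
  <threadsafe>true</threadsafe>
  <sessions-enabled>true</sessions-enabled>
  <async-session-persistence enabled="true"/>
</appengine-web-app>
```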
Note: All Go instances have concurrent requests enabled automatically.
Note: Multi-threading for Python will not be available until the launch of Python 2.7, which is on our roadmap. In Python 2.7, multithreaded instances can handle multiple requests at a time and do not have to idly consume Instance Hours while waiting for blocking API calls to return. Since Python does not currently support serving more than one request at a time per instance, and to allow all developers time to adjust to concurrent requests, we will be providing a 50% discount on Frontend Instance Hours until November 20, 2011. Python 2.7 is currently in the Trusted Tester phase.
In the Admin Console's Billing Settings page, you can specify a number of reserved instances. While this does not decrease the number of instances your application uses, it can reduce the amount you are billed for instance usage.
The default settings for the Task Queue are tuned for performance. With these defaults, when you put several tasks into a queue simultaneously, they will likely cause new Frontend Instances to spin up. Here are some suggestions for how to tune the Task Queue to conserve Instance Hours:
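As one example of such tuning, you can throttle a push queue in queue.yaml so queued tasks drain gradually instead of fanning out to new instances. The values below are illustrative, not recommendations:

```yaml
# queue.yaml: slow the default push queue so a burst of enqueued tasks
# is processed steadily rather than triggering extra frontend instances.
queue:
- name: default
  rate: 1/s        # execute at most ~1 task per second on average
  bucket_size: 1   # allow no burst above the steady rate
```

The trade-off is latency: tasks sit in the queue longer, which is usually acceptable for background work.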
Static content serving (Java, Python) is handled by specialized App Engine infrastructure, which does not consume Instance Hours.
If you need to set custom headers, use the Blobstore API (Java, Python, Go). The actual serving of the Blob response does not consume Instance Hours.
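For Python applications, for instance, static files declared in app.yaml are served by that specialized infrastructure. A minimal sketch, where the paths and expiration value are illustrative:

```yaml
# app.yaml fragment: /static is served by App Engine's static-file
# infrastructure, consuming no Instance Hours.
handlers:
- url: /static
  static_dir: static
  expiration: "7d"   # illustrative proxy/browser cache lifetime
```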
App Engine calculates storage costs based on the size of entities in the datastore, the size of datastore indexes, the size of tasks in the task queue, and the amount of data stored in Blobstore, so it pays to make sure you don't store more data than necessary.
Under the new model, we account for the number of operations performed in the Datastore, instead of the CPU consumed by those operations as under the old model. Here are a few strategies that can reduce Datastore resource consumption and lower the latency of requests to the datastore:
- Remove indexes your application no longer uses. You can review your application's indexes at https://appengine.google.com/datastore/indexes?&app_id=$APP_ID.
- Replace multiple individual get() calls with a single batch get().
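The batch-get savings come from round trips: each datastore call has fixed per-call overhead, so fetching N entities in one call is much cheaper than N separate calls. A runnable toy illustration of the accounting (FakeDatastore is hypothetical, not an App Engine API; the real batch call is db.get() with a list of keys):

```python
# Toy model: each call to get() counts as one round trip (RPC),
# whether it fetches one key or many.
class FakeDatastore:
    def __init__(self, data):
        self.data = data
        self.rpc_count = 0

    def get(self, keys):
        """Accept a single key or a list of keys, like db.get()."""
        self.rpc_count += 1
        if isinstance(keys, list):
            return [self.data.get(k) for k in keys]
        return self.data.get(keys)

ds = FakeDatastore({'a': 1, 'b': 2, 'c': 3})
individual = [ds.get(k) for k in ['a', 'b', 'c']]  # three round trips
batched = ds.get(['a', 'b', 'c'])                  # one round trip
```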
For Outgoing Bandwidth, one way to reduce usage is to set an appropriate Cache-Control header on your responses whenever possible, and to set reasonable expiration times (Java, Python) for static files. Using public Cache-Control headers in this way allows proxy servers and your clients' browsers to cache responses for the designated period of time.
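As a concrete sketch of the header itself, here is a plain WSGI handler (not App Engine-specific; the handler name and body are our own) that marks a response publicly cacheable for one hour:

```python
# Minimal WSGI app attaching a public Cache-Control header so proxies
# and browsers may cache the response instead of re-fetching it.
def cacheable_app(environ, start_response):
    headers = [
        ('Content-Type', 'text/plain'),
        ('Cache-Control', 'public, max-age=3600'),  # cache for one hour
    ]
    start_response('200 OK', headers)
    return [b'cacheable response']
```

Every request a downstream cache absorbs is outgoing bandwidth (and an instance request) your application never pays for.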
Incoming Bandwidth is more difficult to control, since that's the amount of data your users are sending to your app. However, this is a good opportunity to mention our DoS Protection Service for Python and Java, which allows you to block traffic from IPs that you consider abusive.
The last items on the report are usage for the Email, XMPP, and Channel APIs. For these APIs, your best bet is to make sure you are using them efficiently. One of the best ways to audit your usage of these APIs is with Appstats (Python, Java), which can show whether you're making more calls than necessary. It's also a good idea to check your error rates and watch for any invalid calls you might be making; in some cases you may be able to catch those calls early.