App Engine's scheduler is responsible for routing incoming requests to be served by your app's instances. Sometimes, the volume of incoming requests exceeds the capacity of the instances currently available to your app. When this happens, incoming requests may have to wait in the Pending Queue until busy instances become available, or until the scheduler starts new instances.
The scheduler is responsible for deciding how to serve your app's request load. Under regular conditions, it may spin up new idle instances to absorb traffic and minimize latency in the event of a sudden load spike. However, because new instances take time to create, unusually heavy surges of traffic may consume all available idle instances faster than the scheduler can create new ones. This can cause your users to experience delays (latency) in the serving of requests.
In some cases, you may want to optimize your application to minimize cost. In other cases, you may want to prime it to serve heavy request volume. The Administration Console's Performance controls allow you to balance this potential latency against the cost of maintaining additional idle instances to handle the load. These controls allow you to set the frontend instance class, the minimum and maximum number of idle instances, and the minimum and maximum pending latency.
The default settings enable App Engine's scheduling algorithm to scale the number of instances based on your recent request load and latency profile. If you use manual settings instead, you may need to adjust them continually as your request volume changes.
App Engine provides three different classes of frontend instances, each with different memory and CPU limits. These classes allow you to configure your frontend instance with the processing capacity you need to perform your work. Each class has a specific hourly billing rate. Please see Billable Quota Unit Costs for Billing.
Important: Currently, when you are billed for instance hours, you will not see any instance classes in your billing line items. Instead, you will see the appropriate multiple of instance hours. For example, if you use an F4 instance for one hour, you do not see "F4" listed, but you will see billing for four instance hours at the F1 rate.
The default class for frontends is F1, which gives you 128MB of memory and 600MHz of CPU capacity. You can change the class of the frontend using the performance settings in the admin console.
To set the Frontend Instance Class, choose one of the values in the dropdown menu. Each value represents a memory size and processing power; larger values provide extra performance at an increased cost. The value you select applies to all of the instances used by all versions of your app.
Note: Specifying the frontend instance class does not affect backend instances.
You can change the current frontend instance class for your app at any time. Python and Go apps automatically get the new instance class that you choose. A Java app must be restarted to get the new instance class.
Frontend instances are priced based on an hourly rate determined by the frontend class. The following table describes the cost for each class:
Frontend class | Memory limit | CPU limit | Cost per hour per instance |
---|---|---|---|
F1 (default) | 128MB | 600MHz | $0.08 |
F2 | 256MB | 1.2GHz | $0.16 |
F4 | 512MB | 2.4GHz | $0.32 |
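As a rough illustration of how the hourly rates relate to the F1-equivalent instance hours that appear on a bill, here is a minimal Python sketch. The rates are copied from the table; the function names `billed_f1_hours` and `cost` are hypothetical helpers for this example and are not part of any App Engine API.

```python
from decimal import Decimal

# Hourly rates in USD per instance, copied from the table in this section.
HOURLY_RATE = {
    "F1": Decimal("0.08"),
    "F2": Decimal("0.16"),
    "F4": Decimal("0.32"),
}


def billed_f1_hours(instance_class, hours):
    """Return the F1-equivalent instance hours shown on the bill.

    Hypothetical helper: the multiplier is derived from the ratio of the
    class rate to the F1 rate, which matches the F4 example in the text.
    """
    multiplier = HOURLY_RATE[instance_class] / HOURLY_RATE["F1"]
    return multiplier * Decimal(str(hours))


def cost(instance_class, hours):
    """Return the charge in USD for running one instance for `hours`."""
    return HOURLY_RATE[instance_class] * Decimal(str(hours))


if __name__ == "__main__":
    print(billed_f1_hours("F4", 1))
    print(cost("F4", 1))
```

Using `Decimal` rather than floats keeps the currency arithmetic exact for these rates.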
The Idle Instances sliders control the minimum and maximum number of idle instances available to your application at any given time.
The upper slider sets the minimum number of idle instances.
Note: In order to specify the minimum number of idle instances, you must have a paid app.
Note: If you set a minimum number of idle instances, the pending latency slider will have less effect on your application's performance. Because App Engine keeps idle instances in reserve, it is unlikely that requests will enter the pending queue except in exceptionally high load spikes. You will need to test your application and expected traffic volume to determine the ideal number of instances to keep in reserve.
The lower slider controls the maximum number of idle instances (up to 100).
Note: When settling back to normal levels after a load spike, the number of idle instances may temporarily exceed your specified maximum. However, you will not be charged for more instances than the maximum number you've specified.
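The billing cap described in the note above can be sketched as a one-line rule. This is only an illustration of the stated behavior, not the actual billing logic; `billable_idle_instances` is a hypothetical name.

```python
def billable_idle_instances(running_idle, max_idle):
    """Return the number of idle instances that count toward billing.

    Assumption for illustration: the charge is capped at the configured
    maximum, even if more idle instances are temporarily running.
    """
    if running_idle < 0 or max_idle < 0:
        raise ValueError("instance counts must be non-negative")
    return min(running_idle, max_idle)


if __name__ == "__main__":
    # 12 idle instances are still running after a spike, but the maximum is 5.
    print(billable_idle_instances(12, 5))
```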
Note: In order to specify the maximum pending latency, you must have a paid app.
Pending request latency arises when all of your application's available instances are too busy to serve new requests. When this happens, incoming requests go to a pending request queue. The scheduler automatically starts new instances for pending requests, but you can also specify minimum and maximum latency settings manually. These settings control how long a request waits in the pending queue when there are no available instances: no less than the minimum, and no more than the maximum.
The App Engine scheduler never creates a new instance for a request that has been pending for less than the specified minimum latency. Once the minimum is reached, an instance may be started to serve the request at any time. If the request is still pending when the specified maximum latency is reached, App Engine immediately starts a new instance to serve it.
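The rule in the preceding paragraph can be summarized as a three-way decision. The following Python sketch only models that description; it is not the scheduler's implementation, and the names `Action` and `pending_action` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    KEEP_WAITING = "keep waiting"          # below the minimum: no new instance
    MAY_START = "may start an instance"    # between minimum and maximum
    MUST_START = "start an instance now"   # maximum reached


def pending_action(pending_ms, min_latency_ms, max_latency_ms):
    """Classify a pending request by how long it has waited, in milliseconds."""
    if pending_ms < min_latency_ms:
        return Action.KEEP_WAITING
    if pending_ms >= max_latency_ms:
        return Action.MUST_START
    return Action.MAY_START


if __name__ == "__main__":
    # Example settings: minimum 100 ms, maximum 2 seconds.
    for waited in (50, 100, 1500, 2000):
        print(waited, pending_action(waited, 100, 2000).value)
```

In the middle band the scheduler may or may not start an instance, so the sketch reports only that a start is permitted, not that one happens.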
Note: If you set a minimum number of idle instances, the Pending Latency controls will have little or no effect on your app's performance. See Minimum Idle Instances for more information.
The upper Pending Latency slider sets the minimum period of time (at least 10 milliseconds) that a request must wait in the pending queue before a new instance can be started to serve it. (If an idle instance is available, the request is served immediately.)
The lower slider sets the maximum period of time (at most 15 seconds) a pending request will wait in the queue before a new instance is started to serve it.
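The slider limits mentioned above (a minimum setting of at least 10 milliseconds and a maximum setting of at most 15 seconds) can be checked with a small helper before you pick values to try. This is a sketch under stated assumptions: `check_pending_latency` is a hypothetical function, and the requirement that the minimum not exceed the maximum is an assumption made for this example rather than something the text states.

```python
MIN_SETTING_FLOOR_MS = 10        # lowest allowed minimum pending latency
MAX_SETTING_CEILING_MS = 15_000  # highest allowed maximum pending latency


def check_pending_latency(min_ms, max_ms):
    """Return a list of problems with the chosen settings (empty if none)."""
    problems = []
    if min_ms < MIN_SETTING_FLOOR_MS:
        problems.append("minimum pending latency is below 10 ms")
    if max_ms > MAX_SETTING_CEILING_MS:
        problems.append("maximum pending latency is above 15 s")
    # Assumption for this sketch: the minimum should not exceed the maximum.
    if min_ms > max_ms:
        problems.append("minimum pending latency exceeds the maximum")
    return problems


if __name__ == "__main__":
    print(check_pending_latency(10, 15_000))
    print(check_pending_latency(5, 20_000))
```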