App Engine applications can communicate with other applications or access other resources on the web by fetching URLs. An app can use the URL Fetch service to issue HTTP and HTTPS requests and receive responses. The URL Fetch service uses Google's network infrastructure for efficiency and scaling purposes.
You can use the Python standard libraries urllib, urllib2, or httplib to make HTTP requests. When running in App Engine, these libraries perform HTTP requests using App Engine's URL Fetch service, which runs on Google's scalable HTTP request infrastructure.
    import urllib2

    url = "http://www.google.com/"
    try:
        result = urllib2.urlopen(url)
        doSomethingWithResult(result)
    except urllib2.URLError, e:
        handleError(e)
You can also access the URL Fetch service using its Python API. In this API, the urlfetch.fetch() function performs an HTTP request.
    from google.appengine.api import urlfetch

    url = "http://www.google.com/"
    result = urlfetch.fetch(url)
    if result.status_code == 200:
        doSomethingWithResult(result.content)
The URL Fetch service supports five HTTP methods: GET, POST, HEAD, PUT, and DELETE. The request can include HTTP headers, and body content for a POST or PUT request. For example, to submit data to a web form handler with a POST request using the URL Fetch API:
    import urllib

    from google.appengine.api import urlfetch

    # `url` must already hold the address of the form handler;
    # it is not defined in this snippet.
    form_fields = {
        "first_name": "Albert",
        "last_name": "Johnson",
        "email_address": "Albert.Johnson@example.com"
    }
    form_data = urllib.urlencode(form_fields)
    result = urlfetch.fetch(url=url,
                            payload=form_data,
                            method=urlfetch.POST,
                            headers={'Content-Type': 'application/x-www-form-urlencoded'})
An app can fetch a URL using HTTP (normal) or HTTPS (secure). The URL specifies the scheme to use: http://... or https://...
The URL to be fetched can use any port number in the following ranges: 80-90, 440-450, 1024-65535. If the port is not mentioned in the URL, the port is implied by the scheme: http://... is port 80, https://... is port 443.
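The port rule can be checked before a fetch is attempted. The following is a minimal sketch using only the standard library; the function names `effective_port` and `is_fetchable_port` are illustrative and not part of the URL Fetch API, and the ranges are copied from the paragraph above and treated as inclusive.

```python
try:
    from urlparse import urlparse  # Python 2
except ImportError:
    from urllib.parse import urlparse  # Python 3

# Port ranges quoted in the text, treated as inclusive.
ALLOWED_PORT_RANGES = [(80, 90), (440, 450), (1024, 65535)]
# Ports implied by the scheme when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a fetch of `url` would use, or None for an unsupported scheme."""
    parts = urlparse(url)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        return None
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[scheme]


def is_fetchable_port(url):
    """Return True if the URL's effective port falls in one of the allowed ranges."""
    port = effective_port(url)
    if port is None:
        return False
    return any(low <= port <= high for low, high in ALLOWED_PORT_RANGES)
```

A check like this is useful mainly when the URL comes from user input, so the app can reject it with a clear message instead of waiting for the fetch to fail.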
The fetch can use any of the following HTTP methods: GET (common for requesting web pages and data), POST (common for submitting web forms), PUT, HEAD, and DELETE. The fetch can include HTTP request headers and a payload (an HTTP request body).
The URL Fetch service uses an HTTP/1.1 compliant proxy to fetch the result.
To prevent an app from causing an endless recursion of requests, a request handler is not allowed to fetch its own URL. It is still possible to cause an endless recursion by other means, so exercise caution if your app can be made to fetch URLs supplied by the user.
You can set a deadline for a request: the maximum amount of time the service will wait for a response. By default, the deadline for a fetch is 5 seconds. The maximum deadline is 60 seconds for fetches made from request handlers and 10 minutes for fetches made from task queue and cron job handlers.
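As an illustration, the sketch below clamps a requested deadline to the limits quoted above and passes it to urlfetch.fetch() through its `deadline` argument. The helper names `clamp_deadline` and `fetch_with_deadline`, and the `offline` and `fetch` parameters, are assumptions made for this example; the `fetch` parameter exists only so the helper can be exercised outside App Engine.

```python
# Limits quoted in the text, in seconds.
DEFAULT_DEADLINE = 5
MAX_DEADLINE_REQUEST = 60        # request handlers
MAX_DEADLINE_OFFLINE = 10 * 60   # task queue and cron job handlers


def clamp_deadline(seconds=None, offline=False):
    """Return a deadline no larger than the documented maximum."""
    if seconds is None:
        return DEFAULT_DEADLINE
    if seconds <= 0:
        raise ValueError("deadline must be positive")
    limit = MAX_DEADLINE_OFFLINE if offline else MAX_DEADLINE_REQUEST
    return min(seconds, limit)


def fetch_with_deadline(url, seconds=None, offline=False, fetch=None):
    """Fetch `url`, waiting at most the clamped number of seconds."""
    if fetch is None:
        # Only importable inside the App Engine Python runtime or SDK.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch
    return fetch(url, deadline=clamp_deadline(seconds, offline))
```

Inside a request handler, `fetch_with_deadline(url, 30)` would wait up to 30 seconds; a task queue handler could pass `offline=True` to allow longer waits.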
An app can fetch a URL over HTTPS to connect to secure servers. Request and response data are transmitted over the network in encrypted form.
In the Python API, the proxy does not validate the host it is contacting by default, so it cannot detect "man in the middle" attacks between App Engine and the remote host when using HTTPS. However, you can pass the optional validate_certificate argument to the fetch() function to enable host validation. The urllib module currently provides no way to validate hosts, but will default to host validation in the near future.
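A minimal sketch of a fetch with host validation turned on follows. The wrapper name `fetch_validated` and its `fetch` parameter are assumptions for this example (the parameter only allows the wrapper to be tried outside App Engine); `validate_certificate` is the argument named in the text.

```python
def fetch_validated(url, fetch=None, **kwargs):
    """Fetch an https:// URL with certificate validation enabled."""
    if not url.lower().startswith("https://"):
        raise ValueError("expected an https:// URL, got %r" % url)
    if fetch is None:
        # Only importable inside the App Engine Python runtime or SDK.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch
    # Force validation on, even if the caller passed a different value.
    kwargs["validate_certificate"] = True
    return fetch(url, **kwargs)
```

Rejecting non-HTTPS URLs up front is a design choice of this sketch, not a requirement of the service: it keeps a caller from assuming a plain HTTP fetch was validated.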
An app can set HTTP headers for the outgoing request.
When sending an HTTP POST request, if a Content-Type header is not set explicitly, the header is set to application/x-www-form-urlencoded. This is the content type used by web forms.
For security reasons, the following headers cannot be modified by the application:
Content-Length
Host
Vary
Via
X-Forwarded-For
X-ProxyUser-IP
These headers are set to accurate values by App Engine, as appropriate. For example, App Engine calculates the Content-Length header from the request data and adds it to the request prior to sending.
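Because the text does not say what happens when an app tries to set one of these headers, it can help to separate them out before calling the service. The sketch below is one way to do that; `split_headers` is a hypothetical helper, and the comparison is case-insensitive on the assumption that header names are matched without regard to case, as in HTTP generally.

```python
# Headers listed in the text as not modifiable by the application.
PROTECTED_HEADERS = frozenset(name.lower() for name in (
    "Content-Length",
    "Host",
    "Vary",
    "Via",
    "X-Forwarded-For",
    "X-ProxyUser-IP",
))


def split_headers(headers):
    """Split a header dict into (allowed, dropped_names).

    `allowed` keeps every header the app may set; `dropped_names` is a
    sorted list of the protected header names found in the input.
    """
    allowed = {}
    dropped = []
    for name, value in headers.items():
        if name.lower() in PROTECTED_HEADERS:
            dropped.append(name)
        else:
            allowed[name] = value
    return allowed, sorted(dropped)
```

The `allowed` dict can then be passed as the `headers` argument of a fetch, and the dropped names logged for debugging.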
The URL Fetch service returns all response data, including the response code, headers, and body.
By default, if the URL Fetch service receives a response with a redirect code, the service will follow the redirect. The service will follow up to 5 redirect responses, then return the final resource. You can use the API to tell the URL Fetch service to not follow redirects and just return a redirect response to the application.
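For example, an app that wants to inspect a redirect rather than follow it can pass `follow_redirects=False` to urlfetch.fetch(). In the sketch below, the helper name `fetch_first_hop`, its `fetch` parameter, and the set of status codes treated as redirects are all assumptions made for illustration; the text does not list which codes the service follows.

```python
# Status codes treated as redirects in this sketch (an assumption).
REDIRECT_CODES = (301, 302, 303, 307)


def fetch_first_hop(url, fetch=None):
    """Fetch `url` without following redirects.

    Returns (status_code, location), where `location` is the redirect
    target if the response was a redirect, otherwise None.
    """
    if fetch is None:
        # Only importable inside the App Engine Python runtime or SDK.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch
    result = fetch(url, follow_redirects=False)
    location = None
    if result.status_code in REDIRECT_CODES:
        # Look the header up without relying on its capitalization.
        for name, value in result.headers.items():
            if name.lower() == "location":
                location = value
                break
    return result.status_code, location
```

This is handy when the app needs to record where a URL points, or to apply its own checks to the target before fetching it.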
By default, if the incoming response exceeds the maximum response size limit, the URL fetch service raises an exception. (See below for the amount of this limit.) You can tell the API to truncate the response instead of raising an exception. Note that if you use the urllib, urllib2 or httplib libraries to fetch URLs, the response will always be truncated instead of raising an exception.
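The truncating behavior can be requested with the `allow_truncated` argument of urlfetch.fetch(), and the response object's `content_was_truncated` attribute reports whether truncation happened. The wrapper below is a sketch under those assumptions; `fetch_allowing_truncation` and its `fetch` parameter are names invented for this example.

```python
def fetch_allowing_truncation(url, fetch=None):
    """Fetch `url`, accepting a truncated body instead of an exception.

    Returns (content, truncated), where `truncated` is True if the
    response was cut off at the maximum response size.
    """
    if fetch is None:
        # Only importable inside the App Engine Python runtime or SDK.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch
    result = fetch(url, allow_truncated=True)
    truncated = bool(getattr(result, "content_was_truncated", False))
    return result.content, truncated
```

Checking the flag matters for formats that are invalid when cut short, such as JSON or compressed archives; the app should not try to parse a truncated body as if it were complete.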
Your application can connect to systems behind your company's firewall using the Google Secure Data Connector (SDC). With the SDC Agent set up on your network, App Engine applications running on your Google Apps domain can authenticate with the Agent and access URLs on your intranet. The SDC Agent ensures that only your applications can connect to your intranet, and that they will do so only for users signed in using an Apps account on your domain.
Your app can access an intranet URL using the URL Fetch service. The app includes a special header with its request that declares that the request is intended for your SDC Agent. The header has a name of use_intranet and a value of yes. No other changes are needed; user verification, authentication, and the secure connection are handled automatically.
Here is an example of how to use the URL Fetch service with the use_intranet header using the urlfetch API:
    from google.appengine.api import urlfetch

    result = urlfetch.fetch(url="http://www.corp.example.com/sales.csv",
                            headers={'use_intranet': 'yes'})
    if result.status_code == 200:
        parseCSV(result.content)
For more information, see the Google Secure Data Connector website.
When your application is running in the development server on your computer, calls to the URL Fetch service are handled locally. The development server fetches URLs by contacting remote hosts directly from your computer, using whatever network configuration your computer is using to access the Internet.
When testing the features of your app that fetch URLs, be sure that your computer can access the remote hosts.
If your app is using the Google Secure Data Connector to access URLs on your intranet, be sure to test your app while connected to your intranet behind the firewall. Unlike App Engine, the development server does not use the SDC Agent to resolve intranet URLs. Only Google Apps and App Engine can authenticate with your SDC Agent.
Each URL Fetch request counts toward the URL Fetch API Calls quota.
Data sent in an HTTP or HTTPS request using the URL Fetch service counts toward the following quotas:
In addition to these quotas, data sent in an HTTPS request also counts toward the following quota:
Data received in response to an HTTP or HTTPS request using the URL Fetch service counts toward the following quotas:
In addition to these quotas, data received in response to an HTTPS request also counts toward the following quota:
For more information on quotas, see Quotas, and the "Quota Details" section of the Admin Console.
In addition to quotas, the following limits apply to the use of the URL Fetch service:
Limit | Amount |
---|---|
request size | 5 megabytes |
response size | 32 megabytes |
maximum deadline (request handler) | 60 seconds |
maximum deadline (task queue and cron job handler) | 10 minutes |