Google App Engine

Retrieving Authenticated Google Data Feeds with Google App Engine

Jeff Scudder, Google Developer Programs
April 17, 2008, updated October 22, 2008

Introduction

I'm sure your mind is positively buzzing with ideas for how to use Google App Engine, and a few of you might be interested in building an app that interacts with some of Google's other services via our Google Data AtomPub APIs. Quite a few of Google's products expose a Google Data API (a few interesting examples are YouTube, Google Calendar, and Blogger; you can find a complete list here), and these APIs can be used to read and edit the user-specific data they expose.

In this article we'll use the Google Documents List Data API to walk through the process of requesting access from and retrieving data for a particular user. We'll use Google App Engine's webapp framework to generate the application pages, and the Users API to authenticate users with Google Accounts.

Google's AuthSub APIs

Some Google Data services require authorization from your users to read data, and all Google Data services require their authorization before your app can write to these services on the user's behalf. Google uses AuthSub to enable users to authorize your app to access specific services.

Using AuthSub, users type their password into a secure page at google.com, then are redirected back to your app. Your app receives a token allowing it to access the requested service until the user revokes the token through the Account Management page.

In this article, we'll walk through the process of setting up the login link for the user, obtaining a session token to use for multiple requests to Google Data services, and storing the token in the datastore so that it can be reused for returning users.

Using the gdata-python-client library

Google offers a Google Data Python client library that simplifies token management and requesting data from specific Google Data APIs. We recently released a version of this library that supports making requests from Google App Engine applications. In this article we'll use this library, but of course you're welcome to use whatever works best for your application. Download the gdata-python-client library.

To use this library with your Google App Engine application, simply place the library source files in your application's directory, and import them as you usually would. The source directories you need to upload with your application code are src/gdata and src/atom. Then, be sure to call the gdata.alt.appengine.run_on_appengine function on each instance of a gdata.service.GDataService object. There's nothing more to it than that!

Step 1: Generating the Token Request Link

Applications use an API called AuthSub to obtain a user's permission for accessing protected Google Data feeds. The process is fairly simple. To request access from a user to a protected feed, your app redirects the user to a secure page on google.com where the user can sign in and grant or deny access. After doing so, the user is redirected back to your app with the newly granted token included in the URL.

Your application needs to specify two things when using AuthSub: the common base URL for the feeds you want to access, and the redirect URL for your app, where the user will be sent after authorizing your application.
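
To make those two inputs concrete, here is a minimal sketch of how a token request URL could be assembled by hand. It is written for Python 3's standard library rather than the App Engine runtime used in this article, build_authsub_url is a hypothetical helper, and the endpoint constant is an assumption about what the client library fills in for you. In the app itself, GenerateAuthSubURL does this work.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed AuthSub token request endpoint; the client library normally
# supplies this for you.
AUTHSUB_REQUEST_URL = 'https://www.google.com/accounts/AuthSubRequest'


def build_authsub_url(next_url, scope, secure=False, session=True):
    """Return a token request URL for a single scope (hypothetical helper)."""
    params = {
        'next': next_url,                    # where the user is sent afterwards
        'scope': scope,                      # common base URL of the feeds
        'secure': '1' if secure else '0',
        'session': '1' if session else '0',  # ask for an upgradable token
    }
    return AUTHSUB_REQUEST_URL + '?' + urlencode(params)


url = build_authsub_url('http://gdata-feedfetcher.appspot.com/step1',
                        'http://docs.google.com/feeds/')
query = parse_qs(urlsplit(url).query)
print(query['next'][0])
print(query['scope'][0])
```

Both values are percent-encoded inside the query string, which is why the sketch parses the URL again before printing them.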

To generate the token request URL, we'll use the gdata.service module included in the Google Data client library. Its GDataService class has a method, GenerateAuthSubURL, which generates the correct URL given your website's return address and the base feed URL. In the code snippet below, we use this method to generate a URL requesting access to a user's Google Documents List feed.

In our app.yaml file, we will define a URL mapping to create a separate URL for each step. Here's an example:

application: gdata-feedfetcher
version: 1
runtime: python
api_version: 1

handlers:
- url: /step1.*
  script: step1.py

- url: /step2.*
  script: step2.py

...

This sample application also includes a settings module to allow you to easily change the host name for your app. By default the settings module attempts to detect the current server settings.

import os


port = os.environ['SERVER_PORT']
if port and port != '80':
    HOST_NAME = '%s:%s' % (os.environ['SERVER_NAME'], port)
else:
    HOST_NAME = os.environ['SERVER_NAME']
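
The same detection logic can be tried outside App Engine by passing the environment in as a plain dictionary. This is only a sketch: host_name_from_environ is a hypothetical function name, and the dictionaries below stand in for os.environ.

```python
def host_name_from_environ(environ):
    """Return 'name:port', or just 'name' when the port is 80 or missing."""
    name = environ['SERVER_NAME']
    port = environ.get('SERVER_PORT')
    if port and port != '80':
        return '%s:%s' % (name, port)
    return name


# Typical development server values (assumed for illustration).
local = host_name_from_environ(
    {'SERVER_NAME': 'localhost', 'SERVER_PORT': '8080'})
# Typical production values.
production = host_name_from_environ(
    {'SERVER_NAME': 'gdata-feedfetcher.appspot.com', 'SERVER_PORT': '80'})
print(local)       # localhost:8080
print(production)  # gdata-feedfetcher.appspot.com
```

Keeping the port in the host name matters on the development server, where the AuthSub return URL has to point back at the local port.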

To illustrate this first step of using AuthSub in the app, we will create a step1.py script that looks something like this:

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import users
import atom.url
import gdata.service
import gdata.alt.appengine
import settings


class Fetcher(webapp.RequestHandler):

    def get(self):
        next_url = atom.url.Url('http', settings.HOST_NAME, path='/step1')

        # Initialize a client to talk to Google Data API services.
        client = gdata.service.GDataService()
        gdata.alt.appengine.run_on_appengine(client)

        # Generate the AuthSub URL and write a page that includes the link
        self.response.out.write("""<html><body>
            <a href="%s">Request token for the Google Documents Scope</a>
            </body></html>""" % client.GenerateAuthSubURL(next_url,
                ('http://docs.google.com/feeds/',), secure=False, session=True))


def main():
    application = webapp.WSGIApplication([('/.*', Fetcher),], debug=True)
    run_wsgi_app(application)

if __name__ == '__main__':
    main()

In this example, the first URL passed to GenerateAuthSubURL returns the user to our application, and the second is the Google Documents List feed base URL, which indicates the service our app is requesting authorization for. After you click the link and authorize your application, you will be taken back to the original page, but the single-use AuthSub token will now be a URL parameter in the appspot URL.

Step 2: Retrieving and Updating a Token

Once we've generated an authorization request URL for a particular Google Data service, we need a way to use the token returned to our app to access the feed in question. That means retrieving the initial token returned to us for the Google Documents List API and upgrading it to a long-lived session token. Remember that we told the service to redirect the user to the URL http://gdata-feedfetcher.appspot.com/step1. Let's extend our simple example above to do a few things. We'll call this new version step2.

Let's write the functionality that will handle the return request from the Google Data service the user signed in to. The Google Data service will request a URL that will look something like this:

http://gdata-feedfetcher.appspot.com/?auth_sub_scopes=http%3A%2F%2Fdocs.google.com%2Ffeeds%2F&token=CKC5y...MgH

This is just our return URL with the initial authorization token appended; the token grants our app access to the service on behalf of our user. The code below first takes this URL and extracts the service and the token. Then, it requests an upgrade of the token for the Documents List service.

We use two new methods to achieve this. First, we try to obtain the single use AuthSub token by examining the current page's URL. The gdata.auth.extract_auth_sub_token_from_url function handles token extraction for us. To upgrade this initial token to a session token, we use client.upgrade_to_session_token(auth_token).
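
To show what the extraction step amounts to, here is a standard-library sketch (Python 3) that pulls the token parameter out of a return URL and formats the Authorization header value that AuthSub requests carry. The helper names and the token value are made up for illustration; in the app, the library functions named above do this for you, and the upgrade itself is a network request that this sketch does not perform.

```python
from urllib.parse import urlsplit, parse_qs


def extract_token_param(url):
    """Return the value of the 'token' query parameter, or None."""
    values = parse_qs(urlsplit(url).query).get('token')
    return values[0] if values else None


def authsub_header(token):
    """Format the Authorization header value for an AuthSub token."""
    return 'AuthSub token="%s"' % token


# PLACEHOLDER-TOKEN is a made-up value, not a real token.
return_url = ('http://gdata-feedfetcher.appspot.com/step2'
              '?auth_sub_scopes=http%3A%2F%2Fdocs.google.com%2Ffeeds%2F'
              '&token=PLACEHOLDER-TOKEN')
token = extract_token_param(return_url)
print(token)
print(authsub_header(token))
```

A URL without a token parameter yields None, which corresponds to the "if auth_token:" check in the handler.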

Now that we have a session token which grants our app access to the user's Google Documents, our app needs to decide if this token should be stored in the datastore for future use. If the user is signed in to our app, we can associate this AuthSub token with the current user and store it, so that the user will not need to repeat the authorization process the next time that they use our app. If our app does not know who the current user is, we should not store the token.

The code for making this decision and telling the Google Data client how to use the token looks like this:

if session_token and users.get_current_user():
    client.token_store.add_token(session_token)
elif session_token:
    client.current_token = session_token

Below is the code for step2.py which illustrates upgrading to a session token and storing it for future use if there is a signed-in user.

Note: With Google App Engine, you must use the URLFetch API to request external URLs. In our Google Data Python client library, gdata.service does not use the URLFetch API by default. We have to tell the service object to use URLFetch by calling gdata.alt.appengine.run_on_appengine on the service object, like this: gdata.alt.appengine.run_on_appengine(client)

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import users
import atom.url
import gdata.auth
import gdata.service
import gdata.alt.appengine
import settings


class Fetcher(webapp.RequestHandler):

    def get(self):
        # Write our page's title
        self.response.out.write("""<html><head><title>
            Google Data Feed Fetcher: read Google Data API Atom feeds</title>""")
        self.response.out.write('</head><body>')
        # Allow the user to sign in or sign out
        next_url = atom.url.Url('http', settings.HOST_NAME, path='/step2')
        if users.get_current_user():
            self.response.out.write('<a href="%s">Sign Out</a><br>' % (
                users.create_logout_url(str(next_url))))
        else:
            self.response.out.write('<a href="%s">Sign In</a><br>' % (
                users.create_login_url(str(next_url))))

        # Initialize a client to talk to Google Data API services.
        client = gdata.service.GDataService()
        gdata.alt.appengine.run_on_appengine(client)

        session_token = None
        # Find the AuthSub token and upgrade it to a session token.
        auth_token = gdata.auth.extract_auth_sub_token_from_url(self.request.uri)
        if auth_token:
            # Upgrade the single-use AuthSub token to a multi-use session token.
            session_token = client.upgrade_to_session_token(auth_token)
        if session_token and users.get_current_user():
            # If there is a current user, store the token in the datastore and
            # associate it with the current user. Since we told the client to
            # run_on_appengine, the add_token call will automatically store the
            # session token if there is a current_user.
            client.token_store.add_token(session_token)
        elif session_token:
            # Since there is no current user, we will put the session token
            # in a property of the client. We will not store the token in the
            # datastore, since we wouldn't know which user it belongs to.
            # Since a new client object is created with each get call, we don't
            # need to worry about the anonymous token being used by other users.
            client.current_token = session_token

        self.response.out.write('<div id="main"></div>')
        self.response.out.write(
            '<div id="sidebar"><div id="scopes"><h4>Request a token</h4><ul>')
        self.response.out.write('<li><a href="%s">Google Documents</a></li>' % (
            client.GenerateAuthSubURL(
                next_url,
                ('http://docs.google.com/feeds/',), secure=False, session=True)))
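        self.response.out.write('</ul></div><br/><div id="tokens">')
        # Close the elements opened above so the page is well-formed HTML.
        self.response.out.write('</div></div></body></html>')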


def main():
    application = webapp.WSGIApplication([('/.*', Fetcher),], debug=True)
    run_wsgi_app(application)


if __name__ == '__main__':
    main()

After we upgrade the initial token using the GDataService.upgrade_to_session_token method, we can associate the session token with the current user and store it in the datastore for future reuse by calling client.token_store.add_token(session_token). If we do not store the AuthSub session token, the user will need to perform the AuthSub authorization redirects each time that they use our application. If there is no current user, the session token will not be stored, since we don't know who the token belongs to. Below, we will take you through the steps to use this token and fetch your user's feed in your application.

Step 3: Using a Session Token and Fetching a Data Feed

Now that we have obtained and stored the session token, we can use the AuthSub session token to retrieve the user's document list feed with our application. The final step is to get the user feed from Google Docs and display it on our site!

Let's add a new method to our app to request the feed and handle a "token required" message. We will call this method fetch_feed and add it to the Fetcher request handler class. In this example, the app uses client.Get to try to read data from the feed.

Some Google Data feeds require authorization before they can be read. If our app had previously saved an AuthSub session token for the current user and the desired feed URL, then the token will be found automatically by the client object and used in the request. If we did not have a stored token for the combination of the current user and the desired feed, then we will attempt to fetch the feed anyway. If we receive a "token required" message from the server, then we will ask the user to authorize this app, which will give our app a new AuthSub token.

Here is the code for the fetch_feed method:

    def fetch_feed(self, client, feed_url):
        # Attempt to fetch the feed.
        if not feed_url:
            self.response.out.write(
                'No feed_url was specified for the app to fetch.<br/>')
            example_url = atom.url.Url('http', settings.HOST_NAME, path='/step3',
                params={'feed_url':
                    'http://docs.google.com/feeds/documents/private/full'}
                ).to_string()
            self.response.out.write('Here\'s an example query which will show the'
                ' XML for the feed listing your Google Documents <a '
                'href="%s">%s</a>' % (example_url, example_url))
            return
        try:
            response = client.Get(feed_url, converter=str)
            self.response.out.write(cgi.escape(response))
        except gdata.service.RequestError, request_error:
            # If fetching fails, then tell the user that they need to login to
            # authorize this app by logging in at the following URL.
            if request_error[0]['status'] == 401:
                # Get the URL of the current page so that our AuthSub request will
                # send the user back to here.
                next_url = atom.url.Url('http', settings.HOST_NAME, path='/step3',
                    params={'feed_url': feed_url})
                # Request a token that can be upgraded to a session token
                # (session=True), as in the earlier steps.
                auth_sub_url = client.GenerateAuthSubURL(next_url, feed_url,
                    secure=False, session=True)
                self.response.out.write('<a href="%s">' % (auth_sub_url))
                self.response.out.write(
                    'Click here to authorize this application to view the feed</a>')
            else:
                self.response.out.write(
                    'Something went wrong, here is the error object: %s ' % (
                        str(request_error[0])))

The above method uses the cgi module, so be sure to add import cgi to the list of imports at the beginning of your script.
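
The branching in fetch_feed (show the feed, offer an authorization link on a 401, or report any other error) can be tried without a network connection. The sketch below is written for Python 3, so it uses html.escape where the App Engine code uses cgi.escape. FeedRequestError and render_feed are hypothetical names, and the fetch callables are stubs standing in for client.Get.

```python
import html


class FeedRequestError(Exception):
    """Hypothetical stand-in for the client library's request error."""

    def __init__(self, status):
        Exception.__init__(self, 'request failed with status %d' % status)
        self.status = status


def render_feed(fetch, feed_url, auth_url):
    """Return HTML for the feed, an authorization link, or an error note."""
    try:
        return html.escape(fetch(feed_url))
    except FeedRequestError as error:
        if error.status == 401:
            # No usable token: ask the user to authorize the app.
            return ('<a href="%s">Click here to authorize this application '
                    'to view the feed</a>' % html.escape(auth_url))
        return 'Something went wrong, status: %d' % error.status


def unauthorized(url):
    # Stub fetcher that behaves like a feed requiring a token.
    raise FeedRequestError(401)


ok_page = render_feed(lambda url: '<feed/>', 'http://example.com/feed', '#')
auth_page = render_feed(unauthorized, 'http://example.com/feed',
                        'http://example.com/auth')
print(ok_page)
print(auth_page)
```

Escaping the feed before writing it out is what lets the raw XML show up as text in the browser instead of being interpreted as markup.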

Now that we have a method to fetch the target feed, we can modify the Fetcher class's get method to call this method after we upgrade and store the AuthSub token.

Our app also needs to know the URL which should be fetched, so we add a URL parameter to the incoming request to indicate which feed should be fetched. The code below for the get method adds the ability to find out which URL the app should fetch and fetches the desired feed.

import cgi
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import users
import atom.url
import gdata.auth
import gdata.service
import gdata.alt.appengine
import settings


class Fetcher(webapp.RequestHandler):

    def get(self):
        # Write our page's title
        self.response.out.write("""<html><head><title>
            Google Data Feed Fetcher: read Google Data API Atom feeds</title>""")
        self.response.out.write('</head><body>')
        next_url = atom.url.Url('http', settings.HOST_NAME, path='/step3')
        # Allow the user to sign in or sign out
        if users.get_current_user():
            self.response.out.write('<a href="%s">Sign Out</a><br>' % (
                users.create_logout_url(str(next_url))))
        else:
            self.response.out.write('<a href="%s">Sign In</a><br>' % (
                users.create_login_url(str(next_url))))

        # Initialize a client to talk to Google Data API services.
        client = gdata.service.GDataService()
        gdata.alt.appengine.run_on_appengine(client)

        feed_url = self.request.get('feed_url')

        session_token = None
        # Find the AuthSub token and upgrade it to a session token.
        auth_token = gdata.auth.extract_auth_sub_token_from_url(self.request.uri)
        if auth_token:
            # Upgrade the single-use AuthSub token to a multi-use session token.
            session_token = client.upgrade_to_session_token(auth_token)
        if session_token and users.get_current_user():
            # If there is a current user, store the token in the datastore and
            # associate it with the current user. Since we told the client to
            # run_on_appengine, the add_token call will automatically store the
            # session token if there is a current_user.
            client.token_store.add_token(session_token)
        elif session_token:
            # Since there is no current user, we will put the session token
            # in a property of the client. We will not store the token in the
            # datastore, since we wouldn't know which user it belongs to.
            # Since a new client object is created with each get call, we don't
            # need to worry about the anonymous token being used by other users.
            client.current_token = session_token

        self.response.out.write('<div id="main">')
        self.fetch_feed(client, feed_url)
        self.response.out.write('</div>')
        self.response.out.write(
            '<div id="sidebar"><div id="scopes"><h4>Request a token</h4><ul>')
        self.response.out.write('<li><a href="%s">Google Documents</a></li>' % (
            client.GenerateAuthSubURL(
                next_url,
                ('http://docs.google.com/feeds/',), secure=False, session=True)))
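        self.response.out.write('</ul></div><br/><div id="tokens">')
        # Close the elements opened above so the page is well-formed HTML.
        self.response.out.write('</div></div></body></html>')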

In the above, we request the feed by calling self.fetch_feed(client, feed_url).
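
The feed_url parameter makes a round trip: the app puts it into a link, and the handler reads it back from the incoming request. Here is a standard-library sketch (Python 3) of both directions. step3_url and feed_url_from_request are hypothetical helpers that mirror what atom.url.Url and self.request.get do in the handler.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def step3_url(host_name, feed_url):
    """Build a /step3 link that carries the feed to fetch."""
    return 'http://%s/step3?%s' % (host_name,
                                   urlencode({'feed_url': feed_url}))


def feed_url_from_request(request_url):
    """Read feed_url back out; an empty string means none was given."""
    values = parse_qs(urlsplit(request_url).query).get('feed_url')
    return values[0] if values else ''


link = step3_url('gdata-feedfetcher.appspot.com',
                 'http://docs.google.com/feeds/documents/private/full')
print(link)
print(feed_url_from_request(link))
```

The empty-string case is the one fetch_feed handles first, by printing an example query instead of attempting a fetch.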

You can see the final program at work by visiting http://gdata-feedfetcher.appspot.com/. You can also view the complete source code, where we put all of this together, in the Google App Engine sample code project on Google Code Hosting.

AuthSub session tokens are long-lived, but they can be revoked by the user or by your application. At some point, a session token stored in your datastore may be revoked, so your application should clean up tokens that can no longer be used. The status of a token can be tested by querying the token info URL. You can read more about AuthSub token management in the AuthSub documentation. This feature is left as an exercise for the reader. Have fun :)
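
As one possible starting point for that cleanup, the sketch below removes stored tokens that fail a validity check. Everything here is an assumption for illustration: the dictionary stands in for the datastore, purge_revoked_tokens is a hypothetical helper, and is_token_valid is a callable you would implement yourself, for example by querying the token info URL.

```python
def purge_revoked_tokens(tokens_by_user, is_token_valid):
    """Drop entries whose token fails the check; return the affected users."""
    removed = [user for user, token in tokens_by_user.items()
               if not is_token_valid(token)]
    for user in removed:
        del tokens_by_user[user]
    return removed


# Made-up tokens; the lambda is a stub for a real validity check.
stored = {'alice': 'tok-good', 'bob': 'tok-revoked'}
removed = purge_revoked_tokens(stored, lambda token: token != 'tok-revoked')
print(removed)
print(stored)
```

Collecting the users to remove before deleting avoids modifying the dictionary while iterating over it.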

Conclusion

Using the Google Data Python client library, you can easily manage your users' Google Data feeds in your own Google App Engine application.

The Google Data Python client library includes support for almost all of the Google Data services. For further information, you can read the getting started guide for the library, visit the project to browse the source, and even ask questions on the gdata-python-client's Google group.

As always, for questions about Google App Engine, read our online documentation and visit our Google group.

Appendix: ClientLogin

This example uses AuthSub to authorize the app to act on the user's behalf, but the Google Data APIs support other authorization mechanisms. In some cases, you might want to temporarily use ClientLogin while developing your application. If you are going to use ClientLogin, you'll need to add a couple of parameters to the run_on_appengine call:

client = gdata.service.GDataService()
# Tell the client that we are running in single user mode, and it should not
# automatically try to associate the token with the current user then store
# it in the datastore.
gdata.alt.appengine.run_on_appengine(client, store_tokens=False,
    single_user_mode=True)
client.email = 'app_account_username@example.com'
client.password = 'password'
# To request a ClientLogin token you must specify the desired service using
# its service name.
client.service = 'blogger'
# Request a ClientLogin token, which will be placed in the client's
# current_token member.
client.ProgrammaticLogin()

You may receive a CAPTCHA challenge when requesting a ClientLogin token, which your app will need to handle before it can receive the token. For this and a few other reasons, I don't recommend using ClientLogin in Google App Engine, but the above is how you could use it while developing your app.