Principles for Standardized REST Authentication

By George Reese
December 26, 2009 | Comments: 31

Working with the programming APIs for cloud providers and SaaS vendors has taught me two things:

  • There are very few truly RESTful programming APIs.
  • Everyone feels the need to write a custom authentication protocol.

I've programmed against more web services interfaces than I can remember. In the last month alone, I've written to web services APIs for Aria, AWS, enStratus, GoGrid, the Rackspace Cloud, VMOps, Xero, and Zendesk.

Each one requires a different authentication mechanism. Two of them—Aria and AWS—defy all logic and require different authentication mechanisms for different parts of their respective APIs.

Let's end this here and now*. I'm tired of wasting brain cycles figuring out whether vendor A requires you to sign your query before or after you URL encode your parameters and I am fed up with vendors who insist on using interactive user credentials to authenticate API calls. Here's a set of standards that I think should be in place for any REST authentication scheme.

Here's the summary:

  1. All REST API calls must take place over HTTPS with a certificate signed by a trusted CA. All clients must validate the certificate before interacting with the server.
  2. All REST API calls should occur through dedicated API keys consisting of an identifying component and a shared, private secret. Systems must allow a given customer to have multiple active API keys and de-activate individual keys easily.
  3. All REST queries must be authenticated by signing the query parameters sorted in lower-case, alphabetical order using the private credential as the signing token. Signing should occur before URL encoding the query string.

The Connection

All REST API calls must take place over HTTPS with a certificate signed by a trusted CA. All clients must validate the certificate before interacting with the server.

Encryption certainly adds overhead to every API call. Meaningful authentication, however, begins with an encrypted session. You can't securely authenticate with the server unless the authentication credentials are encrypted. Furthermore, a well-designed REST API should not work like local method calls—it should be capable of dealing with the overhead imposed by encryption.

Let's just be blunt: if you aren't encrypting your API calls, you aren't even pretending to be secure.

The second part may be a bit more controversial for the people who think that SSL is about encryption. Encryption is only part of SSL. Through the use of certificates signed by a trusted authority, SSL also protects you against "man-in-the-middle" attacks in which an agent inserts itself between client and server and sniffs the "encrypted" traffic.

If you are not validating the SSL certificate of the server, you don't know who is receiving your REST queries.
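As a minimal sketch of what "validating the certificate" can look like on the client side, assuming a Python client that uses only the standard library: the helper names and the host api.example.com are placeholders for illustration, not part of any vendor API mentioned above.

```python
import http.client
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Return a TLS context that verifies the server certificate and host name."""
    # create_default_context() loads the system's trusted CA certificates and
    # turns on both certificate verification and host name checking.
    return ssl.create_default_context()


def make_verified_connection(host: str) -> http.client.HTTPSConnection:
    """Prepare an HTTPS connection that refuses untrusted or mismatched certificates."""
    # No network traffic happens here; the TLS handshake (and the certificate
    # check) takes place when the first request is sent.
    return http.client.HTTPSConnection(host, context=build_tls_context())


# Hypothetical host, used only to show the call shape.
conn = make_verified_connection("api.example.com")
```

With this context, a handshake against a server whose certificate is not signed by a trusted CA, or does not match the host name, fails instead of silently continuing.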

Authentication Credentials

All REST API calls should occur through dedicated API keys consisting of an identifying component and a shared, private secret. Systems must allow a given customer to have multiple active API keys and de-activate individual keys easily.

The first important thing is that a system making a REST query is NOT an interactive user. You should therefore NEVER require or allow the use of interactive user credentials for REST authentication. The only exception is a client/server application in which the client represents an interactive user and the protocol between client and server is a RESTful API.

Why not?

Well, as I mentioned above, your REST client is not an interactive user. Customers can fudge this by creating a dedicated account expressly for the purpose of REST authentication. Unfortunately, this enables attackers to go after the interactive account as an attack vector for compromising the overall system. Furthermore, because REST is authenticating a program and not a person, it allows for stronger authentication than human user ID/password schemes allow.

If you don't use a dedicated user account, you open up all kinds of problems. Zendesk's REST API, for example, authenticates through one of your Zendesk admin accounts. If my Zendesk admin leaves the company or changes roles, I have to remember to alter my applications to reflect those changes. While I could use a dedicated account to get around this problem, Zendesk charges me for each admin account in addition to the extra money I am paying for API access.

The second part says that each REST server should support multiple API keys for each customer. This requirement makes it simpler to isolate potential compromises and address them when they happen.

Ten different applications by one customer talking to a given API is the same thing as ten different users of one customer interacting with a user interface. You would not have those ten users sharing one user ID/password, so why would you have the ten different applications sharing a single set of REST credentials? If one of those applications has been compromised, do you really want to go through the process of having to re-configure all of them for new keys?

When an application is compromised, you also need an elegant way to roll out replacement API keys. You therefore need the ability to have both the old and new keys active for a short period of time while you re-configure the application. Once the application is reconfigured, the server system should allow you to de-activate the compromised keys.
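The rotation described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names (ApiKey, Customer, issue_key, deactivate, active_secret) are hypothetical, and a real service would persist keys and protect the secrets at rest.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ApiKey:
    key_id: str          # identifying component, sent with every query
    secret: str          # shared private component, never sent over the wire
    active: bool = True


@dataclass
class Customer:
    name: str
    keys: Dict[str, ApiKey] = field(default_factory=dict)

    def issue_key(self) -> ApiKey:
        """Create an additional key; existing keys stay valid."""
        key = ApiKey(key_id=secrets.token_hex(8), secret=secrets.token_urlsafe(32))
        self.keys[key.key_id] = key
        return key

    def deactivate(self, key_id: str) -> None:
        """Turn off one key without touching the customer's other keys."""
        self.keys[key_id].active = False

    def active_secret(self, key_id: str) -> Optional[str]:
        """Return the signing secret for an active key, or None."""
        key = self.keys.get(key_id)
        return key.secret if key is not None and key.active else None


# Rotation: issue a replacement, reconfigure the application, retire the old key.
customer = Customer("acme")
old_key = customer.issue_key()
new_key = customer.issue_key()      # old and new are both active here
customer.deactivate(old_key.key_id)
```

Because each application holds its own key, deactivating one compromised key leaves the customer's other applications untouched.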

Query Authentication

All REST queries must be authenticated by signing the query parameters sorted in lower-case, alphabetical order using the private credential as the signing token. Signing should occur before URL encoding the query string.

In other words, you don't pass the shared secret component of the API key as part of the query, but instead use it to sign the query. Your queries end up looking like this:

GET /object?timestamp=1261496500&apiKey=Qwerty2010&signature=abcdef0123456789

The string being signed is "/object?apikey=Qwerty2010&timestamp=1261496500" and the signature is the HMAC-SHA256 hash of that string using the private component of the API key.
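Here is one way the signing step could look in Python, as a sketch rather than a reference implementation. The function names and the secret value are made up for illustration, and the hex encoding of the signature is an assumption, since the scheme above does not fix an output encoding.

```python
import hashlib
import hmac


def canonical_string(path: str, params: dict) -> str:
    """Build the string to sign: lower-cased parameter names in alphabetical order."""
    pairs = sorted((name.lower(), value) for name, value in params.items())
    query = "&".join(f"{name}={value}" for name, value in pairs)
    return f"{path}?{query}"


def sign(path: str, params: dict, secret: str) -> str:
    """Return the hex HMAC-SHA256 of the canonical string, keyed by the secret."""
    message = canonical_string(path, params).encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


params = {"timestamp": "1261496500", "apiKey": "Qwerty2010"}
print(canonical_string("/object", params))
# prints /object?apikey=Qwerty2010&timestamp=1261496500

# "hypothetical-secret" stands in for the private component of the API key.
signature = sign("/object", params, "hypothetical-secret")
```

The client then sends the original parameters plus the signature; the secret itself never appears in the query.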

The main objection to this approach is that the private API key devolves into a kind of password for static calls. For example, if the query were instead:

GET /object?apiKey=Qwerty2010

The signature would be the same every time you made that specific query. However, you are using SSL, right? Furthermore, adding in a timestamp makes each query differ. For extra security, you can make the timestamp a more formal date-time value with time zone information and reject queries whose timestamp falls outside an accepted time window.
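A server-side check along these lines might look like the following sketch. The 300-second tolerance, the function names, and the use of an epoch-seconds timestamp are assumptions chosen for the example; nothing above prescribes them.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 300  # assumed tolerance; choose a value that fits your service


def expected_signature(path: str, params: dict, secret: str) -> str:
    """Recompute the signature over lower-cased, alphabetically sorted parameters."""
    pairs = sorted((k.lower(), v) for k, v in params.items())
    message = path + "?" + "&".join(f"{k}={v}" for k, v in pairs)
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify(path, params, signature, secret, now=None) -> bool:
    """Accept a query only if its timestamp is recent and its signature matches.

    `params` holds every query parameter except the signature itself.
    """
    now = time.time() if now is None else now
    try:
        sent_at = int(params["timestamp"])
    except (KeyError, ValueError):
        return False
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False  # too old or too far in the future
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        expected_signature(path, params, secret).encode(), signature.encode()
    )
```

Checking the window is what makes the timestamp useful: a captured query stops working once it falls outside the tolerance.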

The real controversy is whether signing should occur before or after URL encoding values. There is no "right" answer. I lean towards signing before encoding because most programming tools make it easier on the server side to get the unencoded values versus the encoded values. I'm sure good arguments can be made the other way. What I really care about is this: let's pick one and stick with it.
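To make the "sign before encoding" ordering concrete, here is a small Python sketch. The path, the parameter values, and the secret are illustrative only; the point is simply that the HMAC is computed over the raw values and URL encoding happens afterwards.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode


def signed_query(path: str, params: dict, secret: str) -> str:
    """Sign the raw parameter values, then URL-encode everything for the wire."""
    # Step 1: sign the unencoded values ("New York", not "New+York").
    pairs = sorted((k.lower(), v) for k, v in params.items())
    message = path + "?" + "&".join(f"{k}={v}" for k, v in pairs)
    signature = hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()
    # Step 2: only now encode the query string.
    return path + "?" + urlencode({**params, "signature": signature})


url = signed_query(
    "/people/index",
    {"city": "New York", "apiKey": "Qwerty2010"},
    "hypothetical-secret",
)

# The server decodes the query string and recovers the raw values,
# which is exactly the form that was signed.
received = dict(parse_qsl(url.split("?", 1)[1]))
```

On the server, most frameworks already hand you the decoded values, so recomputing the signature needs no re-encoding step.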

Summary

This is a battle I know I am going to lose. After all, people still can't settle on being truly RESTful (just look at the AWS EC2 monstrosity of an API). Authentication is almost certainly a secondary consideration. If you are reading this post and just don't want to listen to my suggestions, I plead with you to follow someone else's example and not roll your own authentication scheme.

*Yes, I know. I am being delusional here. Please humor me in my delusion.


31 Comments

One thing I really don't get with all the APIs you mentioned (including yours) is that: HTTP supports HTTP-Auth. Proxies can handle it. Endpoints can handle it. Browsers can handle it.

I support the need to sign GET and POST-Parameters, but signing/HMAC does not replace authentication credentials.

So I have to ask you: what is wrong with HTTP-Auth?

If you are going to use https, why not use it fully, and ask for client side certificates too? Then you get a fully RESTful authentication method, because the client and the server are authenticated at the connection layer, and there is no need to bring authentication into the URI level. Clean separation of concerns.

The counter argument is that client certificates are expensive to produce and not widely used. It turns out that in fact they can be dead cheap to use ($0 because they don't require a C.A.) and very easy to install for the end user. By a simple trick one can allow users to have a few certificates useable globally.

The description of this way of using client side certificates is available here:

http://esw.w3.org/topic/foaf+ssl

If you are interested in trying this out, ask for directions on the mailing list referred to from there.

So are there examples or libraries to make this easy to do in Scripting languages like Ruby, Python, PHP, etc? That would make one less barrier to doing this kind of best practice...

Have you looked at OAuth?

OAuth is most definitely NOT a solution for the typical RESTful application.

OAuth solves a very narrow problem set:

Where a single user application like Facebook or Twitter needs to grant access on behalf of that single user to another application.

In other words, OAuth authentication enables a third-party application to act on behalf of a specific user without the need to share any authentication credentials with that third-party application.

The problem with the type of applications I am talking about (enStratus -> AWS ) is that neither system represents a user. One or both systems may represent multiple users that change in terms of existence and access rights. OAuth does not solve this problem.

The reason why different services have different requirements is because they have different use cases. It is not necessary to have the kind of ironclad security you're specifying for most read-only APIs (for example). Key-signing protocols and HTTPS will only serve to increase the costs to provide the API and the cost to adopt, which ultimately means that fewer APIs will be available.

If there weren't 100 different versions of doing the same thing, there would not be much overhead.

Implementing this scheme takes only 1/2 hour, and you can develop language-specific libraries to do the heavy lifting.

Signing URIs blows caching, bookmarking, and sharing of links. Instead, take the same patterns you outline here and use the WWW-Authenticate and Authorization headers.

There is more than enough room for Authentication extensions via headers where this control information belongs.

We are talking about API calls here, there is no bookmarking or sharing of these links.

I appreciate your effort, but I don't like the fact that this would pour concrete over the worst of the REST-like practices, leaving us stuck for a long time with crappy URLs like

http://example.org/api/GET?key=123&type=person&facet=data&format=json

instead of the better

http://example.org/people/123/data.json

As if having URL parameters is such a horrible thing.

Including a timestamp in each request URI simply so that signature is not the same every time is a terrible idea. Consider:

GET /object?timestamp=1261496500&apiKey=Qwerty2010&signature=abcdef0123456789

With this pattern, *none* of the GETs for a resource would be cache-able, even if that resource never changes, simply because you are including a different timestamp with each request just to be able to sign it differently.

This not only goes against HTTP and best practices for caching GET requests, but seems foolish as none of the query parameters are actual parameters for a query on the server. If the resource supports actual parameters (say, an index that supports filtering), you would have:

GET /people/index?city=Chicago&timestamp=1261496500&apiKey=Qwerty2010&signature=abcdef0123456789

Parameters such as timestamp and signature cause confusion at best, and unexpected results at worst (if a person's information contains their signature, this would alter the meaning of the filter).

Lastly, proclaiming that people disagreeing with you implies that "people still can't settle on being truly RESTful" is a bit too self-centered, don't you think?

1. I hate server side caching of REST calls. Never seen it happen in a way that doesn't cause trouble. Having said that, I am not saying that a timestamp parameter should be required, simply useful for systems that need an extra layer of security.

2. What does the phrase "people still can't settle on being truly RESTful" have to do with people disagreeing with me?

regarding #1. Just because you have had issues with Server side caching doesn't mean others require them. No spec should eliminate a foundation of web architecture. There is no reason the apikey and signature cannot be used as the credentials in HTTP-Auth. RESTful URLS certainly can contain query parameters (and timestamp if the use case requires it) but authentication shouldn't pollute the resource URL.

"1. I hate server side caching of REST calls. Never seen it happen in a way that doesn't cause trouble. Having said that, I am not saying that a timestamp parameter should be required, simply useful for systems that need an extra layer of security."

What about intermediate caching or transparent proxying?

And what's a REST "call" when it's at home? Sounds way too RPCish to me.

I like Henry Story's idea of using client-side certificates. The only issue could be support in libraries and languages used to communicate with the server. And for anyone to implement this, we'd need clear and simple examples of how to authenticate with client-side certificates in all the popular programming environments and languages.

Client-side certificates have too much process overhead to be of practical use in a massively integrated web service like AWS.

So for the sake of consistency, you want to break any potential of caching, proxying or any of the other foundational principles that the REST architectural style and constraints provide. Thus losing the benefits of loose coupling, component assembly and scalability that REST provides.

Have you read Fielding's thesis?

It appears that you believe that everything RESTful is encapsulated in the URL.

Your solution for authentication uses a protocol that was designed primarily for encryption, even though HTTP already provides perfectly adequate headers and even provides the ability to transparently extend those headers.

HTTPS just encapsulates HTTP in SSL. It provides end-point to end-point encryption. With client side certificates, it provides the server with the same identification that the client side gets from having CA certificates cached to check the server.

If you use HTTPS for all RESTful interactions, there is no possibility of caching other than at the end-points. Proxying becomes impossible.

Putting authentication details in the URL precludes bookmarking.

All in all, I don't agree :)

First of all, I am not suggesting that "timestamp" should be required for any such standardization. I am saying it is something that can be done optionally to add an extra layer of security to the signature.

And what the hell is it about bookmarking? REST URLs aren't for bookmarking.

You may not agree with putting authentication information in the URL, but a vast majority of APIs are already doing it.

What's wrong with putting the signature in the HTTP headers instead of the URL? The URL is for naming the resource being manipulated. There are even existing headers specially designed for authentication. See WWW-Authenticate, Authorization, and RFC 2617 et al regarding Digest Access Authentication. You can include timestamps in the "nonce" that the server generates.

REST URLs aren't for bookmarking? What other sort of URLs are there? Unless you mean RPC/SOAP like "endpoint" URLs.

Our RESTful applications (client and server) store URLs for resources (and their current state) in the local data store.

The applications use those URLs and HATEOAS to determine the next state of the application. Do a GET on the URL, follow the links in the representation returned. Of course, the applications are aware of the representation format, so know which links to follow.

I don't agree with authentication information in the URL because it's the wrong place for it. URLs name resources, the HTTP headers are where things like authentication, content negotiation etc are supposed to be.

Just because people have misunderstood REST as a style and insist on imposing RPC-like approaches via the URL doesn't mean that we should accommodate it.

It's just as dumb as SOAP which is RPC/XDR over port 80 using XML instead of XDR. RPC has been a brittle paradigm from when it started, through CORBA/DCOM etc and on to SOAP and the mess that is WS-*.

The article says : "All REST API calls should occur through dedicated API keys consisting of an identifying component and a shared, private secret".

Does this mean that the server has to store this shared secret in clear ? If crypted, how to check the signature is right then ?

No, you do not have to (and in fact should not) store any of the keys in the clear.

You need a key management system that will enable you to decrypt the keys on the fly, encrypt the parameters for comparison, and securely discard the encryption key.

So, some insiders could steal the private key of the key management system and decrypt all passwords! This is the weak point of this system.

Wouldn't an asymmetric system be more secure ?

A salted/hashed password stored and compared to a salted/hashed one would avoid all this insecure decryption/deletion of a clear shared secret.

The client app must store the identifying component and the shared private secret. If the client is a mobile app (or any other app distributed in binary form) then these two pieces of information can be decompiled and stolen.

Imagine a Twitter iPhone app that uses a 3rd party paid spell-checking web service. Another app developer that wants to use the spell-checking service for free can decompile this app and steal the identifying component and private secret.

How would you recommend handling this distributed app problem, where the app uses a 3rd party RESTful web service?

There's another security problem with RESTful APIs that no one seems to talk about. It seems to be taken for granted that we pass query string parameters to web methods. The trouble comes when you need to pass sensitive data (think bank account numbers, medical data, etc.) Yes, SSL will encrypt this in transit and signing will protect against tampering. However, the query string will show up in web logs once on the web server. Many organizations have regulatory requirements that prohibit storing such sensitive data in unencrypted form and web logs are typically just unencrypted text files.

So far there are three potential mitigations I've thought of for this:

1. Encrypt the query string parameters. (PKI adds a performance penalty while symmetric algorithms introduce key management issues.)

2. Pass the parameters as HTTP header values instead. (Not sure if this is kosher in REST.)

3. Use some sort of file system level encryption on the web logs. (Not ideal since the web service process account and web server admins can still access the logs.)

I'm no REST guru so I'd like to hear what others think about these options and if there are even better ones that I haven't considered.

The query parameters (or URL parameters more generally) only really need to identify the resource in a RESTful system. So, unless you are identifying resources by private data (bank account #, medical data, etc), you won't have this problem.

The real sensitive data will appear in the body of the POST/PUT request that creates or edits the resource, or in the body of the response to the GET that retrieves it.

So instead of referencing a bank user by their account number, i.e. /bank_users/ACCOUNT_NUMBER, you could generate an API key for that user and reference that user as /bank_users/API_KEY, thereby preventing any sensitive data from ending up in the server access logs.

I don't think including a timestamp without doing the additional check for validity within a time range really gives you any additional security.

Without the timestamp, yes, the signature will always be the same, and so the request could be "replayed" as long as the appkey is valid.

With the timestamp, the signature does change; however, if the service isn't checking the timestamp, the request is still valid indefinitely.

The timestamp only serves as a level of obfuscation but I don't think it really adds any additional security.

George,
If caching doesn't work for you it is because your services are stateful, right?

You mention "alphabetical order". It appears the default for WCF RESTful services is to require the XML nodes to be sorted alphabetically, or is this only for the authentication nodes? Is this true, and is this typically a specification?

lol wow, look at all these comments!
I came here after googling for good REST API practices.

And what do I find? A series of good practices, and a bunch of fools who prove developers STILL don't understand the RESTful API concept.

All the complaints prove they don't understand the problem; hell, they don't even understand the difference between a REST API and a regular REST request (like visiting a web page).
My favorite complaint is the one saying "No proxies! No caching! No bookmarking! Bad idea!"
That had me laughing for a while. Hey fellows! Here's a nice API link! No signing, no timestamps! http://openlibrary.org/api/books?bibkeys=ISBN:0201558025,LCCN:93005405
Bookmark it! Hell, make it your homepage!

the author says, and I agree, the timestamp should not be used always. I personally use it when it implies changing something (which should be done with POST/PUT/DELETE and not GET anyway)

For reading-only resources, only signing is sufficient, and using GET is fine too.

if the resource is completely public and read-only, well, then no authentication is required, no signing, no timestamp and GET is the way to go.

But is it really necessary to explain it? Isn't it obvious?

You don't know me but I really hope people say "good work" to you sometimes and understand the amount of hard work and time that goes into creating a good website like yours - Thanks!!
