First, some important stuff: I am not a security expert. Please do not think that after reading this article you will know everything you need to know before implementing Single Sign-On (SSO) using SAML. It is meant to be an introduction, so that you have an easier job understanding the details when you get them from somewhere else. In case you don’t know, SAML is short for Security Assertion Markup Language.
If you implement SSO incorrectly you will open a large security hole in your system. For that reason, even when you understand it properly, I strongly suggest that you use an existing third-party library to handle the SAML details. As with any security protocol, the details are crucial, so getting them wrong is bad news.
Also, there are other ways to do SSO than using SAML, such as OpenID. I won’t go into those at all here.
With all that doom and gloom out of the way, I will try to introduce SSO with SAML enough that other things you read about it make more sense.
What do we want?
At its simplest, what we want is to extend what an existing system does by getting something from an external system. The two systems are independent and can communicate over the internet. This doesn’t sound so hard; in fact, it seems a lot like a simple GET request. The complication is trust, or the lack of it.
You have logged in to the private part of your existing system. This means the system can trust you with access to secure content. If the stuff you wanted from the external system were public, then a simple GET request would probably work. However, you need access to private content on the external system.
How does the area of trust established by logging in to your existing system get extended to the external system, across the public internet?
- As part of logging into your own system, your credentials are checked against a list of users that your system trusts.
- After logging in, you’re allowed to access secure content e.g. a personal home page.
- You then want to access some private content on the external system, probably specific to you, but without logging in again to the external system. Between the two systems is the public internet, where it’s sensible to assume communications are open to everyone.
There is a small number of tools that will be used:
- Information exchanged ahead of time between the two systems
- Security certificates
These will be used to enable messages to be exchanged between the two systems when a user of one system wants to access secure content on the other system.
Here is an analogy that might help you to understand how SAML works:
- Someone phones up claiming to be person P from company X. You’re not sure about this.
- You immediately end the call and look up a phone number for X that you previously received in a letter.
- When the phone is answered, you and the receptionist do a security challenge and response that you set up earlier, also via letter.
- Once you’ve established trust with the receptionist, you ask them to put you through to person P.
The flow of messages etc. isn’t identical to SAML, but similar. There is information exchanged ahead of time, that is used during the conversation. There are ways in which two parties prove their identity to each other, based on that information. With identities mutually established, one party then trusts the other to vouch for a third (you rely on the receptionist to put you through to the right person).
Cast of characters
There are three parts in the play we are about to describe:
- User / User Agent: The user is you, and the User Agent is your web browser. Depending on how picky you’re being, these are the same things or different things.
- Identity Provider (IdP): The system that already knows and trusts you, that you can sign on to.
- Service Provider (SP): The external system that you want to trust you, without your having to sign on again.
Information exchanged ahead of time
You can’t just do SSO out of the blue; preparation must be made for it to work.
The IdP must be told these by the SP:
- The name that the SP will give itself in messages that it sends (this is called the EntityId, which is usually in the form of a URL)
- The URL that will receive the assertion from the IdP about the user (the assertion is the A in SAML, and tells the SP that it can trust this user)
The SP must be told these by the IdP:
- The URL to which to send requests
- A public security certificate, so that the SP can verify that messages that appear to come from the IdP actually have come from the IdP.
Depending on how you choose to do things, the IdP will also probably need to know another URL which starts the whole interaction going. This isn’t counted as part of SAML, as it’s just a normal GET request.
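To make the exchange concrete, here is a minimal sketch of the two sets of details as plain Python data. The class names, field names and example URLs are all my own assumptions for illustration; real systems usually exchange this information as SAML metadata files handled by a library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpDetails:
    """What the SP tells the IdP ahead of time (hypothetical field names)."""
    entity_id: str      # the name the SP gives itself in its messages
    assertion_url: str  # where the IdP should send the assertion


@dataclass(frozen=True)
class IdpDetails:
    """What the IdP tells the SP ahead of time (hypothetical field names)."""
    request_url: str      # where the SP should send its requests
    certificate_pem: str  # public certificate used to verify IdP messages


# Made-up values, for illustration only.
sp = SpDetails(
    entity_id="https://sp.example.com/saml/metadata",
    assertion_url="https://sp.example.com/saml/acs",
)
idp = IdpDetails(
    request_url="https://idp.example.com/saml/sso",
    certificate_pem="-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----",
)
```

Each side stores the other side's details, so the IdP would hold `sp` and the SP would hold `idp`.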
Flow of messages
There is more than one way to do it. The two main variants are called IdP-initiated and SP-initiated. There is also variation in how much user data moves around, and how many tenants the SP has. Multi-tenancy is covered in another section below. If you’re interested to see what an authorisation response (step 4 below) actually looks like, there’s an article that marks up an authorisation response nicely.
IdP- vs SP-initiated
SP-initiated is a correct name that I nonetheless found a bit confusing. If the process starts with the User Agent making a normal GET request to the SP (step 1 in the diagram below), this is then followed by the SP issuing the first proper SAML message (step 2a) and then the rest of the choreography continues after that.
Because the first request isn’t part of the SAML protocol (it has no carefully constructed XML payload, for instance), it doesn’t count. This kind of interaction is called SP-initiated because the first SAML message comes from the SP.
Note that the initial GET request from the User Agent (step 1) will contain nothing that identifies the user. The user’s identity is added in to the process later (at step 3). Therefore, two requests from different User Agents about two separate users will look identical.
If, instead of a normal GET request starting things off, the IdP just sends a SAML message out of the blue (step 4) to the SP that says it should trust a user, then this is the IdP-initiated version.
1. The SP-initiated version starts with the User Agent (web browser) asking the SP for a protected resource.
2. The SP sends an authorisation request (in SAML terms an AuthnRequest, short for authentication request) to the User Agent, which relays it to the IdP. This is an XML document, as specified by the SAML protocol.
3. (The IdP-initiated version starts here.) The IdP makes sure that it trusts the User Agent, in a way not specified by the SAML protocol. For instance, it asks the user to log in, and then sets up a session. If the user is already logged in and the session is set up, this step is effectively skipped.
4. If the IdP thinks that the user is allowed to use the service provided by the SP, it sends an authorisation response to the User Agent, which relays it to the SP. This is another XML document, as specified by the SAML protocol. Unlike the initial request, this does contain user data, e.g. an email address or user id.
5. The SP sends a security context to the User Agent, e.g. sets up its own session with the User Agent (so there are two sessions running in parallel that include the User Agent: one with the IdP and one with the SP).
6. The User Agent requests the protected resource.
7. The SP responds with the protected resource.
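As a rough illustration of the SP's first SAML message in the steps above, the sketch below builds a minimal, unsigned request and packs it into a redirect URL. It assumes the HTTP-Redirect binding (raw DEFLATE, then Base64, then URL-encoding); the function names and example URLs are mine, and a real SP should leave this to a library.

```python
import base64
import uuid
import zlib
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlsplit
from xml.sax.saxutils import escape, quoteattr


def build_authn_request(sp_entity_id, acs_url, idp_url):
    """Return (request_id, request_xml) for a minimal, unsigned AuthnRequest."""
    request_id = "_" + uuid.uuid4().hex  # XML ids must not start with a digit
    issue_instant = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    request_xml = (
        '<samlp:AuthnRequest'
        ' xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"'
        ' xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"'
        f' ID="{request_id}" Version="2.0" IssueInstant="{issue_instant}"'
        f' Destination={quoteattr(idp_url)}'
        f' AssertionConsumerServiceURL={quoteattr(acs_url)}>'
        f'<saml:Issuer>{escape(sp_entity_id)}</saml:Issuer>'
        '</samlp:AuthnRequest>'
    )
    return request_id, request_xml


def encode_for_redirect(idp_url, request_xml):
    """Pack the XML into a URL the User Agent can be redirected to."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE
    deflated = compressor.compress(request_xml.encode("utf-8")) + compressor.flush()
    encoded = base64.b64encode(deflated).decode("ascii")
    return idp_url + "?" + urlencode({"SAMLRequest": encoded})


def decode_from_redirect(url):
    """What the IdP does on arrival: undo the URL, Base64 and DEFLATE layers."""
    encoded = parse_qs(urlsplit(url).query)["SAMLRequest"][0]
    return zlib.decompress(base64.b64decode(encoded), -15).decode("utf-8")


request_id, request_xml = build_authn_request(
    "https://sp.example.com/saml/metadata",  # hypothetical EntityId
    "https://sp.example.com/saml/acs",       # hypothetical assertion URL
    "https://idp.example.com/saml/sso",      # hypothetical IdP request URL
)
url = encode_for_redirect("https://idp.example.com/saml/sso", request_xml)
```

Notice that nothing in the request identifies the user. The SP would also need to remember `request_id`, so that it can match it against the response later.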
If you look at just the green arrows and ignore the rest (the green arrows are all that the user would be aware of if they were already logged in with the IdP) then it looks like the user requests the protected resource and then receives it. The blue and black arrows in between happen in the background to make this happen.
Another variant is how much user data already exists in the SP vs. how much data is transferred from the IdP to the SP via SAML messages. If the user is already fully set up in the SP, then the SAML message from the IdP only needs to identify the user. If the user is only partially set up in the SP, or doesn’t exist at all, then the message from the IdP needs to identify the user and supply extra information about them. This is all possible in the SAML protocol.
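A small sketch of that choice, assuming the SP keeps its users in a dictionary keyed by the identifier that arrives in the SAML message. The function name and the attribute names are hypothetical.

```python
def find_or_create_user(users, name_id, attributes):
    """Return the SP's record for this user, creating it from the extra
    information in the SAML message if the SP has never seen them."""
    user = users.get(name_id)
    if user is None:
        # The user isn't set up in the SP yet, so build a record from
        # whatever the IdP supplied. The id always comes from name_id.
        user = {**attributes, "id": name_id}
        users[name_id] = user
    return user


users = {"alice@example.com": {"id": "alice@example.com", "display_name": "Alice"}}

# Already fully set up: the message only needs to identify the user.
alice = find_or_create_user(users, "alice@example.com", {})

# Not set up at all: the message must also carry the extra details.
bob = find_or_create_user(users, "bob@example.com", {"display_name": "Bob"})
```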
Like I said in the introduction, I’m not a security expert, so please don’t rely on this as security advice. However, you might be curious as to how this is secure, beyond the normal HTTPS, so I shall try to explain some of the ways it’s secure. I suggest that you also refer to a good article on security-testing SAML as an alternative way of looking at this.
If you look at a SAML response, e.g. in Chrome DevTools, it will look like gobbledygook. Don’t be confused: this is just Base64 encoding, and there is no encryption going on. The good news is that with a SAML Chrome plug-in you can view the contents nicely. The bad news is that anyone can do this, so the encoding by itself doesn’t make it secure. So, how is it secured?
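To show how little the encoding hides, here is a short sketch that decodes a Base64 value back into readable XML. The sample document is a made-up stub, not a real response, and the function name is my own.

```python
import base64


def decode_saml_response(form_value):
    """Turn the Base64 text seen in the browser back into XML."""
    return base64.b64decode(form_value).decode("utf-8")


# A made-up, heavily shortened stand-in for a real response.
sample_xml = (
    '<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"'
    ' ID="_abc123" Version="2.0"/>'
)
encoded = base64.b64encode(sample_xml.encode("utf-8")).decode("ascii")

print(encoded[:24] + "...")           # looks like gobbledygook
print(decode_saml_response(encoded))  # but anyone can read it
```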
Each SAML response has a unique id, so the SP must check that it hasn’t seen that id before. This stops a captured response from being replayed.
If you are doing SP-initiated SAML, the response will refer to the id of the request it’s replying to – these must be checked for validity.
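These two id checks could be sketched like this. The class and method names are hypothetical, and the in-memory sets are an assumption made to keep the example short; a real SP would need storage that is shared between servers and that forgets old ids once they have expired.

```python
class ResponseIdChecker:
    """Tracks request ids the SP has sent and response ids it has seen."""

    def __init__(self):
        self._pending_request_ids = set()
        self._seen_response_ids = set()

    def remember_request(self, request_id):
        """Call this when the SP sends its request to the IdP."""
        self._pending_request_ids.add(request_id)

    def accept(self, response_id, in_response_to=None):
        """Return True only if the ids on a response are acceptable."""
        if response_id in self._seen_response_ids:
            return False  # replay: this response has been used already
        if in_response_to is not None:
            if in_response_to not in self._pending_request_ids:
                return False  # replies to a request we never sent
            self._pending_request_ids.discard(in_response_to)
        self._seen_response_ids.add(response_id)
        return True


checker = ResponseIdChecker()
checker.remember_request("_req1")
first = checker.accept("_resp1", in_response_to="_req1")    # True
replay = checker.accept("_resp1", in_response_to="_req1")   # False
unknown = checker.accept("_resp2", in_response_to="_nope")  # False
```

Passing no `in_response_to` models the IdP-initiated case, where there is no request to refer back to. An SP that only supports SP-initiated SAML would want to reject those responses instead.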
The IdP can (and should) specify a date and time after which the message is invalid (the NotOnOrAfter attribute), and optionally another date and time before which the message is invalid (NotBefore). These should be checked, i.e. is the SP receiving the message after it has stopped being valid?
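A minimal sketch of that check, assuming the timestamps arrive as UTC strings without fractional seconds. The function names are mine, and the 60-second allowance for clock differences between the two systems is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_saml_time(value):
    """Parse a timestamp such as 2024-03-01T12:00:00Z as an aware UTC time."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_within_validity(now, not_before, not_on_or_after,
                       allowed_skew=timedelta(seconds=60)):
    """True if `now` falls inside the window, give or take the allowed skew.

    `not_before` may be None, because the earlier limit is optional.
    """
    if not_before is not None and now + allowed_skew < not_before:
        return False  # the message is not valid yet
    return now - allowed_skew < not_on_or_after  # False once it has expired


not_before = parse_saml_time("2024-03-01T12:00:00Z")
not_on_or_after = parse_saml_time("2024-03-01T12:05:00Z")

in_time = is_within_validity(
    parse_saml_time("2024-03-01T12:02:00Z"), not_before, not_on_or_after)
too_late = is_within_validity(
    parse_saml_time("2024-03-01T12:10:00Z"), not_before, not_on_or_after)
```

In real use `now` would be `datetime.now(timezone.utc)`; passing it in as a parameter just makes the function easy to test.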
The main part of the security is the signature, e.g. an RSA signature over a SHA-256 digest. The signature is produced from a private key (held by the IdP) and the body of the message, and is carried inside the message in its own Signature element. The SP has the public key that is the partner of the private key, which lets it check that the message body and signature match. This proves that the message originally came from the IdP and hasn’t been tampered with in transit.
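To give a feel for the private key / public key split, here is a deliberately toy sketch using textbook RSA over a SHA-256 digest. Everything about it is an assumption for illustration: the key is built from two well-known Mersenne primes, there is no padding, and it ignores the XML canonicalisation that real SAML signatures depend on. It must never be used for actual security; a real SP should rely on a maintained SAML library.

```python
import hashlib

# Toy key pair. Far too small and too predictable for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                          # modulus, part of the public key
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the IdP


def digest_as_int(body):
    """SHA-256 digest of the message body, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(body).digest(), "big") % N


def sign(body):
    """IdP side: needs the private exponent D."""
    return pow(digest_as_int(body), D, N)


def verify(body, signature):
    """SP side: needs only the public values E and N."""
    return pow(signature, E, N) == digest_as_int(body)


body = b"<Assertion>user@example.com</Assertion>"
signature = sign(body)

genuine = verify(body, signature)
tampered = verify(b"<Assertion>admin@example.com</Assertion>", signature)
```

Here `genuine` comes out true and `tampered` false: changing the body changes the digest, so the signature no longer matches, and only the holder of `D` could have produced a matching one.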
Time zones
I’m not going to go full Tom Scott on time zones (a video that I very much recommend you watch), but time zones are relevant here. You need the IdP and the SP to agree on time zones, e.g. both be on UTC. If not, then maybe twice a year SSO with SAML will stop working. The validity period for the authorisation response will jump an hour into the past or future, i.e. it won’t be valid now. For instance, the UK and the US don’t switch to and from daylight saving time on the same dates, so one will change before the other.
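Here is a small sketch of how that hour can appear, using a date in March 2024 when the US had already moved to daylight saving time and the UK had not. The fixed offsets and the "usual gap" shortcut are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the two systems' local clocks on 15 March 2024.
idp_zone = timezone(timedelta(hours=-4))  # e.g. US Eastern, already on DST
sp_zone = timezone(timedelta(hours=0))    # e.g. UK, still on GMT

expiry_at_idp = datetime(2024, 3, 15, 8, 5, tzinfo=idp_zone)  # 12:05 UTC
now_at_sp = datetime(2024, 3, 15, 12, 0, tzinfo=sp_zone)      # 12:00 UTC

# Safe: both values know their offset, so Python compares the real instants.
still_valid = now_at_sp < expiry_at_idp

# Fragile: the SP throws away the offset and adds the "usual" five-hour gap.
usual_gap = timedelta(hours=5)
guessed_expiry = expiry_at_idp.replace(tzinfo=None) + usual_gap

# What the expiry really is in UTC, and how far the guess has drifted.
real_expiry = expiry_at_idp.astimezone(timezone.utc).replace(tzinfo=None)
error = guessed_expiry - real_expiry
```

`error` comes out as exactly one hour, which is the jump described above. Keeping every timestamp in UTC, with the offset attached, avoids the problem.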
Multi-tenancy
So far, this description has assumed that there is just one pool of users, that can all be vouched for by one IdP. However, there might be two or more separate pools of users. This would happen if the SP had a relationship with many companies at once, i.e. these companies were all tenants of the same SP.
Given that there is nothing in the initial request in an SP-initiated version of SAML that can be relied on to supply information about the user or their tenant, you’ll have to think a bit about how to design things. The problem is that the SP’s response (the authorisation request) must go to an IdP address specific to the relevant tenant, so you need to know which tenant this request is for.
One way to distinguish the different IdPs is to give each tenant their own sub-domain, e.g.
This assumes that sub-domain is available, and isn’t already used for some other purpose.
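A sketch of the sub-domain approach, assuming the SP can see the host name the User Agent asked for. The tenant names, the base domain and the IdP URLs are all invented for the example.

```python
# Hypothetical tenants and their IdP addresses.
IDP_URLS_BY_TENANT = {
    "acme": "https://login.acme.example/saml/sso",
    "globex": "https://sso.globex.example/saml/sso",
}
BASE_DOMAIN = "sp.example.com"  # hypothetical domain of the SP


def idp_url_for_host(host):
    """Work out which tenant's IdP should receive the request, or None."""
    host = host.split(":")[0].lower()  # drop any port number
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None  # no tenant sub-domain, or a different domain entirely
    tenant = host[:-len(suffix)]
    return IDP_URLS_BY_TENANT.get(tenant)


acme_idp = idp_url_for_host("acme.sp.example.com")
no_tenant = idp_url_for_host("sp.example.com")
stranger = idp_url_for_host("unknown.sp.example.com")
```

Only `acme_idp` gets a URL back; the other two are `None`, so the SP would have to show an error or ask the user which organisation they belong to.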
I hope that this gives you an introduction to SSO with SAML. Please remember that this is just an introduction; if you want to implement SSO you now need to get the details from somewhere else. As I mentioned earlier, try to implement as little as possible yourself, by using existing security tools as much as possible.
1st March: Added section on time zones, thanks to a helpful comment from Jo.