This is fantastic. I've implemented OAuth2 4-5 times over the past few years. It seems overly complicated at first, but over time I've come to understand more of the vulnerabilities each piece of complexity exists to mitigate. Every time I've thought I could do it simpler, I eventually discovered a vulnerability in my approach. You get basically all of that knowledge here in a concise blog post. This is going to be the first thing I link anyone to if they're interested in learning OAuth.
One gripe from Attack #3[0]:
> The solution is to require Pied Piper to register all possible redirect URIs first. Then, Hooli should refuse to redirect to any other domain.
There's actually another (better, IMO) solution to this problem, though to date it's rarely used. Instead of requiring client registration, you can simply let clients provide their own client_id, with the caveat that it has to be a URI and a strict prefix of the redirect_uri. Then you can show it to the user so they know exactly where the token is being sent, but you also cut out a huge chunk of complexity from your system. I learned about it here[1].
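A minimal sketch of that check, assuming a Node-based authorization server (function and parameter names are hypothetical): treat the client_id as a URL and only accept a redirect_uri that it strictly prefixes.

```javascript
// Sketch of the registration-free scheme described above: the client_id is
// itself a URI and must be a strict prefix of the redirect_uri.
// All names here are illustrative, not from any particular server.
function isAllowedRedirect(clientId, redirectUri) {
  let idUrl;
  try {
    idUrl = new URL(clientId); // client_id must parse as a URL
  } catch {
    return false;
  }
  if (idUrl.protocol !== 'https:') return false; // require https
  // strict prefix: redirect_uri must extend the client_id
  return redirectUri.startsWith(clientId) && redirectUri.length > clientId.length;
}
```

The authorization server can then display `clientId` verbatim on the consent screen, since the prefix rule guarantees the token can only be delivered somewhere under that URI.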
> Then you can show it to the user so they know exactly where the token is being sent
Unfortunately, most major websites end up hosting an endpoint that will redirect users to a separate URL provided as a query parameter. This means that users may easily be misled about where the token is, in fact, being sent.
Is this true? Do you know of any major ones offhand? That would be surprising to me.
Thanks for sharing [0]. I found its discussion of shared subdomain cookies useful. However, I believe all the vulnerabilities in the OAuth section would be mitigated by using PKCE and not using the implicit flow, even if you leave the open redirect. Am I missing anything there?
As for open redirects in general, it is an important problem. As an authorization server, if you want to protect against clients that might have an open redirect (and as you indicate eventually one will), while still using the simple scheme I mentioned above, I can think of a few options:
1. Require the client_id to exactly match the redirect_uri instead of just a prefix. This is probably the most secure, but can result in ugly client IDs shown to the user, like "example.com/oauth2/callback". Of course clients can control that and make it something prettier if they want.
2. Strip any query params from the redirect_uri, and document this behavior. That should handle most cases, but it's always possible clients implement an open redirect in the path itself somehow. You could also check for strings like "http", but at some point there's only so much you can do.
3. Require clients to implement client metadata[1], so you can get back to exact string matches for redirect_uri. This is a very new standard, and also doesn't work for localhost clients.
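Options 1 and 2 above are small enough to sketch (hypothetical helper names, Node):

```javascript
// Option 1: require the client_id to exactly match the redirect_uri.
function exactMatch(clientId, redirectUri) {
  return clientId === redirectUri;
}

// Option 2: drop the query string and fragment from the redirect_uri before
// redirecting, so an open redirect driven by a query parameter
// (e.g. ?next=https://evil.com) loses its payload.
function stripQueryAndFragment(redirectUri) {
  const u = new URL(redirectUri);
  u.search = '';
  u.hash = '';
  return u.toString();
}
```

As noted above, option 2 still leaves room for open redirects encoded in the path itself, so it's a mitigation rather than a complete fix.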
Does that count as an open redirect? It gives a big fat declaration where you're coming from and where you're going to, and requires the user to choose.
I agree some nonzero number of users would click to continue when they shouldn't while doing an OAuth flow. Thanks for the example.
Though if you do this, clients need a distinct client_id per redirect URI domain. Mostly that's inconsequential, and even a good thing, but it has some UX implications, like users seeing multiple consent screens instead of just one when multiple redirect URIs are registered against a single client_id.
Just web complexity really, there are lots of reasons you might have multiple domains as a business.
Language-specific domains, preview domains, multiple properties (often not fully integrated) etc.
You can always paper over things with more redirects / params (so a landing that bounces on) or multiple clients but supporting multiple redirects can save the customer work in some scenarios.
You don't have to violate anything in RFC6749, so technically yes. As I said it's not widely adopted, but it's also not difficult to implement. Generally OAuth2 client libraries require you to enter the client_id which you get during registration. Instead just tell them to use the URI of their app. You can see an example of such instructions for my LastLogin project here: https://lastlogin.io/developers/. It's just a short paragraph.
I thought it was somewhat funny that all of the points highlighted in the "Attack #1: Big Head's credentials are exposed" section are exactly how Plaid works.
Yep, Plaid is put in a rough position of having to make something work across a bunch of financial institutions that can't or won't enable applications to have secure access. Plaid's existence and popularity is the canary that banks are way behind on features that users want.
"Most" is a strong phrase. Some financial institutions are taking it the opposite direction, like Fidelity which only allows a completely different partner now (Akoya).
Mint was the same way back in the day. I think Chase still doesn't support hardware tokens or authenticator apps. It's insane that banks are somehow the furthest behind in many security best practices.
I hadn't really dived deep into the OAuth flow since 2015 or so, until I started working on a little side project where I chose AWS Cognito, and the whole PKCE part of the flow was new to me. This explains it well.
Probably the best OAuth tutorial on the internet. Also it's amazing marketing, it just gets scarier every paragraph. By the end you're 100% put off implementing OAuth yourself haha
Man I wish I would have run across this when you first posted the blog. I just spent the last week learning all of this in pieces and by trial and error! Great writeup!
Some of the pictures stick out beyond the right side of the screen, and scrolling felt slightly sluggish(?), but aside from that I found the page to look pretty ok in Safari on iPhone.
It's really not just a marketing gimmick — it's the sole reason why we are building Stack Auth, instead of going for one of the existing proprietary managed auth solutions. Trust me, we wouldn't spend time building this if we didn't believe in the one thing that makes us different.
I'll quote myself from elsewhere:
> Both Zai and I care a lot about FOSS — we also believe that open-source business models work, and that most proprietary devtools will slowly but surely be replaced by open-source alternatives. Our monetization strategy is very similar to Supabase — build in the open, and then charge for hosting and support. Also, we reject any investors that don't commit to the same beliefs.
Fortunately, nowadays a lot of VCs understand this (including YC, who has a 10+ year history of investing in FOSS companies); we make sure that the others stay far away from our cap table.
They will eventually want you to have an exit. And there lies the problem. You should have gone for something like futo if you wanted the money. Right now there is an ideological dissonance in what you're doing. Same with supabase et al.
The only reason the real open source is what the world is built on, is because there is a guarantee that they will never need an exit and the community will always exist.
Nice introduction. One thing I missed, though, is the introduction of the client secret in attack #6, which actually solves the problem if Pied Piper is exchanging the code for a token from its own server. PKCE is only strictly necessary if you cannot ensure that the client secret stays secret, which could be the case if it's stored in a native app on, for instance, a smartphone.
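The server-side exchange described here looks roughly like this (all identifiers are illustrative placeholders); the key point is that the client_secret only ever appears in a request built on Pied Piper's backend:

```javascript
// Builds the form body for the code-for-token exchange performed on the
// backend, which is then POSTed to the authorization server's token
// endpoint. Names and values are placeholders, not a real configuration.
function buildTokenRequestBody({ code, clientId, clientSecret, redirectUri }) {
  return new URLSearchParams({
    grant_type: 'authorization_code',
    code,
    redirect_uri: redirectUri,
    client_id: clientId,
    client_secret: clientSecret, // never shipped to the browser
  }).toString();
}
```

A native app can't hide `clientSecret` this way, since anything bundled with the app can be extracted, which is exactly why PKCE exists for that case.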
Sounds like a bad implementation. There's nothing inherent to OAuth2 that makes it slow (though the redirects do create a latency floor). If you want a good experience, try logging in to my website https://takingnames.io/. Once you have an identity on LastLogin, it's lightning fast.
1. Tens of kilobytes of JS that is executed exactly once, so is not amenable to JIT optimisation.
2. A strictly sequential series of operations with zero parallelism.
3. Separate flows for each access token, so apps with multiple APIs will have multiple such sequential flows. Thanks to JS being single-threaded, these will almost certainly run in sequence instead of in parallel.
4. Lazy IdPs that have their core infrastructure in only the US region, so international users eat 300ms per round trip.
5. More round-trips than necessary. Microsoft Entra especially uses both HTTP 1.1 and HTTP/2 instead of HTTP/3, TLS 1.2 at best, and uses about half a dozen distinct DNS domains for one login flow. E.g.: "aadcdn.msftauth.net", "login.live.com", "aadcdn.msftauthimages.net", "login.microsoftonline.com", and the web app URLs you're actually trying to access and then the separate API URLs because only SPA apps exist these days.
6. Heaven help you if you have some sort of enterprise system that the IdP needs to delegate to, such as your own internal MFA system, some Oracle Identity product, or whatever.
I've seen multi-minute login flows that literally cannot run faster than that, no matter what.
This is industry-wide. I stopped using chatgpt.com because it makes me re-authenticate daily (why!?) and it's soooooooo slow. AWS notoriously has its authentication infrastructure only in the US. Microsoft supports regional-local auth servers, but only one region, and the default used to be the US and can't be changed once set. Etc, etc...
(a list of things that are specifically bad implementations)
In my demos the OAuth flow completes so fast you can't even tell it happened, you don't even see the address bar change to the IdP the second time you do a flow when you already have a session there.
Are you in close physical proximity to your servers? Do you access your own application multiple times per day? Then you're testing an atypical scenario of unusually low network latency and pre-cached resources.
At scale, you can't put everything into one domain because of performance bottlenecks and deployment considerations. All of the big providers -- the ones actually used by the majority of users -- do this kind of thing.
This argument of "you're holding it wrong" doesn't convince me when practically every day I interact with Fortune 500 orgs and have to wait tens of seconds to a minute or more for the browser to stop bouncing around between multiple data centres scattered around the globe.
Big providers have more resources than anyone when it comes to having their servers close to users and optimizing performance. They can afford things like AnyCast networks and custom DNS servers for things like Geo routing. Just because they don't doesn't mean they can't.
> you can't put everything into one domain because of performance bottlenecks
If you look at my original comment in this thread, I mentioned that to log in to something like Microsoft 365 via Azure Entra ID, the browser has to connect to a bunch of distinct DNS domains. About half of these are CDNs serving the JavaScript, images, etc... For example, customers can upload their own corporate logos and wallpapers and that has to be served up.
Just about every aspect of a CDN is very different to an IdP server. A CDN is large volumes of static content, not-security-critical, slowly changing, etc... Conversely the API is security-critical, can't be securely served "from the edge", needs rapid software changes when vulnerabilities are found, etc...
So providers split them such that the bulk of the traffic goes to a CDN-only domain distributed out to cache boxes in third-party telco sites and the OAuth protocol goes to an application server hosted in a small number of secure data centres.
To the end user this means that now the browser needs at least two HTTPS connections, with DNS lookups (including CDN CNAME chasing!), TCP 3-way handshake, HTTPS protocol negotiation, etc...
This also can't be efficiently done as some sort of pre-flight thing in the browser either because it's all served from different domains and is IdP-controlled. If I click on some "myapp.com" and it redirects to "login.idp.com" then it's that page that tells the browser to go to "cdn.idp.com" to retrieve the JavaScript or whatever that's needed to process the login.
It's all sequential steps, each one of which bounces around the planet looking up DNS or whatnot.
"It's fast for me!" says the developer sitting in the same city as both their servers and the IdP, connected on gigabit fibre.
Try this flow from Australia and see how fast it is.
The only thing I'm really tempted to defend here is the multi-domain thing, because I'm not aware of another way to set cookies for multiple domains in a single flow, but maybe consolidate your services under a single domain like google does? Minus youtube.com of course, which is fair.
I'm still not entirely clear on if I should abandon implicit flow for a static site to get credentials. Easy enough to switch it to the CODE flow, but that gets you a refresh token with AWS and that feels a bit more risky to have on the client side.
There are a couple of ways to keep tokens on the client that prevent malicious code from accessing them.
- use HttpOnly secure cookies. These will not be accessible to malicious JavaScript, but can only be sent to servers on the same domain. Well, I guess if there were an exploit that let JS break the sandbox and access cookies, they could be accessible, but I think we can trust the browser vendors on this kind of security. This approach is widely supported, but does expose the token to physical exfiltration (that is, if someone at the browser opens up devtools, they can see the cookie with the token in it).
- store the tokens in memory. As far as I know, malicious JS code can't rummage around in memory. This works for SPAs, but does break if the user refreshes the page.
- bind the token to the client cryptographically using DPoP. This is a newish standard and isn't as widely accepted, but means you can store the token anywhere, since there's a signing operation tied to the browser.
All of these can work and have different tradeoffs.
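For the first option, the relevant Set-Cookie attributes are small enough to show (a sketch; the cookie name is arbitrary and the server framework is up to you):

```javascript
// Builds a Set-Cookie header value that keeps the token out of reach of
// page JavaScript: HttpOnly blocks document.cookie access, Secure restricts
// the cookie to HTTPS, and SameSite=Strict stops cross-site sends.
function buildTokenCookie(accessToken) {
  return [
    `access_token=${encodeURIComponent(accessToken)}`,
    'HttpOnly',
    'Secure',
    'SameSite=Strict',
    'Path=/',
  ].join('; ');
}
```

The obvious constraint, raised below, is that this requires a server on your own domain to emit the header in the first place.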
I can't use HttpOnly cookies, because I have no server to set them. I'm using JavaScript on the client to do the POST to get the tokens. Storing in memory is mostly fine, though I do want them in a cookie. May be able to skip that.
I will look into the dpop thing. A bit limited as I'm using AWS cognito.
I confess my threat model is such that I think I am fine with storing the token in localStorage. Would like to be using standard practice, though.
I'd be a liar if I said I had fully modeled things. I apologize for making it sound like I hadn't given it any thought.
I was hoping I had missed some updated guidance on how to manage tokens. Way too much of the documentation I was finding on OpenAPI and OAuth seemed to be aspirational and referencing things that hadn't come to be, yet. It has gotten rather frustrating.
Thanks for the link, it is a surprisingly fun topic to read on.
That's an interesting question, but what exactly is the threat model here? A rogue extension somehow reading the token? Is it stored anywhere? AFAIK generally the concern with access tokens is that they get gleaned from logs or MITM, not pulled out of the browser's memory.
I'm not sure we're talking about the same thing. Basically I'm saying that generally the URL bar and logs are considered more vulnerable than variables in memory.
However, I think I would trust a value that's been transmitted from memory more than one that's been stored in localStorage but not transmitted, because the latter is trivial to grab with an extension.
I'm not aware of any way for an extension to grab a variable from memory unless it knows how to access the variable itself from JS. This makes me wonder if there could be a security practice where you purposefully only store sensitive data like OAuth2 tokens in IIFEs in order to prevent extensions from accessing them. There's got to be some prior art on this.
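A sketch of that idea: keep the token in a closure and only expose functions that use it, never the value itself. (One caveat: an extension that can inject a page script before this runs could still wrap or replace the store, so this raises the bar rather than closing the door.)

```javascript
// Closure-scoped token store: the token is never a global, a DOM value, or
// a localStorage entry, so there is no named handle for other page code to
// read. Only the derived Authorization header is ever exposed.
const tokenStore = (() => {
  let token = null;
  return {
    set(newToken) { token = newToken; },
    clear() { token = null; },
    authHeader() { return token ? `Bearer ${token}` : null; },
  };
})();
```

Usage would be something like `fetch(url, { headers: { Authorization: tokenStore.authHeader() } })`, so the raw token string never leaves the closure.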
Anyway, thanks for bringing up the question. It's been a useful thought experiment.
Makes sense and I agree there. The code flow has the advantage of not being in the url. I'm still unclear if I have a best practice on how to store the refresh token on a static website client. Feels off using local storage.
If you're in a situation where you would otherwise be using the implicit flow, can't you just throw the refresh token away? That should approximate the implicit flow.
For AWS Cognito's Token endpoint? Where is that documented? I'm also not seeing that behavior, as I don't have that in my scope and I do get a refresh token if I do the CODE flow.
I'm referring to the openid standard I guess[1], it looks like AWS Cognito does something different (they don't support offline_access, but they always issue a refresh_token from what I'm reading).
Right, but that seems preferable if I'm doing this all client side? Yes, an access token could get leaked in the URL. But sending the refresh token to the client feels far more dangerous.
I think that is the only way to do it if you use any OpenID Connect implementations, which is often my criticism of it. Modern browsers can have cookies using SameSite=Strict and HttpOnly, which I would consider better than storing refresh tokens in localStorage, or exposing them to JavaScript at all. But OpenID Connect predates those browser features, so everyone seems okay with it.
My scenario still makes it impossible to use those, sadly. Specifically, I do not have any servers involved. Completely static javascript served up through s3, basically.
If my threat model gets to where I care about this, I suppose my only real options is doing the redirect to a compute backed address. For now, glad I don't have to worry about it. :D
I had the same setup and was disappointed with OAuth/OpenID Connect, same as you, as there is NO way to make that work with modern flows. Your only option is to use the implicit flow and store the refresh/access token in localStorage.
The "modern" code flow involves a piece of backend code that performs the exchange of the code for tokens - and you are actually not supposed to make those token available to the (insecure) browser.
Being fair, the code flow can be done on the client. And it does protect from most of the URL sniffing attacks that are there for the implicit flow. It just feels weird to still be doing all of that from the client.
I'd be genuinely interested how you can do the code flow on the client (browser). So far, all OAuth providers I tested will not let you do that due to CORS issues.
I have just implemented this after we moved away from SuperTokens. My takeaway is that it's easier than you'd think (there are libraries that handle the interaction with the SSO provider for you), and you can fine-tune it to your liking (for example, more involved account linking).
If you're starting out though, probably go for a SaaS in the beginning. But be sure to have monitoring for pricing and an option to close account creation, these things can become expensive fast.
Many. We used the Node.js version of it, which has pretty poor error handling. When it breaks, it breaks hard (runtime errors with no message or stack trace).
Security. You cannot deactivate certain unsafe mechanisms. For example, if you send it an ID token, it will not verify the aud claim, allowing any valid token from the same SSO provider.
API stability. We're consuming their API from a mobile app, but every major version (about five a year) changed the REST API without backward compatibility or versioning. It's fine if you use their lib and keep parity, but that's really only possible on the web.
All of this was with their self hosted offering, I haven't tried their hosted one.
My opinion, as someone who works for a company with both a free and paid auth software option: it depends.
If you only need minimal auth functionality and you have one app, go with a built-in library (devise for rails, etc etc).
If you need other features:
- MFA
- other OAuth grants for API authentication
- SSO like SAML and OIDC
or you have more than one application, then the effort you put into using a SaaS service or standing up an independent identity server (depending on your needs and budget) is a better solution.
Worth acknowledging that auth is pretty sticky, so whatever solution you pick is one that you'll be using for a while (assuming the SaaS is successful).
Auth0 as a choice is good for some scenarios (their free plan covers 7k MAUs which is a lot for a hobby project), but understand the limits and consider alternatives. Here is a page from my employer with alternatives to consider: https://fusionauth.io/guides/auth0-alternatives
Stack Auth is trying to solve exactly this — open-source, developer-friendly, and reasonably priced managed auth. That way, you don't have to worry about OAuth but still aren't locked into a specific vendor.
The downside is that we only support Next.js for now (unless you're fine with using the REST API), but we're gonna change that soon.
Do regular people verify domains? It feels like the entire domain based trust model has been eroded by in-app browsers eliding the chrome for “simplicity” and even bank/visa official payment gateways that are redirecting you all over weird partner domains and microservice endpoints. Plus of course lack of education and mantras that mortals can follow at all times.
If users don’t verify domains, isn’t good old phishing more effective, like the $5 hammer in that xkcd?
Regular people may not but password managers do. Maybe not everyone, but at least security-conscious people would grow suspicious when their password manager doesn't autofill credentials on a page, and wouldn't immediately jump to manually entering it.
(SSO also reduces the attack surface of phishing, though of course then the attacker just has to phish the identity provider's credentials instead.)
I love the Silicon Valley references. One of the funniest shows ever made. I don't see it talked about much on HN, maybe it tends to hit a little too close to home?
We gotta start somewhere, and Next.js is popular right now — we're working on some cool non-Next stuff though (eg. auth proxies that can provide user info to any server, which we can hopefully launch within the next weeks).
Regardless, the focus of this blog post is on OAuth, not Stack Auth =) I appreciate the feedback though.
Yes, the blog post is very informative (although the images do not scale well on mobile).
My previous post was based on the phrase "open-source Auth0" (in your blog post), and we use Auth0 (and don't like it). But all of our apps are React, not Next.js.
> And why do most of these “Auth0” replacements only implement nextjs?
Probably because it's hard to support different languages. Even if all the replacements support OIDC (not a given) there are still subtle differences in implementation and integration.
That said, check out FusionAuth! We have over 25 quickstarts covering a variety of languages and scenarios. (I'm employed by them.)
Wheels should be reinvented once in a while, just to make sure there's not a better way to do it. They just shouldn't be reinvented by everyone all the time when there are perfectly good solutions available.
[0]: https://stack-auth.com/blog/oauth-from-first-principles#atta...
[1]: https://aaronparecki.com/2018/07/07/7/oauth-for-the-open-web