The Authentication Round-Trip: Cognito to WorkOS and Back Again
The Grass Is Always Greener
Every startup has a moment where someone on the team says “there’s a better way to do auth.” Ours came in July 2025, about four months into the build. We’d been running AWS Cognito since day one - it was part of the CDK stack, it worked, and we hated every minute of it.
Cognito’s developer experience is famously rough. The documentation reads like it was written by someone who considers clarity a security risk. The user pool configuration is a maze of flags where half the options are mutually exclusive but nothing tells you which half. And the login UI - the hosted UI that AWS provides as a “starting point” - looks like it was designed in 2014 by someone who’d heard of CSS but never met it personally.
So when we discovered WorkOS AuthKit, it felt like finding a fire exit. Social login, enterprise SSO, magic links, beautiful pre-built UI components - all out of the box. The migration commit on July 14th was one of those satisfying refactors where you delete more code than you add.
Two weeks later we had a WorkOS JWT authorizer sitting in front of our API Gateway. By late August, we’d built enhanced RBAC on top of it. The auth system looked clean. Professional, even.
Then we started doing the maths.
The Pricing Problem
Our platform is a multi-sided marketplace. We have developers who build integrations, partners who manage utility projects, internal staff who run operations, and end users who interact through a hub portal. That’s four portals, each with their own authentication requirements, all sharing the same identity layer.
The key characteristic of a utility marketplace is that most users are low-frequency. A homeowner might log in twice: once to accept an invitation, once to check a project status. A partner might log in a few times a month. Only internal staff and active developers are daily users.
WorkOS charges per monthly active user. For a B2B SaaS with 50 power users, that pricing model is fine. For a marketplace where you might have thousands of users who each log in twice a year, it’s a disaster. We were looking at per-MAU costs that would scale linearly with user acquisition - exactly the metric you want growing fastest.
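To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the `Segment` shape, the idea that logins fall in independent, uniformly random months, and the `pricePerMau` parameter. None of these are real figures from our platform or from WorkOS.

```typescript
// Hypothetical user segment; counts and frequencies are illustrative only.
interface Segment {
  name: string;
  users: number;
  loginsPerYear: number;
}

// Probability that a user is active in a given month, assuming each login
// lands in an independent, uniformly random month of the year.
function monthlyActiveProbability(loginsPerYear: number): number {
  return 1 - Math.pow(11 / 12, loginsPerYear);
}

// Expected monthly active users summed over all segments.
function expectedMau(segments: Segment[]): number {
  return segments.reduce(
    (total, s) => total + s.users * monthlyActiveProbability(s.loginsPerYear),
    0,
  );
}

// Expected monthly bill for a flat per-MAU price (hypothetical parameter).
function expectedMonthlyCost(segments: Segment[], pricePerMau: number): number {
  return expectedMau(segments) * pricePerMau;
}
```

Under this model the expected bill is a straight multiple of the user count in each segment: double the users and the bill doubles, whatever the per-MAU price turns out to be.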
There was a second concern, less about money and more about control. Authentication is the front door to everything. When your auth provider goes down, every portal goes dark. When they change their token format, every service that validates JWTs needs updating. When they deprecate an API, you’re on their timeline.
With Cognito, we own the user pool. It lives in our AWS account. The Lambda triggers are our code. The JWTs are standard Cognito tokens that API Gateway understands natively. It’s ugly, but it’s ours.
On September 28th, 2025 - roughly two months after the migration to WorkOS - we merged a commit titled “completely remove WorkOS authorizer and migrate to Cognito authentication.” The diff deleted the @workos-inc/node dependency, removed the WorkOS authorizer Lambda, cleared out all the WorkOS test infrastructure, and updated the OpenAPI security scheme description from “WorkOS JWT” to “Cognito JWT.”
We were back where we started. Except now we had to solve the problems that drove us away in the first place.
Act Two: The Cognito Auth Flow Gauntlet
Coming back to Cognito didn’t mean going back to Cognito’s defaults. The question wasn’t “should we use Cognito” - we’d settled that. The question was “which of Cognito’s seventeen auth flows should we actually use.”
If you’ve never explored Cognito’s auth flow options, imagine a restaurant menu where every dish is called something slightly different but the descriptions all blend together, and the waiter refuses to tell you which ones are actually available in your region.
Attempt 1: Custom Magic Links
The same day we removed WorkOS, we shipped magic link authentication. The idea was simple: user enters email, we send a link, they click it, they’re in. No passwords to remember, no friction.
The implementation was less simple. Cognito doesn’t have native magic links (at least, not the kind we wanted). So we built custom auth challenge Lambdas - a Define Auth Challenge trigger that orchestrates the flow, a Create Auth Challenge trigger that generates a token and sends the email via SES, and a Verify Auth Challenge trigger that validates the token when the user clicks the link.
User enters email
-> Cognito: DefineAuthChallenge (start custom flow)
-> Cognito: CreateAuthChallenge (generate token, send email via SES)
-> User clicks link
-> Cognito: VerifyAuthChallenge (validate token)
-> Tokens issued
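The three triggers above can be sketched roughly as follows. This is a simplified illustration, not our production code: the event types are trimmed-down local models of the Cognito trigger events, and `generateToken` and `sendLoginEmail` are hypothetical injected helpers standing in for token generation and the SES call. Token expiry and link construction are left out.

```typescript
// Trimmed-down local models of the Cognito custom auth trigger events.
interface ChallengeResult {
  challengeName: string;
  challengeResult: boolean;
}
interface DefineAuthChallengeEvent {
  request: { session: ChallengeResult[] };
  response: { challengeName?: string; issueTokens: boolean; failAuthentication: boolean };
}
interface CreateAuthChallengeEvent {
  request: { userAttributes: Record<string, string> };
  response: {
    publicChallengeParameters: Record<string, string>;
    privateChallengeParameters: Record<string, string>;
  };
}
interface VerifyAuthChallengeEvent {
  request: { privateChallengeParameters: Record<string, string>; challengeAnswer: string };
  response: { answerCorrect: boolean };
}

// Hypothetical dependencies: a token generator and an email sender (e.g. SES).
interface MagicLinkDeps {
  generateToken: () => string;
  sendLoginEmail: (to: string, token: string) => Promise<void>;
}

// Define: start the custom challenge, issue tokens, or fail the attempt.
function defineAuthChallenge(event: DefineAuthChallengeEvent): DefineAuthChallengeEvent {
  const last = event.request.session[event.request.session.length - 1];
  event.response.issueTokens = false;
  event.response.failAuthentication = false;
  if (!last) {
    event.response.challengeName = "CUSTOM_CHALLENGE"; // first round: ask for the token
  } else if (last.challengeName === "CUSTOM_CHALLENGE" && last.challengeResult) {
    event.response.issueTokens = true; // the token was verified
  } else {
    event.response.failAuthentication = true;
  }
  return event;
}

// Create: generate a token, email it, and keep it in the private parameters.
async function createAuthChallenge(
  event: CreateAuthChallengeEvent,
  deps: MagicLinkDeps,
): Promise<CreateAuthChallengeEvent> {
  const email = event.request.userAttributes["email"];
  if (!email) throw new Error("user has no email attribute");
  const token = deps.generateToken();
  await deps.sendLoginEmail(email, token);
  event.response.publicChallengeParameters = { email };
  event.response.privateChallengeParameters = { token };
  return event;
}

// Compare two strings without exiting early on the first mismatching character.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  return diff === 0;
}

// Verify: check the submitted token against the one stored at creation time.
function verifyAuthChallenge(event: VerifyAuthChallengeEvent): VerifyAuthChallengeEvent {
  const expected = event.request.privateChallengeParameters["token"] ?? "";
  event.response.answerCorrect =
    expected.length > 0 && constantTimeEqual(expected, event.request.challengeAnswer);
  return event;
}
```

Injecting the token generator and email sender keeps the trigger logic testable without touching SES, which matters when the flow has this many moving parts.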
It worked. For about a week.
The problem was email deliverability. Magic links live and die by whether the email actually reaches the inbox. And it turns out that transactional emails containing a URL that says “click here to log in” trigger every spam filter ever written. Gmail was fine. Outlook was mostly fine. Corporate email servers with aggressive filtering? Our magic links were landing in spam, getting blocked by link scanners, or never arriving at all.
For a marketplace where partner organisations run the gamut from tech startups to traditional utility companies (the kind that still use on-premise Exchange servers), this was a dealbreaker.
Attempt 2: EMAIL_OTP
On October 1st, we pivoted to EMAIL_OTP - Cognito’s native one-time password flow. Instead of a clickable link, we send an 8-digit code. User enters email, receives a code, types it in.
// The user pool config shifted to native OTP
enablePasswordAuth: false,
enableEmailOtp: true,
EMAIL_OTP solved the deliverability problem completely. A short email containing just a numeric code sails through spam filters. No suspicious links, no URL scanning, no corporate firewall blocking.
But it introduced a UX problem. Magic links are zero-effort: click and you’re in. OTP codes require the user to context-switch - open email, read code, go back to browser, type code. For users who are already annoyed about having to log in to check their utility project status, adding “now go check your email and type in this 8-digit code” was not winning us any fans.
And there was a second subtlety: not all users are the same. Internal staff log in daily - they want speed. Partners log in weekly - they want simplicity. Developers have technical expectations - they’d probably prefer a password. Hub users might log in twice ever - they just want it over with.
Attempt 3: The Three-State Flow
By March 2026, we’d landed on something more nuanced. Instead of picking one auth flow for everyone, we implemented a three-state login form that adapts based on context.
The key insight was using Cognito’s USER_AUTH flow with PREFERRED_CHALLENGE. Instead of the client dictating the auth mechanism, the server chooses the best challenge for each user based on their account type and history.
User enters email
-> InitiateAuth (USER_AUTH)
-> Cognito evaluates user attributes
-> Cognito responds with the challenge to present
-> One of: PASSWORD, EMAIL_OTP, or both
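As a rough sketch of the per-user decision, the function below picks a challenge from whatever is available for a given user. The `UserProfile` shape, the `AccountType` names, and the rule itself (password only for staff and developers who already have one, a code for everyone else) are assumptions made for illustration; where this decision actually lives will depend on your setup.

```typescript
type Challenge = "PASSWORD" | "EMAIL_OTP";
type AccountType = "staff" | "developer" | "partner" | "hub";

// Hypothetical view of what we know about the user at login time.
interface UserProfile {
  accountType: AccountType;
  hasPassword: boolean;
}

// Pick the challenge to present, falling back to whatever is available.
function chooseChallenge(user: UserProfile, available: Challenge[]): Challenge | undefined {
  const wantsPassword =
    user.hasPassword && (user.accountType === "staff" || user.accountType === "developer");
  const preferred: Challenge = wantsPassword ? "PASSWORD" : "EMAIL_OTP";
  if (available.indexOf(preferred) !== -1) return preferred;
  // The preferred challenge is not offered for this user: take the first one that is.
  return available[0];
}
```

Keeping the rule in one pure function means the policy can change (as ours did, more than once) without touching the login form.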
The login form became a state machine with three states:
- Email entry - user provides their email
- Password - if the server determines this user should authenticate with a password (staff, developers with existing passwords)
- OTP - if the server determines this user should get a one-time code (new users, hub users, passwordless-preferred accounts)
For the hub portal, we eventually settled on password-first with OTP as a secondary option. After a user accepts an invitation and sets their password, they get a conventional login flow - but with a “sign in with code instead” link for when they’ve forgotten their password. No password reset flow needed; the OTP serves double duty.
// The login form handles the server's challenge response
const result = await signIn({ username: email, options: { authFlowType: 'USER_AUTH' } });
if (result.nextStep.signInStep === 'CONFIRM_SIGN_IN_WITH_PASSWORD') {
  setAuthState('password');
} else if (result.nextStep.signInStep === 'CONFIRM_SIGN_IN_WITH_EMAIL_CODE') {
  setAuthState('otp');
}
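The three-state form, including the “sign in with code instead” escape hatch, can be modelled as a small reducer. This is a minimal sketch: the `AuthEvent` names are hypothetical, and only the two sign-in steps handled in the snippet above are mapped.

```typescript
type AuthState = "email" | "password" | "otp";

// Hypothetical events the login form can receive.
type AuthEvent =
  | { type: "CHALLENGE_RECEIVED"; step: string }
  | { type: "USE_CODE_INSTEAD" }
  | { type: "RESET" };

// Pure transition function for the login form.
function nextAuthState(state: AuthState, event: AuthEvent): AuthState {
  switch (event.type) {
    case "CHALLENGE_RECEIVED":
      // Only the email step reacts to a challenge from the server.
      if (state !== "email") return state;
      if (event.step === "CONFIRM_SIGN_IN_WITH_PASSWORD") return "password";
      if (event.step === "CONFIRM_SIGN_IN_WITH_EMAIL_CODE") return "otp";
      return state;
    case "USE_CODE_INSTEAD":
      // The forgotten-password path: swap the password field for a code.
      return state === "password" ? "otp" : state;
    case "RESET":
      return "email";
  }
}
```

Because the transition function is pure, each path through the form can be unit-tested without rendering anything or calling Cognito.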
Multi-Portal, Shared Identity
The auth architecture that emerged has a shape we didn’t plan but ended up liking. Two Cognito user pools serve the entire platform:
| User Pool | Portals | Auth Flow | Why |
|---|---|---|---|
| Customer | Hub, Developer Portal, Partner Portal | USER_AUTH + PREFERRED_CHALLENGE | Different user types need different flows; server decides |
| Staff | Admin Portal, Ops Portal | Password + MFA | Staff are daily users with elevated permissions |
The customer user pool uses groups to distinguish between account types - internal, partner, developer - and pre-token-generation Lambda triggers to inject custom claims into the JWT. API Gateway uses a native Cognito authorizer, which means no custom Lambda authorizer sitting in the request path (one of the things we’d built for WorkOS and were happy to delete).
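A pre-token-generation trigger along these lines might look like the sketch below. The event type is a trimmed-down local model of the Cognito trigger event, and the `account_type` claim name is a hypothetical choice for illustration, not necessarily what our tokens carry.

```typescript
// Trimmed-down local model of the pre-token-generation trigger event.
interface PreTokenGenerationEvent {
  request: { groupConfiguration: { groupsToOverride?: string[] } };
  response: { claimsOverrideDetails?: { claimsToAddOrOverride?: Record<string, string> } };
}

// Groups that map to an account type, in priority order.
const ACCOUNT_TYPES = ["internal", "partner", "developer"];

// Copy the first matching group into a custom claim (hypothetical name).
function preTokenGeneration(event: PreTokenGenerationEvent): PreTokenGenerationEvent {
  const groups = event.request.groupConfiguration.groupsToOverride ?? [];
  for (const accountType of ACCOUNT_TYPES) {
    if (groups.indexOf(accountType) !== -1) {
      event.response.claimsOverrideDetails = {
        claimsToAddOrOverride: { account_type: accountType },
      };
      break;
    }
  }
  return event;
}
```

Downstream services then read one claim instead of re-deriving the account type from group membership on every request.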
The staff user pool is simpler: password authentication with enforced MFA, separate custom domain, separate hosted UI. Staff users are pre-provisioned - there’s no self-registration flow.
Both pools share the same base CDK construct, which handles the common infrastructure - SES email configuration, custom domains, OAuth2 scopes, resource servers. The differentiation happens in the concrete implementations:
// Customer pool: passwordless, OTP-first
enablePasswordAuth: false,
enableEmailOtp: true,
// Staff pool: password-required, MFA-enforced
enablePasswordAuth: true,
enableEmailOtp: false,
What We Learned
Auth provider selection is a business model decision, not a technical one. WorkOS is a genuinely good product. If we were building a B2B SaaS with a predictable user count, we’d probably still be using it. The mismatch wasn’t about quality - it was about pricing topology. Per-MAU pricing and marketplace user patterns are fundamentally incompatible.
Own your front door. Third-party auth is a dependency that sits in the critical path of every user interaction. The convenience of managed auth UI and social login shortcuts has to be weighed against the fact that you’re handing someone else the keys to your entire user experience. For a startup that’s still figuring out its auth flows - and we iterated through three in six months - that dependency makes experimentation expensive.
One auth flow does not fit all. The biggest mistake in our initial Cognito setup was treating auth as a single problem. It’s not. “How should a staff member who logs in 20 times a week authenticate?” and “How should a homeowner who logs in twice a year authenticate?” have fundamentally different answers. USER_AUTH with PREFERRED_CHALLENGE lets the server make that call per-user, which turned out to be exactly the abstraction we needed.
Email deliverability is a feature, not an implementation detail. Magic links are the superior UX right up until the email doesn’t arrive. If your user base includes organisations with aggressive email filtering (and in the utility sector, they all do), plan for it. OTP codes are uglier but more reliable. The best solution lets you offer both.
The round-trip cost us about two months. If we’d stayed on WorkOS, we’d have a cleaner auth UI and a growing invoice. Instead, we have a Cognito setup that’s ugly in the CDK code but exactly right for how our users actually authenticate. Sometimes the grass really is greener on the side you already mowed.
We Left Our Auth Provider for a Prettier One. We Came Back.
The Grass Is Always Greener
Every startup eventually has the moment where someone says “there’s a better way to do login.” Ours came in July 2025, four months into the build.
We’d been using AWS Cognito for user authentication since day one. It worked. We also hated it. Cognito’s documentation reads like it was written by someone who considers clarity a security risk. The configuration is a maze of settings where half the options conflict with each other, and the official login page it provides looks like a 2014 government website.
So when we discovered WorkOS AuthKit — which offered social login, magic links, and beautiful pre-built login screens — it felt like finding a fire exit. We migrated in July. By late August we had a clean, modern auth system. It looked professional.
Then we started doing the maths.
The Problem With Pretty
Our platform is a marketplace connecting multiple different types of users. We have partners, developers, internal staff, and end users — each group with their own portal and their own pattern of usage.
The crucial detail: most of our users log in infrequently. A homeowner might log in twice a year. A partner might log in a few times a month. Only our internal staff are daily users.
WorkOS charges based on how many users log in each month. For a typical business software product with 50 regular users, that pricing is perfectly reasonable. For a marketplace where you might have thousands of users who each log in twice a year, it becomes a problem. Our costs would grow in direct proportion to user acquisition — which is the one metric you want growing fastest.
There was a second concern beyond cost. Authentication is the front door to everything. When your auth provider goes down, every portal goes dark. When they change how their login tokens work, every service that reads those tokens needs updating. We were handing someone else significant control over our users’ experience.
On September 28th — two months after the migration — we merged a commit called “completely remove WorkOS authorizer and migrate to Cognito authentication.”
We were back where we started. Now we had to fix the things that drove us away in the first place.
Round Two: Which Bit of Cognito?
Coming back to Cognito didn’t mean accepting the default setup. Cognito offers roughly seventeen different ways to handle login, with names that all sound similar and documentation that doesn’t clearly explain the differences. Choosing between them felt like reading a restaurant menu where every dish has a slightly different name but the descriptions blend together.
Attempt One: Magic Links
The same day we removed WorkOS, we shipped magic link authentication. The idea: user types their email, we send a link, they click it, they’re logged in. No passwords to forget.
Cognito doesn’t have this built in, so we built it ourselves using custom code that hooks into Cognito’s login process. We generate a token, email it as a link, and verify it when clicked.
It worked. For about a week.
The problem was email deliverability. Magic links contain a clickable URL that says “log in here.” Corporate email servers — especially the kind still running older on-premise systems — treat those with suspicion. Our links were landing in spam folders, getting blocked by security filters, or occasionally never arriving at all.
For a marketplace serving utility companies, which skew toward traditional IT infrastructure, this was a dealbreaker.
Attempt Two: One-Time Codes
In early October, we switched to email one-time codes: instead of a clickable link, we send a short numeric code. User enters their email, gets a code, types it in.
Deliverability: solved. A short email containing six to eight digits sails past every spam filter we encountered. No suspicious links, nothing for a security scanner to flag.
User experience: worse. Magic links are effortless — click and you’re in. One-time codes require the user to context-switch, open a second window, read a number, go back to the browser, and type it. For users who are already slightly reluctant to log in at all, this is friction.
We also noticed that different users had very different needs. Staff log in twenty times a week — they care about speed. Partners log in occasionally — they care about simplicity. A homeowner logging in twice a year just wants to get it over with as quickly as possible.
One login method can’t be the best answer for all of these.
Attempt Three: Let the Server Decide
By March 2026, we’d landed on something more sophisticated. Instead of choosing one login flow and applying it to everyone, we built a login screen with three modes that adapts based on who’s logging in.
The key insight was using a Cognito feature that lets the server choose the best login method for each user rather than the client dictating it. When you enter your email, the server evaluates what kind of user you are and tells the app which challenge to present: a password, a code, or both.
The login screen handles whichever challenge the server sends back. For staff with passwords, it shows a password field. For hub users and new accounts, it sends a code. Users who’ve forgotten their password can request a code instead — which means we don’t need a separate “forgot password” flow at all.
The Architecture That Emerged
What we ended up with wasn’t something we’d planned at the start — it emerged through the iterations.
Two separate user pools handle the entire platform:
| Who | What they see | How they log in |
|---|---|---|
| Partners, developers, hub users | Customer portals | Server chooses: password or code, based on account type |
| Internal staff | Admin and ops portals | Password required, with two-factor authentication |
The customer pool uses a single mechanism to distinguish different user types and inject the relevant information into the login token automatically. This means our API layer doesn’t need special handling per user type — it reads the token and gets back exactly the context it needs.
The staff pool is simpler and stricter: proper passwords, mandatory second factor, no self-registration. Every staff account is created by an admin.
What We Learned
Auth provider selection is a business model decision, not a technical one. WorkOS is a genuinely good product. The problem wasn’t their quality — it was the mismatch between per-login-user pricing and a marketplace where most users log in rarely. Had we thought through the pricing model against our expected user behaviour before migrating, we’d have saved two months.
Owning your front door matters. When a third-party handles authentication, they sit in the critical path of every single user interaction. Migrating away from them — if and when you need to — is expensive and disruptive. For a startup still figuring out its auth model, that dependency makes iteration costly.
One login flow doesn’t fit everyone. “How should someone who logs in twenty times a week authenticate?” and “How should someone who logs in twice a year authenticate?” have different answers. The solution that let the server decide per-user, rather than forcing everyone through the same mechanism, was exactly the right abstraction.
Email reliability is a feature. Magic links are the superior experience right up until the email doesn’t arrive. If any of your users work in organisations with aggressive email security — and in regulated industries, that’s most of them — plan for it. Codes are less elegant but more reliable.
The round-trip cost us about two months. We came back to something uglier under the hood but exactly right for how our users actually behave. Sometimes the grass is greenest on the side you already mowed.