


OAuth Token Management for SaaS Integrations - The Patterns That Don't Break at 3am

OAuth tokens do not fail on schedule. They expire during jobs, race during refresh, and get revoked without warning. This guide covers the multi-tenant patterns that keep SaaS integrations running.


OAuth token management looks easy when you only have one connection.

You redirect the user to the provider, exchange the authorization code, store the access token and refresh token, and call the API. Maybe you even ship the first integration in a day or two.

Then production happens.

A token expires halfway through a delivery batch. Two workers refresh the same connection at the same time. A customer revokes access in the provider UI and your system keeps retrying with a dead credential until someone gets paged. One provider rotates refresh tokens. Another only returns one on the original grant. A third changes behavior depending on connected-app policy or consent mode.

This is why OAuth token handling becomes operational infrastructure long before it feels like it should.

For customer-facing integrations, a token store is not enough. You need a token lifecycle strategy.

The OAuth lifecycle in an integration system

At a high level, the flow is familiar:

  1. The customer authorizes your app.
  2. Your backend exchanges the authorization code for tokens.
  3. You use the access token to call the provider API.
  4. The access token expires.
  5. You use the refresh token to get a new access token.
  6. Eventually the connection is revoked, narrowed, or needs re-authorization.
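Steps 2 and 5 are both POSTs to the provider's token endpoint. Here is a minimal sketch of step 2, the code exchange, with an injectable fetch so it can be exercised without a live provider; the token URL and credentials are placeholders, not real values:

```typescript
// Exchange an authorization code for tokens (step 2 of the lifecycle).
// The token URL and client credentials are placeholders; `doFetch` is
// injectable so the function can be tested without a live provider.

type TokenResponse = {
  access_token: string;
  refresh_token?: string; // many providers only return this on the first grant
  expires_in?: number;    // seconds until the access token expires
  scope?: string;
};

async function exchangeAuthorizationCode(
  tokenUrl: string,
  params: { code: string; clientId: string; clientSecret: string; redirectUri: string },
  doFetch: typeof fetch = fetch, // global fetch in Node 18+
): Promise<TokenResponse> {
  const res = await doFetch(tokenUrl, {
    method: 'POST',
    headers: { 'content-type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code: params.code,
      client_id: params.clientId,
      client_secret: params.clientSecret,
      redirect_uri: params.redirectUri,
    }),
  });

  if (!res.ok) {
    throw new Error(`token exchange failed: ${res.status}`);
  }
  return (await res.json()) as TokenResponse;
}
```

The refresh call in step 5 is the same POST with `grant_type=refresh_token`, which is why both paths usually end up sharing one provider adapter.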

That description is correct, but it hides what matters in production: the token is attached to a long-lived customer connection, not a one-off login session.

That means your system has to answer questions like:

  • Where do token records live?
  • How do you encrypt them?
  • How do you know when to refresh?
  • What happens if two workers refresh at once?
  • How do you distinguish a transient 500 from a permanent invalid_grant-style auth failure?
  • How do you stop a broken connection from creating an on-call storm?

Where production OAuth systems break

Most OAuth incidents are not caused by misunderstanding the protocol. They come from treating the protocol as if it were the whole problem.

Expiry during delivery

An access token can expire between queueing the work and actually performing the API call. If your jobs sit in a queue for a few minutes, "token was valid when enqueued" is meaningless by the time the worker runs.

The fix is simple in principle: validate freshness immediately before the provider call, not only at connection time.

Refresh race conditions

This is one of the most common failures in multi-worker systems.

Imagine five jobs for the same customer all start within a few seconds. Each worker sees an almost-expired token. Each worker independently decides to refresh it. Now you have five refresh attempts and five competing writes.

That creates multiple problems:

  • wasted provider calls
  • noisy logs and rate-limit pressure
  • last-write-wins token corruption
  • a stale refresh token persisted when one response rotates the token and a slower worker overwrites it with the older version

OAuth token refresh needs single-flight behavior per connection. One worker refreshes. The others wait or reuse the new result.
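Within a single process, single-flight can be as small as a promise map keyed by connection ID. This is a sketch; it deduplicates refreshes inside one process only and does not replace the cross-process lock you still need when multiple workers share a database:

```typescript
// Single-flight refresh within one process: concurrent callers for the same
// connection share one in-flight refresh instead of racing. Across multiple
// worker processes you still need a database or distributed lock.

const inFlight = new Map<string, Promise<string>>();

async function refreshSingleFlight(
  connectionId: string,
  doRefresh: () => Promise<string>, // provider-specific refresh, returns a new access token
): Promise<string> {
  const existing = inFlight.get(connectionId);
  if (existing) return existing; // reuse the refresh already in progress

  const p = doRefresh().finally(() => inFlight.delete(connectionId));
  inFlight.set(connectionId, p);
  return p;
}
```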

Revoked or narrowed access

Customers disconnect integrations. Admins change scopes. Security teams revoke consent. Sometimes the provider invalidates the refresh token. Sometimes the token is still structurally present but no longer authorized for the API calls you need.

This is not a retry problem. It is a connection state problem.

Once the failure is clearly permanent, your system should stop retrying normal delivery attempts and move the connection into a re-auth-required state with a clear message for the customer.

Provider quirks

OAuth is a standard. Provider implementations are not.

Some providers rotate refresh tokens on refresh. Some only return a refresh token on the original authorization flow. Some expect "offline" access semantics. Some tie token longevity to app policy or consent mode. Some return 401 for expired access tokens and a different structured error for revoked refresh tokens.

If you build one generic refresh function and assume every provider behaves the same, that function will eventually wake you up at 3am.

Multi-tenant OAuth realities

The moment your customers connect their own SaaS tools, OAuth becomes a storage and isolation problem.

Each connection needs its own token state:

  • provider
  • workspace or tenant
  • provider account identifier
  • access token
  • refresh token
  • expiry time
  • scopes
  • last successful refresh timestamp
  • last auth failure
  • re-auth required state

One customer's refresh token should never be fetched using another customer's identifiers. One tenant's auth failures should not flood global retry queues. One customer's revoked consent should not block unrelated deliveries for other tenants.

The challenge is not only "how to call /token." It is how to manage thousands of long-lived OAuth connections safely.
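One way to enforce that isolation is to make the tenant context part of every lookup key, so a read without a workspace ID cannot even be expressed. A sketch, with an in-memory map standing in for the real table:

```typescript
// Tenant-scoped lookup: every read requires both the workspace ID and the
// connection ID, so one tenant's code path cannot fetch another tenant's
// token. The in-memory store is a stand-in for a table keyed the same way.

type StoredConnection = {
  workspaceId: string;
  connectionId: string;
  refreshTokenCiphertext: string;
};

const store = new Map<string, StoredConnection>();

function keyFor(workspaceId: string, connectionId: string): string {
  return `${workspaceId}:${connectionId}`;
}

function saveConnection(c: StoredConnection): void {
  store.set(keyFor(c.workspaceId, c.connectionId), c);
}

function getConnection(workspaceId: string, connectionId: string): StoredConnection | undefined {
  // A connection ID alone is never enough; the tenant context is part of the key.
  return store.get(keyFor(workspaceId, connectionId));
}
```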

If you are already dealing with the broader isolation side of integrations, the same architecture problem shows up in multi-tenant integration infrastructure.

Practical patterns that hold up in production

The best token systems are conservative. They make fewer assumptions, serialize the risky transitions, and keep enough state to explain what happened later.

1. Model the connection explicitly

Treat the token record like a durable connection object, not a loose blob of JSON in a table.

Useful fields usually include:

type OAuthConnection = {
  id: string;
  workspaceId: string;
  provider: 'hubspot' | 'salesforce' | 'google';
  providerAccountId: string;
  accessTokenCiphertext: string;
  refreshTokenCiphertext: string | null;
  scope: string[];
  expiresAt: string | null;
  refreshAfter: string | null;
  refreshedAt: string | null;
  lastRefreshError: string | null;
  reauthRequiredAt: string | null;
  version: number;
};

The exact shape varies, but two ideas matter:

  • keep enough metadata to make refresh decisions without guessing
  • keep explicit connection state like reauthRequiredAt instead of encoding everything in log messages

2. Refresh proactively, not only on 401

Waiting for the provider to reject the call is the easiest strategy to implement and one of the noisiest to operate.

If you know a token expires at 10:00:00, do not schedule work that first learns this at 10:00:03 during a customer-visible delivery attempt.

A safer pattern is to define a refresh buffer:

  • refresh when the token is within 5 to 10 minutes of expiry
  • refresh earlier for long-running jobs or large batches
  • keep a provider-specific override when a provider issues especially short-lived tokens

This turns refresh from an interrupt into regular maintenance.
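A sketch of that buffer logic, with illustrative values; the 5-minute default and the per-provider override are assumptions for the example, not recommendations from any provider:

```typescript
// Proactive refresh scheduling: refresh a fixed buffer before expiry, with a
// per-provider override for unusually short-lived tokens. Buffer values are
// illustrative placeholders.

const DEFAULT_BUFFER_MS = 5 * 60 * 1000; // refresh 5 minutes before expiry

const PROVIDER_BUFFER_MS: Record<string, number> = {
  // hypothetical override for a provider with especially short-lived tokens
  shortlived: 2 * 60 * 1000,
};

function computeRefreshAfter(expiresAt: Date, provider: string): Date {
  const buffer = PROVIDER_BUFFER_MS[provider] ?? DEFAULT_BUFFER_MS;
  return new Date(expiresAt.getTime() - buffer);
}

function needsRefresh(expiresAt: Date | null, provider: string, now = new Date()): boolean {
  if (!expiresAt) return false; // no expiry metadata: refresh only on auth failure
  return now >= computeRefreshAfter(expiresAt, provider);
}
```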

3. Serialize refresh with a lock

The simplest way to avoid refresh stampedes is to use a row lock or distributed lock keyed by connection ID.

Here is a realistic Node.js pattern using a database transaction and FOR UPDATE:

async function getValidAccessToken(connectionId: string) {
  // Assumed helpers: `db` (a transactional client such as pg-promise),
  // `encrypt`/`decrypt` for token ciphertext, `addMinutes` (e.g. from
  // date-fns), and a provider-aware `refreshWithProvider`. The client is
  // assumed to map snake_case columns onto OAuthConnection's camelCase fields.
  return db.transaction(async (tx) => {
    const connection = await tx.one<OAuthConnection>(
      `
        select *
        from oauth_connections
        where id = $1
        for update
      `,
      [connectionId],
    );

    if (connection.reauthRequiredAt) {
      throw new Error('Connection requires re-authorization');
    }

    const refreshDeadline = addMinutes(new Date(), 5);

    if (connection.expiresAt && new Date(connection.expiresAt) > refreshDeadline) {
      return decrypt(connection.accessTokenCiphertext);
    }

    if (!connection.refreshTokenCiphertext) {
      // Nearly expired and no refresh token stored: only re-auth can fix this.
      throw new Error('Connection requires re-authorization');
    }

    const refreshed = await refreshWithProvider({
      provider: connection.provider,
      refreshToken: decrypt(connection.refreshTokenCiphertext),
    });

    await tx.query(
      `
        update oauth_connections
        set
          access_token_ciphertext = $2,
          refresh_token_ciphertext = coalesce($3, refresh_token_ciphertext),
          expires_at = $4,
          refreshed_at = now(),
          last_refresh_error = null,
          version = version + 1
        where id = $1
      `,
      [
        connectionId,
        encrypt(refreshed.accessToken),
        refreshed.refreshToken ? encrypt(refreshed.refreshToken) : null,
        refreshed.expiresAt,
      ],
    );

    return refreshed.accessToken;
  });
}

Two details are doing real work here:

  • the connection row is locked while refresh happens
  • a missing refresh_token in the refresh response does not blindly erase the stored one

That second rule matters because providers differ. Some rotate refresh tokens. Some do not. Some only return them in particular circumstances.

4. Design for overlap windows

Refresh is not an instantaneous global state transition.

One worker may still have the old access token in memory while another has already stored the new one. If the old token has not expired yet, that is usually fine. Your system should tolerate a small overlap window instead of assuming every token read switches everywhere at once.

What you want to avoid is a more dangerous overlap: stale writes that replace a newer refresh token with an older one. Versioning or row locks protect you from that.
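The version field from the connection model above is enough to implement this. Here is a sketch of the optimistic check against an in-memory row; in SQL the same rule is `update ... set version = version + 1 where id = $1 and version = $2`, followed by a check on the affected row count:

```typescript
// Optimistic versioning: a refresh result is only written if the stored row
// still has the version the refresher originally read. A worker that raced
// and lost cannot overwrite a newer refresh token with an older one.

type TokenRow = { version: number; refreshTokenCiphertext: string };

function applyRefresh(row: TokenRow, readVersion: number, newCiphertext: string): boolean {
  if (row.version !== readVersion) {
    return false; // someone refreshed after we read; discard our stale result
  }
  row.refreshTokenCiphertext = newCiphertext;
  row.version += 1;
  return true;
}
```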

5. Add an auth-failure circuit breaker

Not every failure deserves another refresh attempt.

When a provider is down or timing out, retries make sense. When the error clearly means "this connection is no longer authorized," the correct action is usually:

  1. mark the connection as needing re-auth
  2. stop normal background retries for that connection
  3. surface the issue to the customer with a specific remediation path

Without this, one revoked connection can create endless refresh traffic and noisy delivery failures.
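A sketch of the classification step, assuming a normalized failure shape. Real providers vary in both status codes and error bodies, so treat this mapping as a starting point rather than a complete table:

```typescript
// Classify provider auth failures: clearly permanent ones (invalid_grant on
// the token endpoint, 401 on refresh) move the connection to re-auth-required;
// everything else stays retryable. The mapping is representative, not
// exhaustive; ambiguous errors deserve provider-specific handling.

type AuthFailure = { status?: number; oauthError?: string };

type FailureAction = 'retry' | 'require_reauth';

function classifyAuthFailure(f: AuthFailure): FailureAction {
  // invalid_grant is the standard OAuth error body for a dead refresh token
  if (f.oauthError === 'invalid_grant') return 'require_reauth';
  // 401 on a refresh attempt usually means the credential itself is bad
  if (f.status === 401) return 'require_reauth';
  // 429, 5xx, and timeouts are the provider's problem; back off and retry
  return 'retry';
}
```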

6. Encrypt tokens and scope every read

This should be the baseline:

  • encrypt tokens at rest
  • decrypt only in the code path that needs them
  • scope every lookup by tenant or workspace plus connection ID
  • keep audit visibility around administrative changes and connection state transitions

The security goal is not only "tokens are encrypted in the database." It is also "the wrong tenant context cannot fetch the wrong token."
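A minimal sketch of at-rest encryption using Node's built-in AES-256-GCM. In production the key would come from a KMS or secret manager; here it is generated inline so the example is self-contained:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Encrypt token material at rest with AES-256-GCM. The inline key is a
// stand-in for one fetched from a KMS or secret manager.

const key = randomBytes(32);

function encryptToken(plaintext: string): string {
  const iv = randomBytes(12); // standard GCM nonce size
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  // store iv + auth tag + ciphertext together as one opaque value
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

function decryptToken(stored: string): string {
  const buf = Buffer.from(stored, 'base64');
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28); // GCM auth tag is 16 bytes
  const ciphertext = buf.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

Because GCM authenticates as well as encrypts, a tampered ciphertext fails at `final()` instead of decrypting to garbage.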

Provider-specific gotchas worth planning for

Google: offline access and refresh-token loss conditions

Google's OAuth docs explicitly distinguish online access, tied to an active user session, from offline access for background work. If you need a refresh token for background work, you have to request offline access during the authorization flow.

Google also documents several cases where refresh tokens stop working, including user revocation and inactivity-related expiration conditions. If you treat refresh tokens as permanent until proven otherwise, you will eventually discover they are not.

That means your connection model should assume refresh token loss is normal, detectable, and recoverable through re-authorization.

HubSpot: short-lived access tokens and token response metadata

HubSpot's OAuth token responses include expiry metadata, and HubSpot's docs tell you to use the refresh token to obtain new access tokens as they expire. That makes it a good example of why "refresh only after a 401" is not a great operating model.

If the provider already tells you when the token expires, use that information to refresh before a queued delivery job turns expiry into a customer-facing failure.
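One concrete detail: token responses report relative expiry (`expires_in`, in seconds), so convert it to an absolute timestamp at storage time, with a small margin for clock skew and request latency. A sketch:

```typescript
// Convert a provider's relative `expires_in` (seconds) into an absolute
// expiry timestamp at storage time. The skew margin is an illustrative
// safety buffer for clock drift and request latency.

function absoluteExpiry(
  expiresInSeconds: number,
  receivedAt: Date = new Date(),
  skewMs = 30_000,
): Date {
  return new Date(receivedAt.getTime() + expiresInSeconds * 1000 - skewMs);
}
```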

Providers differ on refresh-token behavior

This is the broad rule underneath the provider-specific details: do not hard-code one refresh-token assumption across every integration.

Different providers vary on:

  • whether refresh tokens rotate
  • whether they are only issued on initial consent
  • what error shape signals revocation
  • what policies govern how long they remain valid

That is why OAuth token management becomes provider-aware infrastructure instead of a single generic helper.

If you are adding provider-specific integrations today, keep the implementation details in provider adapters or configuration, not scattered through delivery jobs.
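A sketch of what such a provider profile can look like. The flag values here are placeholders to show the shape, not statements about how any real provider behaves; fill them in from each provider's own documentation:

```typescript
// Provider differences live in one profile table instead of being scattered
// through delivery jobs. All values below are illustrative placeholders.

type ProviderProfile = {
  rotatesRefreshToken: boolean;   // does a refresh response include a new refresh token?
  refreshTokenOnInitialGrantOnly: boolean;
  revocationErrors: Set<string>;  // error strings that mean "re-auth required"
};

const providers: Record<string, ProviderProfile> = {
  provider_a: {
    rotatesRefreshToken: false,
    refreshTokenOnInitialGrantOnly: true,
    revocationErrors: new Set(['invalid_grant']),
  },
  provider_b: {
    rotatesRefreshToken: true,
    refreshTokenOnInitialGrantOnly: false,
    revocationErrors: new Set(['invalid_grant', 'consent_revoked']),
  },
};

function isRevocation(provider: string, oauthError: string): boolean {
  // Unknown providers default to retryable rather than killing the connection.
  return providers[provider]?.revocationErrors.has(oauthError) ?? false;
}
```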

When infrastructure helps

For a handful of internal integrations, you can keep token management inside your app and do it well.

For customer-facing SaaS integrations, the pain usually compounds when:

  • every customer has their own OAuth connection
  • background deliveries depend on long-lived tokens
  • you support multiple providers with different behaviors
  • support needs visibility into why a connection stopped working

That is the point where infrastructure earns its keep. Not because OAuth is impossible to implement, but because the combination of encrypted storage, refresh scheduling, tenant isolation, re-auth state, and delivery coordination becomes a platform concern.

Meshes can be part of that layer. Its published product model includes per-workspace connections and built-in OAuth token handling, which is useful when you want delivery and connection management to live in the same system. But the design advice in this post still applies even if you build everything yourself: model connection state explicitly, refresh proactively, serialize refreshes, and stop retrying dead credentials forever.

If you want to see how these connection problems fit into the bigger integration architecture, the most relevant companion piece is multi-tenant integration architecture. If you are working with specific providers, the current docs pages for HubSpot and Salesforce are the better place to look for provider-facing setup details.

The takeaway

OAuth token management fails in production for very predictable reasons:

  • tokens expire during background work
  • refreshes race
  • revocations look like generic failures until you model them properly
  • provider differences leak into your infrastructure whether you wanted them to or not

The teams that stay asleep at 3am are not the teams with the most OAuth code. They are the teams with the clearest token lifecycle model.

Treat tokens like long-lived integration state, not like session cookies with better marketing, and your system gets much easier to reason about.

Need OAuth connection management without building the whole control plane yourself? Join Meshes and keep token handling, delivery, and workspace-scoped connections in one system.