
Encrypting OAuth tokens at rest in a Node tax app (AES-256-GCM)

Why provider-level encryption at rest is not enough for OAuth tokens, and how to add an AES-256-GCM envelope with a zero-downtime backfill in Node and TypeScript.

Solomon Amos · 3 July 2026 · 10 min read

By Solomon Amos, Founder of TapTax

When you integrate a third-party OAuth provider, the access and refresh tokens you store are not "just strings in a column." They are bearer credentials. Whoever holds a live access token can act as your user against that provider until it expires, and whoever holds the refresh token can keep minting new access tokens more or less indefinitely. For a UK tax app talking to HMRC's Making Tax Digital APIs, those tokens let software read and write a person's tax affairs. That raises the stakes from "leaked secret" to "leaked secret with regulatory and financial blast radius."

Most teams lean on one layer of defence here: the database provider encrypts data at rest. That is necessary and good, but it protects against one specific threat, which is someone walking off with the physical disk. It does nothing if an attacker gets read access to the live database, through a leaked connection string, an over-broad service role, a SQL injection, or a backup that ends up in the wrong bucket. In every one of those cases, provider-level "encryption at rest" hands the attacker plaintext, because the database decrypts transparently for anyone who can query it.

So the question this article answers is narrow and practical: how do you add a second, application-layer encryption envelope around OAuth tokens in a Node and TypeScript service, using AES-256-GCM, without taking downtime and without a flag-day migration of your existing rows? I will use the implementation pattern from TapTax, the MTD app I build, as the worked example. The code is real, but the pattern is provider-agnostic: it applies to any Node service holding Stripe, Google, GitHub, or HMRC tokens.

Why AES-256-GCM, and why an envelope

The first decision is the cipher. The defensible default in 2026 for symmetric encryption of small secrets is AES-256 in GCM (Galois/Counter Mode). GCM is an authenticated encryption mode, which matters more than people expect. It produces not just ciphertext but a 16-byte authentication tag, and on decryption it verifies that tag before giving you any plaintext back. The practical effect: if the ciphertext or the key is wrong, or a byte has been flipped in storage, decryption throws instead of silently returning garbage. You get tamper-evidence for free, which for a credential store is exactly what you want.

GCM has one rule you cannot break: never reuse a nonce with the same key. Nonce reuse in GCM is catastrophic, not merely weakening, so the nonce has to be unique per encryption. The standard answer is a fresh 12-byte (96-bit) random nonce for every single encrypt call, which is also the size NIST recommends for GCM. Twelve random bytes per token write is cheap, and the birthday-bound math is comfortable at the volume a per-user token store generates.
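To put a rough number on "comfortable", here is a small sketch of the birthday approximation for random 96-bit nonces. The function name and the volumes are hypothetical, picked only for illustration. For reference, NIST SP 800-38D caps random-nonce GCM at 2^32 encryptions under a single key.

```typescript
// Birthday approximation: with n random nonces of b bits under one key,
// P(at least one repeat) is roughly n^2 / 2^(b + 1).
function nonceCollisionProbability(encryptions: number, nonceBits: number = 96): number {
  return (encryptions * encryptions) / Math.pow(2, nonceBits + 1);
}

// Hypothetical volumes, chosen only for illustration.
const atOneMillionWrites = nonceCollisionProbability(1e6);
const atNistLimit = nonceCollisionProbability(Math.pow(2, 32));

console.log(atOneMillionWrites.toExponential(2));
console.log(atNistLimit.toExponential(2));
```

A million token writes under one key lands around 6e-18, and even at the 2^32 ceiling the estimate is 2^-33. A per-user token store that refreshes a few times a day is nowhere near either figure.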

That gives the shape of the stored value. You do not store bare ciphertext, because to decrypt you also need the nonce and the tag, and you want to know which scheme and key version produced it. So you store a self-describing envelope:

v1:<nonce_b64>:<authTag_b64>:<ciphertext_b64>

The v1 prefix is doing real work. It is a version and key tag, so that the day you rotate keys or change the scheme, old envelopes are still identifiable and decryptable while new writes use the new scheme. It also gives you a cheap, unambiguous test for "is this value already encrypted?", which becomes the backbone of the zero-downtime migration later.
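As a sketch of how the envelope breaks down, the parser below splits a stored value into its four fields and checks the sizes that v1 implies (12-byte nonce, 16-byte tag). The names parseEnvelope and Envelope are hypothetical and not part of the TapTax module; this is a stricter variant of an "is this encrypted?" test, shown under those assumptions.

```typescript
import { Buffer } from 'node:buffer';

interface Envelope {
  version: string;
  nonce: Buffer;      // 12 bytes for v1
  authTag: Buffer;    // 16 bytes for v1
  ciphertext: Buffer; // same length as the plaintext
}

// Returns the decoded parts, or null if the value is not a well-formed v1 envelope.
function parseEnvelope(value: string): Envelope | null {
  const parts = value.split(':');
  if (parts.length !== 4 || parts[0] !== 'v1') return null;
  const [version, nonceB64, tagB64, ctB64] = parts;
  const nonce = Buffer.from(nonceB64, 'base64');
  const authTag = Buffer.from(tagB64, 'base64');
  if (nonce.length !== 12 || authTag.length !== 16) return null;
  return { version, nonce, authTag, ciphertext: Buffer.from(ctB64, 'base64') };
}

// A hand-built sample envelope: all-zero nonce and tag, purely to exercise the parser.
const sampleEnvelope = [
  'v1',
  Buffer.alloc(12).toString('base64'),
  Buffer.alloc(16).toString('base64'),
  Buffer.from('hi', 'utf8').toString('base64'),
].join(':');

const parsedSample = parseEnvelope(sampleEnvelope);
const parsedLegacy = parseEnvelope('legacy-plaintext-token');
```

Standard base64 never contains a colon, which is why a colon is a safe field separator here.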

The crypto module

Here is the core, using only Node's built-in node:crypto. No third-party crypto dependency, which is itself a security posture: fewer moving parts in the part of the system you least want surprises in.


import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const ALGORITHM = 'aes-256-gcm';
const KEY_BYTES = 32;   // AES-256
const NONCE_BYTES = 12; // 96-bit nonce, the recommended size for GCM
const VERSION_TAG = 'v1';

function getKey(): Buffer {
  const raw = process.env.TOKEN_ENCRYPTION_KEY;
  if (!raw) {
    throw new Error('TOKEN_ENCRYPTION_KEY is not set.');
  }
  const key = Buffer.from(raw.trim(), 'hex');
  if (key.length !== KEY_BYTES) {
    throw new Error(`TOKEN_ENCRYPTION_KEY must be ${KEY_BYTES} bytes (${KEY_BYTES * 2} hex chars).`);
  }
  return key;
}

export function isEncrypted(value: string | null | undefined): boolean {
  if (!value) return false;
  if (!value.startsWith(`${VERSION_TAG}:`)) return false;
  const parts = value.split(':');
  return parts.length === 4 && parts.every((p) => p.length > 0);
}

export function encrypt(plaintext: string): string {
  if (isEncrypted(plaintext)) return plaintext; // idempotent: never double-encrypt
  const nonce = randomBytes(NONCE_BYTES);
  const cipher = createCipheriv(ALGORITHM, getKey(), nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const authTag = cipher.getAuthTag();
  return [
    VERSION_TAG,
    nonce.toString('base64'),
    authTag.toString('base64'),
    ciphertext.toString('base64'),
  ].join(':');
}

export function decrypt(value: string): string {
  if (!isEncrypted(value)) return value; // legacy plaintext: pass through unchanged
  const [, nonceB64, tagB64, ctB64] = value.split(':');
  const decipher = createDecipheriv(ALGORITHM, getKey(), Buffer.from(nonceB64, 'base64'));
  decipher.setAuthTag(Buffer.from(tagB64, 'base64'));
  // final() throws if the auth tag does not verify: wrong key or tampered ciphertext.
  return Buffer.concat([decipher.update(Buffer.from(ctB64, 'base64')), decipher.final()]).toString('utf8');
}

Three details earn their place here.

First, the key is read lazily inside getKey(), not at module load. In a typical Node service the env is populated by dotenv (or your platform) before the first request but possibly after modules are imported. Reading the key on first use, not on import, avoids a class of "works locally, throws on boot in prod" surprises. The key itself is 64 hex characters decoding to 32 bytes, generated once with openssl rand -hex 32 and stored in your secret manager, never in the repo.

Second, getKey() throws if the key is missing or the wrong length. There is deliberately no fallback to storing plaintext. A silent fallback is how a security control quietly becomes a no-op six months after you ship it. Failing loud is the correct behaviour for a credential store.

Third, encrypt() is idempotent by way of isEncrypted(). Calling it on an already-encrypted envelope returns it untouched. That single property is what makes the migration below safe to run more than once, and safe to interleave with live writes.
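To see the authenticated-encryption behaviour concretely, here is a condensed, self-contained sketch of the same scheme. It takes the key as a parameter instead of reading TOKEN_ENCRYPTION_KEY, and sealDemo, openDemo, and throwsOnCall are hypothetical names used only for this demo.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Throwaway key for the demo only; never generate the real key at runtime like this.
const demoKey = randomBytes(32);

function sealDemo(plaintext: string, key: Buffer): string {
  const nonce = randomBytes(12); // fresh nonce on every call
  const cipher = createCipheriv('aes-256-gcm', key, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const authTag = cipher.getAuthTag();
  return ['v1', nonce.toString('base64'), authTag.toString('base64'), ciphertext.toString('base64')].join(':');
}

function openDemo(envelope: string, key: Buffer): string {
  const [, nonceB64, tagB64, ctB64] = envelope.split(':');
  const decipher = createDecipheriv('aes-256-gcm', key, Buffer.from(nonceB64, 'base64'));
  decipher.setAuthTag(Buffer.from(tagB64, 'base64'));
  return Buffer.concat([decipher.update(Buffer.from(ctB64, 'base64')), decipher.final()]).toString('utf8');
}

function throwsOnCall(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

const demoEnvelope = sealDemo('example-refresh-token', demoKey);
const roundTrip = openDemo(demoEnvelope, demoKey);

// Flip one bit of the stored ciphertext to simulate corruption or tampering.
const demoParts = demoEnvelope.split(':');
const flipped = Buffer.from(demoParts[3], 'base64');
flipped[0] ^= 0x01;
const tamperedEnvelope = [demoParts[0], demoParts[1], demoParts[2], flipped.toString('base64')].join(':');

const tamperDetected = throwsOnCall(() => openDemo(tamperedEnvelope, demoKey));
const wrongKeyDetected = throwsOnCall(() => openDemo(demoEnvelope, randomBytes(32)));
// Same plaintext, same key, different envelope each time because the nonce is fresh.
const envelopesDiffer = sealDemo('same-input', demoKey) !== sealDemo('same-input', demoKey);
```

Both failure cases throw from final() rather than returning corrupted plaintext, which is the property the module relies on.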

One chokepoint, not scattered call sites

The encryption is only as good as your discipline in applying it. If twelve different modules write to the tokens table, you will eventually have a thirteenth that forgets to encrypt. The pattern that holds up is to funnel every read and write of the sensitive columns through one module, and encrypt and decrypt there, transparently.

function decryptRow(row: HmrcConnectionsRow): HmrcConnectionsRow {
  return { ...row, access_token: decrypt(row.access_token), refresh_token: decrypt(row.refresh_token) };
}

// On write (upsert / refresh): encrypt and stamp the scheme version.
await db.from('hmrc_connections').upsert({
  user_id: userId,
  access_token: encrypt(data.accessToken),
  refresh_token: encrypt(data.refreshToken),
  token_enc_version: 1,
  // ...
});

// On read: decrypt before returning, so callers get usable Bearer tokens
// and need no crypto awareness of their own.
return decryptRow(connection);

Callers (the middleware that attaches the Bearer token to an outbound HMRC call, the refresh routine, and so on) stay blissfully unaware that anything is encrypted. They ask for a connection and get plaintext tokens back. This is the single most important architectural decision in the whole exercise, because it converts "remember to encrypt everywhere" into "you physically cannot reach the column except through the place that encrypts."
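One way to make the chokepoint concrete is a small store class that owns both directions. This is a sketch under assumptions, not the TapTax code: ConnectionStore, TokenCodec, and the in-memory Map standing in for the table are all hypothetical, and the placeholder codec is deliberately not real encryption so the example stays dependency-free.

```typescript
// Hypothetical shapes for illustration; the real table has more columns.
interface TokenCodec {
  encrypt(plaintext: string): string;
  decrypt(stored: string): string;
}
interface ConnectionTokens { accessToken: string; refreshToken: string; }
interface StoredRow { access_token: string; refresh_token: string; token_enc_version: number; }

class ConnectionStore {
  private readonly codec: TokenCodec;
  private readonly rows: Map<string, StoredRow>; // stand-in for the hmrc_connections table

  constructor(codec: TokenCodec, rows: Map<string, StoredRow> = new Map()) {
    this.codec = codec;
    this.rows = rows;
  }

  // The only write path: tokens are encrypted before they reach storage.
  save(userId: string, tokens: ConnectionTokens): void {
    this.rows.set(userId, {
      access_token: this.codec.encrypt(tokens.accessToken),
      refresh_token: this.codec.encrypt(tokens.refreshToken),
      token_enc_version: 1,
    });
  }

  // The only read path: callers receive usable plaintext tokens.
  load(userId: string): ConnectionTokens | null {
    const row = this.rows.get(userId);
    if (!row) return null;
    return {
      accessToken: this.codec.decrypt(row.access_token),
      refreshToken: this.codec.decrypt(row.refresh_token),
    };
  }
}

// Placeholder codec so the sketch runs on its own. It is NOT encryption.
const placeholderCodec: TokenCodec = {
  encrypt: (plaintext) => `sealed(${plaintext})`,
  decrypt: (stored) => stored.slice('sealed('.length, -1),
};

const table = new Map<string, StoredRow>();
const store = new ConnectionStore(placeholderCodec, table);
store.save('user-1', { accessToken: 'access-abc', refreshToken: 'refresh-xyz' });
const loaded = store.load('user-1');
```

In a real service the codec would be the encrypt() and decrypt() pair from the crypto module, and nothing outside this class would hold a reference to the table.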

The zero-downtime backfill

Now the part most write-ups skip. You did not start with encryption. You have existing plaintext rows in production, and you cannot take the service down to convert them. The version tag makes this tractable.

The migration is two additive steps plus a script, with no flag day:

  1. Add a token_enc_version column with a safe default. It records the scheme per row: 0 means legacy plaintext, 1 means AES-256-GCM v1.

ALTER TABLE public.hmrc_connections
  ADD COLUMN IF NOT EXISTS token_enc_version smallint NOT NULL DEFAULT 0;

  2. Deploy the code above. From this moment, every new write is encrypted and stamped version 1, and every read still works for old rows because decrypt() passes any non-envelope value straight through. New tokens are protected immediately. Old plaintext rows keep working. There is no window where reads break.

  3. Run a one-off, idempotent backfill that re-encrypts the rows still at version 0. Because encrypt() skips anything already in v1: envelope form, the script is safe to run repeatedly, and safe to run while the app keeps writing.

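for (const row of rows) {
  if (isEncrypted(row.access_token) && isEncrypted(row.refresh_token)) continue; // already done
  // The version guard in the WHERE clause skips any row the app re-wrote (and encrypted)
  // after this script read it, so a stale snapshot never overwrites freshly refreshed tokens.
  await client.query(
    'UPDATE hmrc_connections SET access_token = $1, refresh_token = $2, token_enc_version = 1 WHERE id = $3 AND token_enc_version = 0',
    [encrypt(row.access_token), encrypt(row.refresh_token), row.id],
  );
}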

Run it with a --dry-run mode first that reports what it would change without writing, then for real. End the script with a verification query that counts rows by scheme and asserts zero remaining plaintext, so "the migration finished" is an observed fact, not a hope:

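SELECT
  count(*)                                               AS total,
  count(*) FILTER (WHERE access_token  LIKE 'v1:%')      AS access_v1,
  count(*) FILTER (WHERE access_token  NOT LIKE 'v1:%')  AS access_plaintext,
  count(*) FILTER (WHERE refresh_token LIKE 'v1:%')      AS refresh_v1,
  count(*) FILTER (WHERE refresh_token NOT LIKE 'v1:%')  AS refresh_plaintext
FROM hmrc_connections;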

The order is the whole trick: column first, code second, data last. Each step is independently safe, nothing is destructive, and at no point is there a moment where the running app cannot read a token it needs.
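The dry-run mode mentioned above could be sketched roughly like this. Everything here is an assumption for illustration: runBackfill, BackfillReport, and the injected encrypt and write callbacks are hypothetical, and the write is synchronous only to keep the example short, where a real script would await a database call.

```typescript
interface TokenRow { id: number; access_token: string; refresh_token: string; }
interface BackfillReport { wouldUpdate: number[]; alreadyEncrypted: number[]; written: number; }

// Same test as the article's isEncrypted(), repeated so the sketch is self-contained.
function looksEncrypted(value: string): boolean {
  const parts = value.split(':');
  return parts.length === 4 && parts[0] === 'v1' && parts.every((p) => p.length > 0);
}

function runBackfill(
  rows: TokenRow[],
  encrypt: (plaintext: string) => string,
  write: (id: number, accessToken: string, refreshToken: string) => void,
  dryRun: boolean,
): BackfillReport {
  const report: BackfillReport = { wouldUpdate: [], alreadyEncrypted: [], written: 0 };
  for (const row of rows) {
    if (looksEncrypted(row.access_token) && looksEncrypted(row.refresh_token)) {
      report.alreadyEncrypted.push(row.id);
      continue;
    }
    report.wouldUpdate.push(row.id);
    if (dryRun) continue; // report only, touch nothing
    write(row.id, encrypt(row.access_token), encrypt(row.refresh_token));
    report.written += 1;
  }
  return report;
}

// Hypothetical rows: one legacy plaintext, one already converted.
const sampleRows: TokenRow[] = [
  { id: 1, access_token: 'legacy-access', refresh_token: 'legacy-refresh' },
  { id: 2, access_token: 'v1:bm9uY2U=:dGFn:Y3Q=', refresh_token: 'v1:bm9uY2U=:dGFn:Y3Q=' },
];
const writes: number[] = [];
const fakeEncrypt = (plaintext: string): string => `v1:bm9uY2U=:dGFn:${plaintext}`; // placeholder, not encryption
const recordWrite = (id: number): void => { writes.push(id); };

const dryRunReport = runBackfill(sampleRows, fakeEncrypt, recordWrite, true);
const writesAfterDryRun = writes.length;
const realReport = runBackfill(sampleRows, fakeEncrypt, recordWrite, false);
```

Comparing the dry-run report with the real one is a cheap sanity check: the set of rows it planned to touch should match the set it actually wrote.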

Key management is the rest of the job

Encryption moves the problem rather than removing it: the secret is no longer the tokens, it is the key. A few rules that have served well. Keep the key out of the repo and out of application config, in a real secret manager or platform secret store. Treat the key and the database as separate blast radii, so a database compromise alone yields only ciphertext, which is the entire point of doing this on top of provider-level encryption. Wire a startup self-check that does one encrypt and decrypt round-trip and refuses to boot if it fails, so a misconfigured or rotated key is caught at deploy, not at the first user's filing. And design for rotation now, even if you defer doing it: the v1 tag is your seam, because a future v2 key can encrypt all new writes while decrypt() still reads v1 envelopes, letting you migrate lazily on write or with the same backfill pattern.
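To illustrate the rotation seam and the startup self-check together, here is a minimal sketch of a version-keyed key ring. The names KeyRing, encryptWith, decryptAny, and assertKeyRingWorks are hypothetical, as is the v2 tag; the keys are random throwaways where a real service would load them from its secret store.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

type KeyRing = Map<string, Buffer>; // version tag -> 32-byte key

function keyFor(keys: KeyRing, version: string): Buffer {
  const key = keys.get(version);
  if (!key) throw new Error(`No key configured for envelope version "${version}".`);
  return key;
}

function encryptWith(keys: KeyRing, version: string, plaintext: string): string {
  const nonce = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', keyFor(keys, version), nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const authTag = cipher.getAuthTag();
  return [version, nonce.toString('base64'), authTag.toString('base64'), ciphertext.toString('base64')].join(':');
}

// The version prefix selects the key, so old and new envelopes coexist in one column.
function decryptAny(keys: KeyRing, envelope: string): string {
  const [version, nonceB64, tagB64, ctB64] = envelope.split(':');
  const decipher = createDecipheriv('aes-256-gcm', keyFor(keys, version), Buffer.from(nonceB64, 'base64'));
  decipher.setAuthTag(Buffer.from(tagB64, 'base64'));
  return Buffer.concat([decipher.update(Buffer.from(ctB64, 'base64')), decipher.final()]).toString('utf8');
}

// Startup self-check: one round trip with the current key, throwing on any failure.
function assertKeyRingWorks(keys: KeyRing, currentVersion: string): void {
  const probe = 'startup-self-check';
  if (decryptAny(keys, encryptWith(keys, currentVersion, probe)) !== probe) {
    throw new Error('Token encryption self-check failed.');
  }
}

// Throwaway demo keys.
const ring: KeyRing = new Map<string, Buffer>([
  ['v1', randomBytes(32)],
  ['v2', randomBytes(32)],
]);
assertKeyRingWorks(ring, 'v2');

const oldEnvelope = encryptWith(ring, 'v1', 'token-written-before-rotation');
const newEnvelope = encryptWith(ring, 'v2', 'token-written-after-rotation');
const oldPlain = decryptAny(ring, oldEnvelope);
const newPlain = decryptAny(ring, newEnvelope);
```

Calling assertKeyRingWorks() during boot turns a missing or malformed key into a failed deploy. Adopting this would also mean teaching isEncrypted() to recognise the new prefix, since the version in the article only matches v1:.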

Two layers (provider-level at-rest encryption plus an application-level AES-256-GCM envelope your database never sees the key for) is a meaningful jump in posture for a modest amount of code. If you are holding third-party OAuth tokens, especially for anything financial or regulated, the bar should be that a snapshot of your database, on its own, is useless to whoever takes it.

We file Making Tax Digital submissions to HMRC at TapTax, so this is the pattern behind how our MTD integration handles HMRC credentials, but none of it is specific to tax: swap HMRC for any OAuth provider and the design holds.


Solomon Amos is the founder of TapTax, a Making Tax Digital app for UK sole traders. He built TapTax's HMRC integration, spent two years embedded at HMRC's digital programmes, and holds a PhD in machine learning. linkedin.com/in/solomonudoh
