Credit Debit — Scalable Design Plan · Billing

One way to spend credits, everywhere. Built to last past the Month-1 launch.

Status: proposal, 2026-04-18
Owner: Dipanshu
Target: launch-ready in 7 days; supports 10k+ tenants without a rewrite

1. Why this exists

Today, three services (blog, content, brain) each re-implement credit debits with subtly different SQL, different error codes, and different failure modes. Newsletter and communication don't debit at all. The manager exposes /billing/internal/credits/debit that nobody calls. This split-brain is why credits drift from reality.

Single invariant we must uphold forever:

For every tenant, at any instant: balance == sum(credit_transactions.amount). No race, no retry, no crash, no streaming timeout may violate this.

Everything below serves that invariant.

2. Three consumption patterns (all we will ever need)

Pattern	Example	Shape
Fixed charge	blog publish, content schedule, email send	one atomic debit before the action; compensating refund on downstream failure
Hold + capture/void	AI chat streaming, content generation	pre-authorize max cost → run → capture actual, refund difference; or void on error
Bulk metered	newsletter blast (10 000 sends)	loop of fixed charges, halts on first insufficient-balance

Every future feature will map to one of these. We will not add a fourth pattern without consensus.

3. Public API

New package: @repo/core-billing/ledger.

import { CreditLedger } from "@repo/core-billing"
 
const ledger = new CreditLedger(db)   // thin wrapper around a Drizzle client
 
// ─── Pattern 1: Fixed charge ───────────────────────────────────────
await ledger.charge({
  tenantId,
  amount: CREDIT_COSTS.BLOG_POST_PUBLISH,
  reason: "blog.post.publish",
  description: `Published post ${postId}`,
  referenceId: postId,          // external ref for reconciliation
  idempotencyKey: `publish:${postId}`,   // dedup same request
})
// → { txId, balance } | throws InsufficientCreditsError | PlanInactiveError
 
// ─── Pattern 2: Hold / capture / void ──────────────────────────────
const hold = await ledger.hold({
  tenantId,
  maxAmount: 50,                 // generous upper bound
  reason: "ai.chat",
  ttlSeconds: 300,               // cron voids if not settled
  idempotencyKey: `chat:${messageId}`,
})
 
try {
  const { tokens } = await runStream()
  await ledger.capture({
    holdId: hold.holdId,
    finalAmount: Math.ceil(tokens / 1000),
    description: `${tokens} tokens`,
  })
} catch (err) {
  await ledger.void(hold.holdId)
  throw err
}
 
// ─── Pattern 3: Bulk ───────────────────────────────────────────────
for (const recipient of batch) {
  const idempotencyKey = `send:${campaignId}:${recipient.id}`
  try {
    await ledger.charge({
      tenantId,
      amount: CREDIT_COSTS.EMAIL_SEND,
      reason: "email.send",
      referenceId: recipient.id,
      idempotencyKey,
    })
  } catch (err) {
    if (err instanceof InsufficientCreditsError) {
      markCampaignHalted(campaignId)   // out of credits: stop the blast
      break
    }
    markFailed(recipient.id)           // charge failed, nothing was debited, so no refund
    continue
  }
  try {
    await actuallySend(recipient)
  } catch (err) {
    await ledger.refund({ idempotencyKey })   // send failed after a successful debit
    markFailed(recipient.id)
  }
}
 
// ─── Compensating refund ───────────────────────────────────────────
await ledger.refund({ txId })                 // by transaction
await ledger.refund({ idempotencyKey: ... })  // by key (for retries)
 
// ─── Read-only ─────────────────────────────────────────────────────
await ledger.balance(tenantId)

Error model (all typed, all serializable)

class InsufficientCreditsError  { code: "INSUFFICIENT_CREDITS"; required, balance }
class PlanInactiveError         { code: "PLAN_INACTIVE"; status }  // canceled/past_due
class HoldNotFoundError         { code: "HOLD_NOT_FOUND" }
class HoldExpiredError          { code: "HOLD_EXPIRED" }
class IdempotencyConflictError  { code: "IDEMPOTENCY_CONFLICT" }  // same key, diff body

Every service surfaces these as HTTP 402 (INSUFFICIENT_CREDITS) or 403 (PLAN_INACTIVE). No more string matching on "Insufficient".
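
A minimal sketch of how two of these errors could be declared and mapped to a status code. The field names follow the listing above; the helper httpStatusFor is hypothetical, and it returns undefined for codes whose status this plan does not specify. Matching on code rather than on instanceof is a choice made here so the mapping also works on an error that has been serialized and parsed by another service.

```typescript
class InsufficientCreditsError extends Error {
  readonly code = "INSUFFICIENT_CREDITS"
  constructor(readonly required: number, readonly balance: number) {
    super(`Need ${required} credits, balance is ${balance}`)
  }
}

class PlanInactiveError extends Error {
  readonly code = "PLAN_INACTIVE"
  constructor(readonly status: "canceled" | "past_due") {
    super(`Plan is ${status}`)
  }
}

// Hypothetical helper: only the 402 and 403 mappings come from this plan.
function httpStatusFor(err: unknown): number | undefined {
  const code = (err as { code?: unknown } | null)?.code
  if (code === "INSUFFICIENT_CREDITS") return 402
  if (code === "PLAN_INACTIVE") return 403
  return undefined // status for the remaining codes is not specified here
}
```

JSON.stringify on either error yields its code plus its data fields, which is what "serializable" is taken to mean in this sketch.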

4. Storage model

Schema changes (additive, no breaking migrations)

-- credit_transactions gets hold fields (nullable for non-hold rows)
ALTER TABLE credit_transactions
  ADD COLUMN tx_status text NOT NULL DEFAULT 'settled',   -- settled|held|voided|refunded
  ADD COLUMN hold_id text,                                 -- FK back to parent hold tx
  ADD COLUMN hold_expires_at timestamptz,
  ADD COLUMN idempotency_key text;
 
-- Dedup enforcement: one debit per (tenant, key)
CREATE UNIQUE INDEX credit_tx_idempotency_idx
  ON credit_transactions (tenant_id, idempotency_key)
  WHERE idempotency_key IS NOT NULL;
 
-- Cron scans expired holds
CREATE INDEX credit_tx_hold_expiry_idx
  ON credit_transactions (hold_expires_at)
  WHERE tx_status = 'held';

Why one table, not separate `credit_holds`

Simpler queries, one audit source, every row is a real ledger entry. A hold is just a debit with tx_status='held'; capture flips to settled with a true-up delta row; void flips to voided with a refund row. The sum invariant still holds.
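
The row mechanics above can be sketched in memory to check that the sum never drifts. This is an illustration, not the package's implementation: the TxRow shape, the helper names, the status given to the refund row, and the rule that finalAmount may not exceed the held amount are all assumptions.

```typescript
type TxStatus = "settled" | "held" | "voided" | "refunded"
interface TxRow { id: string; amount: number; txStatus: TxStatus; holdId?: string }

const ledgerSum = (rows: TxRow[]): number => rows.reduce((s, r) => s + r.amount, 0)

// Hold: a debit row for the full upper bound.
function placeHold(rows: TxRow[], id: string, maxAmount: number): void {
  rows.push({ id, amount: -maxAmount, txStatus: "held" })
}

function findHeld(rows: TxRow[], holdId: string): TxRow {
  const row = rows.filter((r) => r.id === holdId && r.txStatus === "held")[0]
  if (!row) throw new Error("HOLD_NOT_FOUND")
  return row
}

// Capture: flip to settled, then append a true-up row returning the unused part.
function captureHold(rows: TxRow[], holdId: string, finalAmount: number): void {
  const row = findHeld(rows, holdId)
  const held = -row.amount
  if (finalAmount < 0 || finalAmount > held) throw new Error("finalAmount outside hold")
  row.txStatus = "settled"
  const unused = held - finalAmount
  if (unused > 0) rows.push({ id: `${holdId}:trueup`, amount: unused, txStatus: "settled", holdId })
}

// Void: flip to voided, then append a refund row for the full held amount.
function voidHold(rows: TxRow[], holdId: string): void {
  const row = findHeld(rows, holdId)
  row.txStatus = "voided"
  rows.push({ id: `${holdId}:refund`, amount: -row.amount, txStatus: "refunded", holdId })
}
```

Starting from a grant of 100, a hold of 50 brings the sum to 50, capturing 12 brings it to 88, and a second hold of 50 that is voided leaves it at 88.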

5. Atomic primitives

All three patterns compose from two SQL primitives, each executed over neon-http as a single-statement compare-and-set (CAS). No interactive transactions (not available on neon-http) and no SELECT FOR UPDATE (contention).

Primitive A — conditional debit:

UPDATE tenant_credits
   SET balance = balance - $1, updated_at = now()
 WHERE tenant_id = $2
   AND balance >= $1
RETURNING balance;

0 rows → InsufficientCreditsError.

Primitive B — unconditional delta (for refunds, captures-within-hold):

UPDATE tenant_credits
   SET balance = balance + $1, updated_at = now()
 WHERE tenant_id = $2
RETURNING balance;

Everything else is log lines (credit_transactions inserts) and state flips. No other SQL shapes allowed in the ledger package.
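
To make the semantics of the two primitives concrete, here is an in-memory model. It is a sketch under assumptions: a plain object stands in for the tenant_credits table, null stands in for "0 rows returned", and the function names are hypothetical. In Postgres the same outcome under concurrency comes from the row lock the UPDATE takes; this model only runs sequentially.

```typescript
type Balances = Record<string, number>

// Primitive A: debit only when balance >= amount.
function conditionalDebit(balances: Balances, tenantId: string, amount: number): number | null {
  const current = balances[tenantId]
  if (current === undefined || current < amount) return null // 0 rows
  balances[tenantId] = current - amount
  return current - amount
}

// Primitive B: unconditional delta, used for refunds and hold true-ups.
function applyDelta(balances: Balances, tenantId: string, delta: number): number | null {
  const current = balances[tenantId]
  if (current === undefined) return null // unknown tenant: 0 rows
  balances[tenantId] = current + delta
  return current + delta
}
```

With a balance of 10, two debits of 10 yield 0 and then null, which is the "at most one wins" behaviour at the balance boundary.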

6. Failure modes and mitigations

Scenario	What happens	Mitigation
Debit succeeds, then the caller's work fails	Ledger decremented, work not done	Caller catches the failure and calls `refund({ txId })`, which is idempotent
Same HTTP request retried by client	Two debit attempts	`idempotency_key` unique index → second returns cached tx, no double-debit
Razorpay webhook delivered twice	Double credit-pack credit	Use `processed_payment_events` and `idempotency_key = "rzp:${eventId}"`
Streaming response aborts mid-flight	Hold stranded	Cron sweeps `tx_status='held' AND hold_expires_at < now()`, runs `void()`
Concurrent debits at exact balance boundary	Race	Primitive A's `WHERE balance >= $1` → at most one wins, rest get 402
DB temporarily unavailable	Request fails	Return 503, no partial state; caller retries with same idempotency_key
Plan canceled while user mid-action	Should block new spend	Ledger checks `subscriptions.status` first; rejects with `PlanInactiveError`
System tenant (OTP emails, onboarding)	Shouldn't pay	Reserved tenant IDs bypass ledger: `SYSTEM_TENANT_IDS = new Set(["system"])`
Admin comps a user	Need audit trail	`ledger.adjust({ tenantId, amount, reason: "admin.adjustment", actor })` — separate method, logs actor
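
The retry row in the table above relies on idempotency-key replay. A minimal in-memory sketch of that behaviour follows; the two objects stand in for tenant_credits and the (tenant_id, idempotency_key) unique index, and the body-comparison rule and all names other than the error code are assumptions.

```typescript
interface ChargeRequest { tenantId: string; amount: number; reason: string; idempotencyKey: string }
interface ChargeResult { txId: string; balance: number }

class IdempotencyConflictError extends Error {
  readonly code = "IDEMPOTENCY_CONFLICT"
}

// In-memory stand-ins, for illustration only.
const creditBalances: Record<string, number> = { ls_abc: 100 }
const seenKeys: Record<string, { body: string; result: ChargeResult }> = {}
let nextTxNumber = 1

function chargeOnce(req: ChargeRequest): ChargeResult {
  const slot = `${req.tenantId}|${req.idempotencyKey}`
  const body = JSON.stringify([req.amount, req.reason])
  const prior = seenKeys[slot]
  if (prior) {
    // Same key, different body: refuse rather than guess which one was meant.
    if (prior.body !== body) throw new IdempotencyConflictError("same key, different body")
    return prior.result // replay: return the cached tx, no second debit
  }
  const current = creditBalances[req.tenantId]
  if (current === undefined || current < req.amount) throw new Error("INSUFFICIENT_CREDITS")
  creditBalances[req.tenantId] = current - req.amount
  const result = { txId: `ct_${nextTxNumber++}`, balance: current - req.amount }
  seenKeys[slot] = { body, result }
  return result
}
```

Calling chargeOnce twice with the same request debits once and returns the same txId both times.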

7. Observability (non-negotiable)

Every call to the ledger emits exactly one structured log entry:

{
  "event": "credit.tx",
  "tenant_id": "ls_abc",
  "op": "charge|hold|capture|void|refund|adjust",
  "reason": "blog.post.publish",
  "amount": -10,
  "balance_after": 490,
  "tx_id": "ct_...",
  "idempotency_key": "publish:post_xyz",
  "reference_id": "post_xyz",
  "latency_ms": 12
}

Cloudflare Workers logs → Logpush → analytics destination of choice.
Build one dashboard card: sum(credit_transactions.amount) GROUP BY tenant_id vs tenant_credits.balance — any drift ≥ 1 credit pages oncall.
Nightly reconciliation job re-computes balance from tx log and alerts on delta.
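
The drift check behind the dashboard card and the nightly job can be sketched as a pure function. The input shapes and the name findDrift are assumptions; a real job would read both sides from the database. Because credits are integers, any non-zero delta is a drift of at least 1 credit.

```typescript
interface LedgerRow { tenantId: string; amount: number }
interface Drift { tenantId: string; balance: number; ledgerSum: number; delta: number }

// Compares the materialized balance with the sum of ledger rows per tenant.
function findDrift(balances: Record<string, number>, rows: LedgerRow[]): Drift[] {
  const sums: Record<string, number> = {}
  for (const row of rows) sums[row.tenantId] = (sums[row.tenantId] || 0) + row.amount

  // Check tenants present on either side, so a missing balance row also shows up.
  const tenants = Object.keys({ ...sums, ...balances }).sort()
  const drift: Drift[] = []
  for (const tenantId of tenants) {
    const balance = balances[tenantId] || 0
    const ledgerSum = sums[tenantId] || 0
    if (balance !== ledgerSum) drift.push({ tenantId, balance, ledgerSum, delta: balance - ledgerSum })
  }
  return drift
}
```

An empty result means the invariant from §1 holds for every tenant in the input.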

8. Scale & performance budget

Metric	Target at launch	Target at 10k tenants	Mitigation when hit
Median debit latency	< 30 ms	< 50 ms	already fine — one UPDATE + one INSERT
p99 debit latency	< 150 ms	< 250 ms	cache `subscriptions.status` in KV (5 s TTL) to skip a SELECT
Concurrent debits per tenant	10/s	100/s for mega-tenants	the implicit row-level lock taken by the UPDATE on the `tenant_credits` row is enough; add a sharded balance only if one tenant exceeds 1000/s
`credit_transactions` growth	~100k rows/day (~3M rows/month)	~10M rows/month	partition by `created_at` month after 100M rows
Balance read frequency	every UI load	dashboards + API polling	cache in Workers KV with 5 s TTL; writes invalidate via tag

We do not pre-optimize. We measure at 1k tenants and revisit.
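
The 5 s caches in the table can be sketched as a read-through cache with an injectable clock. This is a simplified in-memory model with hypothetical names; real KV reads are asynchronous. It stores the expiry time next to the value and checks it on read, rather than relying on a store-level TTL. Workers KV documents a 60-second minimum for expirationTtl, so it is worth confirming how a 5 s window would be enforced there.

```typescript
interface CacheEntry<T> { value: T; expiresAtMs: number }

class TtlCache<T> {
  private entries: Record<string, CacheEntry<T>> = {}

  constructor(private ttlMs: number, private now: () => number = () => Date.now()) {}

  // Returns the cached value while fresh; otherwise calls load() and caches the result.
  get(key: string, load: () => T): T {
    const hit = this.entries[key]
    if (hit && hit.expiresAtMs > this.now()) return hit.value
    const value = load()
    this.entries[key] = { value, expiresAtMs: this.now() + this.ttlMs }
    return value
  }

  // Called on writes so the next read goes back to the source.
  invalidate(key: string): void {
    delete this.entries[key]
  }
}
```

A plan-status lookup would call get(tenantId, loadStatus) before each debit and invalidate(tenantId) whenever the subscription changes.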

9. Security & abuse

No client-side amount parameter — ever. Every charge reads cost from CREDIT_COSTS. Route handlers pick the constant, not the user.
Rate limit per tenant per reason at the gateway (already exists via x-gateway-key path). Cap ai.chat at 60 req/min/tenant free-tier, 600 pro.
Max hold amount capped per reason; no maxAmount: 100_000 for a chat call.
Refund RBAC: admin adjustments require system:owner permission; internal services cannot call adjust().
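
The first and third rules above can be enforced at the type and validation level. The sketch below is hedged: the prices and the per-reason cap are placeholder values, and MAX_HOLD, costFor, and checkedHoldAmount are hypothetical names, not part of the existing codebase.

```typescript
// Placeholder prices; the real list is CREDIT_COSTS in costs.ts.
const CREDIT_COSTS = { BLOG_POST_PUBLISH: 10, EMAIL_SEND: 1 } as const
type CostKey = keyof typeof CREDIT_COSTS

// Hypothetical per-reason ceilings for holds.
const MAX_HOLD: Record<string, number> = { "ai.chat": 50 }

// Route handlers pass a key, never a number taken from the request body.
function costFor(key: CostKey): number {
  return CREDIT_COSTS[key]
}

// Rejects holds with no configured cap, non-integer amounts, or amounts above the cap.
function checkedHoldAmount(reason: string, maxAmount: number): number {
  const cap = MAX_HOLD[reason]
  if (cap === undefined) throw new Error(`no hold cap configured for ${reason}`)
  if (Math.floor(maxAmount) !== maxAmount || maxAmount <= 0) {
    throw new Error("maxAmount must be a positive integer")
  }
  if (maxAmount > cap) throw new Error(`maxAmount ${maxAmount} exceeds cap ${cap} for ${reason}`)
  return maxAmount
}
```

Because costFor accepts only CostKey, a call such as costFor("FREE") fails to compile instead of silently charging nothing.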

10. Rollout plan (7 days)

Day	Task	Risk
1	Write `@repo/core-billing/ledger` package + `creditGuard` middleware + unit tests. No call sites touched.	Low
2	Schema migration (additive). Run in staging, verify zero impact on live writes.	Low
3	Drop-in replace blog-service/posts.ts publish with `ledger.charge` — the existing test suite proves equivalence.	Low
4	Migrate content-engine/mcp.route.ts (fixed charge) and brain-service/chat.route.ts (hold/capture). Brain customer-chat gets a hold too.	Medium — streaming path
5	Wire newsletter + communication (bulk pattern). Emit metrics.	Low
6	Add manager `scheduled()` cron: void expired holds, apply `pendingPlanId` downgrades, renew recurring addons, send trial-ending notices.	Medium
7	End-to-end sandbox: signup → trial → spend → renew → upgrade → downgrade → cancel. Tag bugs.	—

Rollback plan: each service migration is a separate PR behind a USE_LEDGER=true env flag. Flip back per-service if any regresses.

10a. Onboarding a new service in ≤ 5 minutes

Two paths — pick by pattern.

Path A — Fixed charge (use the middleware, 1 line)

For any route where "one action = one debit" (the 80% case: create order, generate report, export file), wrap the handler:

import { creditGuard } from "@repo/core-billing"
 
app.post("/orders",
  creditGuard({ cost: "STORE_ORDER_PROCESS", reason: "store.order.process" }),
  async (c) => {
    // handler runs only if debit succeeded; if handler throws,
    // middleware auto-refunds before returning the error to the caller.
  }
)

The middleware handles: reading cost from registry, atomic debit, idempotency key (defaults to request ID), refund-on-failure, plan-status gate, structured log line, 402 response on insufficient credits. A new developer cannot forget a step because there are no steps to remember.
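
The core of that behaviour (debit, run, refund on failure) can be sketched without a web framework. This is not the creditGuard implementation: withCreditGuard, the Ledger interface, and GuardContext are hypothetical, and the sketch takes an already resolved amount instead of reading the cost registry. The plan-status gate, logging, and the 402 response are left out.

```typescript
interface Ledger {
  charge(req: { tenantId: string; amount: number; reason: string; idempotencyKey: string }): Promise<{ txId: string; balance: number }>
  refund(req: { txId: string }): Promise<void>
}
interface GuardContext { tenantId: string; requestId: string }
type Handler<T> = (ctx: GuardContext) => Promise<T>

function withCreditGuard<T>(
  ledger: Ledger,
  opts: { amount: number; reason: string },
  handler: Handler<T>,
): Handler<T> {
  return async (ctx) => {
    // Debit first. If this throws, the handler never runs and nothing needs refunding.
    const { txId } = await ledger.charge({
      tenantId: ctx.tenantId,
      amount: opts.amount,
      reason: opts.reason,
      idempotencyKey: ctx.requestId, // default key: the request ID
    })
    try {
      return await handler(ctx)
    } catch (err) {
      // Compensating refund before the error reaches the caller.
      await ledger.refund({ txId })
      throw err
    }
  }
}
```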

Onboarding checklist for a new service wanting fixed charges:

Add cost: one line in packages/core-database/src/costs.ts
Add reason: one line in packages/core-billing/src/constants.ts
Wrap route: one line via creditGuard(...)

That's it. No ledger import, no try/catch, no SQL.

Path B — Streaming hold or bulk metered (use the ledger imperatively)

Middleware can't wrap SSE generators or per-iteration loops. For those, import CreditLedger and use hold/capture/void or a charge loop. The API in §3 is identical across services; one mental model.

What this means for scale

Every new service follows the same onboarding doc. No internal wiki page. No "ask Dipanshu how billing works." The compiler enforces typed reasons; the middleware enforces the failure pattern; the cron enforces hold cleanup. A junior dev shipping a new feature cannot accidentally:

skip the debit
double-debit on retry
debit without refunding on error
leak a hold
use a freeform reason string that breaks analytics

These invariants are properties of the package, not properties of the developer's discipline.
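
As an example of a property enforced by the package, reasons can be a closed union derived from one constant list. A sketch, assuming the list lives in constants.ts; the names CREDIT_REASONS, isCreditReason, and describeCharge are hypothetical, while the reason strings are the ones used in this document.

```typescript
const CREDIT_REASONS = [
  "blog.post.publish",
  "ai.chat",
  "email.send",
  "store.order.process",
  "admin.adjustment",
] as const

type CreditReason = (typeof CREDIT_REASONS)[number]

// Runtime guard for values that arrive as plain strings (config, queue messages).
function isCreditReason(value: string): value is CreditReason {
  return (CREDIT_REASONS as readonly string[]).indexOf(value) !== -1
}

function describeCharge(reason: CreditReason, amount: number): string {
  return `${reason}:${amount}`
}

describeCharge("email.send", 1)       // compiles
// describeCharge("email-send", 1)    // compile error: not assignable to CreditReason
```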

11. What we explicitly say "no" to

No saga / orchestrator. Compensating refund + idempotency is sufficient for our shapes.
No event-sourced ledger. tenant_credits materialized balance stays; credit_transactions is the immutable log. Best of both.
No cross-tenant debits (user-initiated transfers between workspaces). Out of scope.
No fractional credits. The column is an integer. One credit = smallest billable unit. Round up at true-up.
No per-seat credit allocation. Credits are workspace-scoped. Roles control who can spend, not how much.
No retiring CREDIT_COSTS. It is the authoritative price list — all costs live in packages/core-database/src/costs.ts.

12. Open questions (answer before Day 1)

Plan-status gate strictness. Should a past_due tenant still debit (to give UX runway) or hard-block? Recommendation: hard-block after 3-day grace.
Trial-period credit budget. During trial (no charge yet), do we dispense baseCredits upfront or a smaller trial allotment? Recommendation: dispense 20% of baseCredits at trial start; rest on first successful charge.
Refund window. Do we let users refund an action they performed? Recommendation: no user-facing refund API; only compensating refunds from services + admin adjustments.
Email-send cost = 1 credit → is that right? At 99 900 paise (Starter = ₹999) and 5 000 baseCredits, each credit is ₹0.20. One email at 1 credit ≈ ₹0.20 — fine margin over Resend cost. Confirm with finance.
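
The per-credit arithmetic in the last question, written out so finance can rerun it with other plan prices. The inputs are the figures quoted above; the Resend per-email cost is not stated in this document, so the margin itself is not computed here.

```typescript
// All amounts in paise (100 paise = ₹1).
const STARTER_PRICE_PAISE = 99900
const STARTER_BASE_CREDITS = 5000
const EMAIL_SEND_CREDITS = 1

const paisePerCredit = STARTER_PRICE_PAISE / STARTER_BASE_CREDITS
const emailPriceRupees = (EMAIL_SEND_CREDITS * paisePerCredit) / 100

console.log(paisePerCredit)               // 19.98
console.log(emailPriceRupees.toFixed(2))  // "0.20"
```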

Decisions land in this doc before code. Changes require a PR that updates this file alongside the ledger.

13. Appendix — why the current patterns don't scale

Current	Problem at 1k tenants	Problem at 10k
Each service writes its own SQL	3 bug surfaces to patch for every new policy	Unmaintainable
Fire-and-forget `creditTransactions` insert (brain, content)	Some tx log rows silently missing → drift alerts flap	Finance cannot reconcile
No idempotency	SSE reconnect / Razorpay retry double-debits	Support tickets compound
Manager `/billing/internal/credits/debit` RPC	Adds ~50 ms cross-Worker latency to hot path	p99 budget blown
No hold pattern for LLM	Brain's true-up can go negative	Free LLM spend = bleed

The ledger package addresses all five categories of problem within the 7-day rollout in §10.