logicspike/docs

Architecture

Database Strategy & Schema Architecture

Last Updated: 2026-02-28

1. High-Level Strategy: "Building vs Floor"

To scale LogicSpike as a multi-tenant platform, we categorize data into two levels.

🏢 Level 1: The Building (Public / Shared)

Data that exists outside any specific tenant. This is the Platform Infrastructure.

  • Database Location: public schema (default).
  • Access Pattern: Highly read-heavy, global integrity.
  • Examples:
    • Who are the users? (users)
    • Which tenants exist? (tenants)
    • Who owns what? (memberships)
    • What services are available? (services - e.g., Blog, Store)

🚪 Level 2: The Floor (Tenant Private)

Data that belongs exclusively to one tenant.

  • Database Location: Shared Database, but logically isolated via tenant_id (Discriminator Column).
  • Access Pattern: Strictly scoped. Every query MUST have WHERE tenant_id = ?.
  • Examples:
    • Blog Posts (posts)
    • Products (products)
    • Customers (customers)
    • API Keys (api_keys)
    • Media Files (media_files)

3. Deep Dive: Tenancy Model Comparison

We evaluated two primary patterns for LogicSpike.

Option A: Shared Schema (Selected)

All tenants share the same tables. Rows are distinguished by a tenant_id column.

Option B: Schema-per-Tenant

Each tenant has their own Postgres Schema (e.g., tenant_101.posts, tenant_102.posts).

The Comparison Matrix

Feature Shared Schema (Option A) Schema-per-Tenant (Option B)
Simplicity 🟢 High. Standard SQL. ORMs like Prism/Drizzle handle it easily. 🔴 Low. Need complex routing logic to switch schemas per request.
Migrations 🟢 Fast. Run ALTER TABLE once. 🔴 Slow & Risky. Must run ALTER TABLE 10,000 times for 10,000 tenants. If #500 fails, you have a split-brain state.
Connection Pooling 🟢 Efficient. One pool serves all. 🟠 Complex. Apps like PgBouncer struggle with thousands of schema search paths.
Data Isolation 🟠 Logical. Relies on WHERE tenant_id = ?. Bug in code = Data Leak. 🟢 Physical. Database engine enforces separation. Harder to leak data.
Performance (Small Scale) 🟢 Fast. 🟢 Fast.
Performance (Massive Scale) 🟠 Index Bloat. Indexes get huge. Needs Partitioning eventually. 🟢 Scalable. Each tenant's index is small.
Cross-Tenant Analytics 🟢 Easy. SELECT count(*) FROM posts. 🔴 Hard. Need an ETL pipeline to aggregate data from 10k schemas.
Cost 🟢 Low. Standard instances. 🟠 Medium. Higher RAM usage for schema metadata.
"Noisy Neighbor" 🔴 Risk. One huge tenant can slow down the DB for everyone. 🟠 Risk. Still shares CPU/IO, but indexes are separate.

Why we chose Shared Schema

  1. Operational Sanity: Managing migrations for 1000s of schemas is a DevOps nightmare for a startup team.
  2. Tooling Support: Drizzle ORM and Next.js work flawlessly with Shared Schema. Schema switching often requires hacks or complex middleware.
  3. Analytics: As an eCommerce/Blog platform, you will want to ask "How many posts were created today across the platform?". Shared schema makes this a 1ms query.

Mitigation for Shared Schema Risks

  • Row Level Security (RLS): We can enable Postgres RLS to enforce tenant_id checks at the database engine level, mimicking the isolation safety of Schema-per-Tenant.
  • Partitioning: When we hit 100M rows, we can partition the posts table by tenant_id.

4. The Enterprise Exception: Hybrid Tenancy (Handling "Astha")

You asked: "What if a large company (Astha) wants their own separate database?"

We handle this via a Hybrid Pattern.

The Strategy

99% of users stay in the Shared Pool (Public Cloud). 1% of users (Enterprise) get a Dedicated Database (Private Cloud).

Implementation

We modify the tenants table (Level 1) to include a "Router" config.

// Table: tenants
{
  id: "tenant_astha_corp",
  name: "Astha Corp",
  plan: "enterprise",
  // The magic field
  databaseConfig: {
     type: "isolated", 
     connectionUrl: "postgres://user:pass@sensitive-db-cluster.aws.com/astha_db"
  } 
}
// Table: tenants (General User)
{
  id: "tenant_small_shop",
  name: "Small Shop",
  plan: "basic",
  databaseConfig: null // Implies: Use default shared pool
}

Application Logic (The "Router")

  1. Request comes in: astha.logicspike.com
  2. Lookup: Middleware checks tenants table for id: "tenant_astha_corp".
  3. Decision:
    • If databaseConfig is present -> Create/Get connection to Dedicated DB.
    • If databaseConfig is null -> Use Shared DB Pool.
  4. Execute: Run the query SELECT * FROM posts.

Pros & Cons of Hybrid

  • High Revenue: We can charge Astha 10x-100x more for this isolation.
  • Performance: Astha's load doesn't kill the shared pool.
  • ⚠️ Ops Complexity: Schema migrations must now run on the Shared DB AND Astha's DB. We need a strict CI/CD pipeline to ensure they don't drift out of sync.


5. Billing Strategy: "Bundle vs A La Carte"

We support both predefined plans and custom service selection (e.g., "Just the Blog").

The Model

  1. Plans (The Menu): Standard bundles (e.g., "Pro").
  2. Custom Selection: Users can pick individual services.
  3. Tenant Services (Entitlement): The single source of truth for access.

Logic Flow

  1. Standard Plan: User buys "Pro". System copies "Pro" features into tenant_services.
  2. Custom: User selects "Blog" + "Chatbot". System inserts them into tenant_services.
  3. Add-on: User is on "Pro" but adds "Chatbot". System adds "Chatbot" row to tenant_services.

Result: The application ONLY checks tenant_services. It ignores how they got there.


6. Schema Comparison: Current vs Proposed

A. Users Table

Feature Current Schema (core-database) Proposed Domain Model
Tenant Link tenant_id (Hard link) REMOVED. Users are global.
Role role (String: "owner") REMOVED. Roles are per-tenant.
Auth email, password_hash Moved to user_auth_providers (already started).

B. Access Control (New)

Entity Current Status Proposed Solution
Membership ❌ Does not exist. NEW TABLE: memberships(user_id, tenant_id, role_id)
Roles ❌ Hardcoded strings. NEW TABLE: roles(id, tenant_id, permissions)
Invites ❌ Does not exist. NEW TABLE: invitations(email, tenant_id, token)

C. Tenant Data — ✅ Implemented

Existing tables for api_keys, media_files, services, tenant_services are correctly designed for "Level 2" (Floor). They all have tenant_id FKs.

NOTE

Since this doc was written, the following tables have been fully implemented:

  • memberships
  • roles (system + custom per-tenant) ✅
  • invitations
  • Full billing schema (11 tables) ✅ — See docs/billing/BILLING_STATUS.md
  • user_notification_preferences
  • Communication schema (provider_configs, message_logs, message_events) ✅

See schema_specification.md for the complete, up-to-date schema.


4. Proposed Table categorization

🌍 Building Level (Global)

Tables that define the platform structure.

  1. users (Identity only) ✅
  2. user_auth_providers (Passwords/OAuth) ✅
  3. user_notification_preferences (Email pref toggles) ✅
  4. tenants (Floor metadata) ✅
  5. memberships (Who has keys to which floor) ✅
  6. roles (What keys open which doors — System + Custom) ✅
  7. services (List of available "Appliances" to rent) ✅
  8. invitations (Temporary passes in the Lobby) ✅
  9. plans (Subscription tiers) ✅
  10. plan_features / plan_service_limits (What each plan includes) ✅
  11. subscriptions (Active contracts) ✅
  12. verifications (OTP codes) ✅

🔒 Floor Level (Tenant Specific)

Tables that comprise the user's business data.

  1. posts (Blog Service) ✅
  2. api_keys (Gateway / Integrations) ✅
  3. media_files (Media Service) ✅
  4. tenant_services (Entitlements — which services are enabled + limits) ✅
  5. tenant_coins / coin_transactions / coin_packs (Billing economy) ✅
  6. tenant_addons / addon_catalog (Purchasable extras) ✅
  7. processed_payment_events (Webhook idempotency) ✅

📡 Communication Schema

Isolated in communication Postgres schema.

  1. communication.provider_configs
  2. communication.message_logs
  3. communication.message_events

📝 Not Yet Implemented

  1. audit_logs — Spec exists at audit_log_spec.md
Architecture