# Multi-Tenant Data: Shared, Schema-Per, DB-Per

## The Choice That Determines Everything Downstream
The multi-tenancy data model you choose on day one shapes your query patterns, your security posture, your compliance overhead, your backup strategy, your migration complexity, and your infrastructure cost for the life of the product. It is not a decision you reverse cheaply.
I have seen teams choose wrong in both directions. Startups that picked database-per-tenant when they had 50 customers and burned $40K/month on RDS instances before Series A. Enterprise SaaS companies that chose a shared database and then spent eight months retrofitting row-level security when a Fortune 500 contract made data isolation a contractual obligation.
The decision framework is not complicated. The mistake is treating it as a simple cost question rather than a multi-axis tradeoff. This post gives you the model to think through all three approaches — shared, schema-per, and database-per — and the patterns for hybrid models and migration between tiers.
## The Three Models

### Shared Database, Shared Schema
All tenants live in the same tables. Rows are segregated by a tenant_id column. This is the starting point for almost every SaaS product, and it is the right starting point.
```sql
-- Shared schema: every table has tenant_id
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    status TEXT NOT NULL,
    total_cents BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Required: composite indexes with tenant_id first
CREATE INDEX idx_orders_tenant_status
    ON orders (tenant_id, status, created_at DESC);

-- Row-level security: database enforces isolation even if application code forgets
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- FORCE makes the policy apply to the table owner as well
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
```

```python
# Application must set tenant context on every connection
import contextlib
from typing import Generator

import psycopg2


@contextlib.contextmanager
def tenant_connection(tenant_id: str) -> Generator:
    conn = get_db_connection()  # assumed helper returning a psycopg2 connection
    try:
        with conn.cursor() as cur:
            # Set tenant context — RLS policy uses this. SET LOCAL is
            # transaction-scoped, so the setting cannot leak between
            # requests that share a pooled connection.
            cur.execute("SET LOCAL app.tenant_id = %s", (tenant_id,))
            yield conn
            conn.commit()  # without this, psycopg2 rolls back on close
    finally:
        conn.close()


# Usage
with tenant_connection("tenant-abc-123") as conn:
    orders = fetch_orders(conn)  # fetch_orders: application query helper
    # RLS ensures only tenant-abc-123's orders are visible
```

Strengths: lowest infrastructure cost, highest connection pool efficiency, single migration path, simplest backup and restore, easiest to scale horizontally with read replicas.
Weaknesses: tenant data is logically separated but physically co-located — a single missed tenant_id filter leaks data across tenants. Noisy-neighbour effects: a large tenant's queries degrade performance for everyone. Schema changes hit all tenants simultaneously. And data residency requirements cannot be met, because every tenant's data lives on the same instance.
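The "forgot the tenant_id filter" failure mode is easy to demonstrate in miniature. A hypothetical in-memory sketch (the names and data are illustrative, not from a real codebase) of the bug class RLS exists to catch:

```python
# Two tenants' rows co-located in one "table" (a list standing in for orders)
ORDERS = [
    {"id": 1, "tenant_id": "t-a", "status": "pending"},
    {"id": 2, "tenant_id": "t-b", "status": "pending"},
]

def fetch_orders_scoped(tenant_id: str) -> list[dict]:
    # Correct: every predicate includes tenant_id
    return [o for o in ORDERS if o["tenant_id"] == tenant_id]

def fetch_orders_buggy() -> list[dict]:
    # The bug class: the tenant filter was simply forgotten
    return [o for o in ORDERS if o["status"] == "pending"]

assert len(fetch_orders_scoped("t-a")) == 1  # isolated
assert len(fetch_orders_buggy()) == 2        # cross-tenant leak
```

With database-enforced RLS, the buggy variant returns one tenant's rows anyway; without it, the leak ships to production silently.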
### Schema-Per-Tenant
Each tenant gets a dedicated schema (namespace) within a shared database instance. Tables have the same structure across schemas; the database routes queries to the correct schema based on the active search path.
```sql
-- Create schema for a new tenant
CREATE SCHEMA tenant_abc123;

-- Create tables in that schema
CREATE TABLE tenant_abc123.orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    status TEXT NOT NULL,
    total_cents BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    -- No tenant_id needed — schema IS the tenant boundary
);
```

```python
# Application sets search_path to the tenant's schema
import re


@contextlib.contextmanager
def tenant_connection(tenant_id: str) -> Generator:
    schema = f"tenant_{tenant_id.replace('-', '_')}"
    # search_path cannot be parameterised, so the schema name is
    # interpolated; validate it first to rule out SQL injection.
    if not re.fullmatch(r"tenant_[a-z0-9_]+", schema):
        raise ValueError(f"unsafe schema name: {schema}")
    conn = get_db_connection()
    try:
        with conn.cursor() as cur:
            cur.execute(f"SET LOCAL search_path TO {schema}, public")
            yield conn
    finally:
        conn.close()
```

Strengths: stronger logical isolation than shared schema (no tenant_id leak surface), independent schema evolution per tenant is possible, easier to dump/restore a single tenant's data, tenant-specific indexes without affecting others.
Weaknesses: PostgreSQL has soft limits around 10,000 schemas per database before management overhead becomes painful. Schema migrations must run per-tenant — 1,000 tenants means 1,000 migration runs. Connection pooling complexity increases (PgBouncer in transaction mode cannot use schema-scoped connections cleanly). No data residency isolation — everything is still on the same instance.
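Because search_path ends up interpolated into SQL rather than bound as a parameter, deriving the schema name defensively is cheap insurance. A minimal sketch (the helper name and the length cap are illustrative choices, not from any library):

```python
import re

# Lowercase alphanumerics and underscores only, bounded length
SCHEMA_PATTERN = re.compile(r"^tenant_[a-z0-9_]{1,50}$")

def schema_for(tenant_id: str) -> str:
    """Derive a safe schema name from a tenant id, rejecting anything
    that could smuggle SQL into a SET search_path statement."""
    schema = f"tenant_{tenant_id.lower().replace('-', '_')}"
    if not SCHEMA_PATTERN.match(schema):
        raise ValueError(f"unsafe schema name derived from {tenant_id!r}")
    return schema

assert schema_for("Abc-123") == "tenant_abc_123"
# schema_for("x; DROP SCHEMA public") raises ValueError
```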
### Database-Per-Tenant
Each tenant has a fully dedicated database instance. Complete physical isolation.
```python
# Tenant database routing — connection per tenant
from functools import lru_cache

import sqlalchemy
from sqlalchemy.orm import Session


class TenantNotFoundError(KeyError):
    pass


TENANT_DB_CONFIGS = {
    "tenant-a": "postgresql://user:pass@db-tenant-a.internal/app",
    "tenant-b": "postgresql://user:pass@db-tenant-b.internal/app",
    # loaded from a secrets manager in practice
}


@lru_cache(maxsize=None)
def get_tenant_engine(tenant_id: str) -> sqlalchemy.Engine:
    dsn = TENANT_DB_CONFIGS.get(tenant_id)
    if not dsn:
        raise TenantNotFoundError(tenant_id)
    return sqlalchemy.create_engine(dsn, pool_size=5, max_overflow=10)


def get_tenant_session(tenant_id: str) -> Session:
    engine = get_tenant_engine(tenant_id)
    return Session(engine)
```

Strengths: complete isolation — data breach in one tenant's DB cannot affect others, independent scaling (give high-volume tenants a larger instance), data residency requirements satisfied (put the DB in the right region), independent backup schedules, no noisy-neighbour effects.
Weaknesses: expensive. At $50–200/month per managed DB instance, 500 tenants costs $25K–100K/month before you pay for application compute. Schema migrations require a migration runner that coordinates across all tenant databases. The monitoring surface multiplies with tenant count. Provisioning a new tenant requires infrastructure automation rather than a SQL statement.
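The cost claim is simple arithmetic, but it is worth making explicit because it grows linearly with tenant count. A toy model using the post's figures (the per-instance prices are the assumptions above, not a quote from any provider):

```python
def fleet_cost_per_month(
    tenants: int, per_db_low: int = 50, per_db_high: int = 200
) -> tuple[int, int]:
    """Monthly managed-database spend range for one DB per tenant."""
    return tenants * per_db_low, tenants * per_db_high

low, high = fleet_cost_per_month(500)
assert (low, high) == (25_000, 100_000)  # the $25K–100K/month figure
```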
## The Tradeoff Matrix
| Dimension | Shared Schema | Schema-Per | DB-Per |
|---|---|---|---|
| Infrastructure cost | Lowest | Low–Medium | Highest |
| Data isolation | Logical (RLS) | Logical (schema) | Physical |
| Noisy neighbour | Yes | Yes | No |
| Schema migration complexity | Single run | Per-tenant run | Per-tenant run + infra |
| Data residency compliance | No | No | Yes |
| Tenant-specific customisation | Hard | Possible | Easy |
| Max practical tenants | Millions | ~10,000 | ~1,000 (managed cost) |
| Onboarding new tenant | SQL INSERT | CREATE SCHEMA | Provision DB |
| Backup/restore single tenant | Hard | Medium | Easy |
## Hybrid Models: Tiered Tenancy
The most practical production answer is a tiered model. SMB customers share a pool; enterprise customers get isolation.
```python
# Routing layer — abstracts tier from application code
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TenantTier(Enum):
    SMB = "smb"
    GROWTH = "growth"
    ENTERPRISE = "enterprise"


@dataclass
class TenantConfig:
    tenant_id: str
    tier: TenantTier
    db_dsn: Optional[str] = None       # for enterprise
    schema_name: Optional[str] = None  # for growth
    # SMB uses the shared connection pool — no extra config


class TenantRouter:
    def __init__(self, config_service):
        self.config_service = config_service

    def get_connection(self, tenant_id: str):
        config = self.config_service.get_tenant(tenant_id)
        if config.tier == TenantTier.ENTERPRISE:
            return dedicated_connection(config.db_dsn)
        elif config.tier == TenantTier.GROWTH:
            return schema_connection(config.schema_name)
        else:
            return shared_connection_with_rls(tenant_id)
```

This model lets you start all tenants in the shared pool, upgrade individual tenants as they grow or as compliance requires, and charge a premium for the isolation tier — which usually covers the infrastructure cost increase.
## Migration Between Tiers
Migrating a tenant from shared schema to dedicated database is the most operationally painful thing in multi-tenancy. Plan for it from day one even if you do not execute it immediately.
The sequence for migrating a tenant from shared to dedicated (the exact details vary, but the shape is consistent):

1. Provision the dedicated database and bring its schema up to the current migration version.
2. Bulk-copy the tenant's historical rows from the shared tables.
3. Start replicating new writes (dual-writes or change data capture) so the copy stays current.
4. Verify the copy: row counts and checksums per table, repeatedly, until they hold steady.
5. Briefly pause or queue the tenant's writes to drain in-flight changes.
6. Flip routing for that one tenant to the dedicated database.
7. Monitor, then delete the tenant's rows from the shared tables.
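Verifying the copy before cutover deserves real tooling, not eyeballing. A sketch of the per-table comparison step, operating on row counts already gathered from each side (in practice you would pair this with per-table checksums, since counts alone miss corrupted rows; the function name is illustrative):

```python
def diff_counts(
    source: dict[str, int], target: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Return the tables whose row counts disagree between the shared
    database (source) and the dedicated database (target)."""
    return {
        table: (source[table], target.get(table, 0))
        for table in source
        if source[table] != target.get(table, 0)
    }

# Cutover requires an empty diff:
mismatches = diff_counts(
    {"orders": 1200, "invoices": 340},
    {"orders": 1200, "invoices": 339},
)
assert mismatches == {"invoices": (340, 339)}  # one row still missing
```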
The key insight: step 6 is a routing change, not a code change. Your routing layer must be able to switch a single tenant to a new database without a deployment. This means tenant configuration must live in a runtime-configurable store (database, feature flag system) — not in environment variables or code.
```python
# Tenant config in a runtime-configurable store
import json
from typing import Optional

import redis


class TenantConfigStore:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client

    def get_tenant_config(self, tenant_id: str) -> Optional[dict]:
        raw = self.redis.get(f"tenant:config:{tenant_id}")
        if raw:
            return json.loads(raw)
        return None

    def set_tenant_db(self, tenant_id: str, dsn: str, tier: str):
        """
        Zero-downtime tier migration: update config, traffic follows.
        Next request from this tenant uses the new database.
        """
        config = self.get_tenant_config(tenant_id) or {}
        config.update({"dsn": dsn, "tier": tier})
        self.redis.set(
            f"tenant:config:{tenant_id}",
            json.dumps(config),
            ex=300,  # 5-minute TTL — refresh from primary store
        )
```

## Pitfalls That Actually Hit Production
Missing composite indexes: In a shared schema, any query against a large table whose indexes do not lead with tenant_id ends up scanning rows across every tenant: wasted work for small tenants, catastrophic latency for large ones. Audit every index for tenant_id placement.
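The audit itself can be mechanical. A sketch of the core check, operating on index column lists you would pull from pg_indexes or information_schema (the catalog query and parsing are left out; the function name is illustrative):

```python
def index_is_tenant_safe(columns: list[str]) -> bool:
    """A multi-tenant composite index only serves per-tenant queries
    if tenant_id is the leading column."""
    return bool(columns) and columns[0] == "tenant_id"

assert index_is_tenant_safe(["tenant_id", "status", "created_at"])
assert not index_is_tenant_safe(["status", "tenant_id"])  # tenant_id buried
assert not index_is_tenant_safe([])
```

Wiring this into CI against a schema dump catches the regression before it ships rather than when a large tenant's dashboard times out.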
Background jobs that skip tenant scoping: Scheduled jobs that run SELECT * FROM orders WHERE status = 'pending' without tenant context often execute as the table owner or a superuser — roles that bypass RLS by default — and so expose every tenant's data. Always set tenant context, even in background jobs.
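One pattern that avoids the unscoped-job trap is to partition the global work queue by tenant first, then execute each batch inside that tenant's context (for example, under the tenant_connection helper shown earlier). A self-contained sketch of the partitioning step:

```python
from collections import defaultdict

def partition_by_tenant(rows: list[dict]) -> dict[str, list[dict]]:
    """Group pending work items by tenant so each batch can run under
    its own tenant context instead of one unscoped cross-tenant query."""
    batches: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        batches[row["tenant_id"]].append(row)
    return dict(batches)

pending = [
    {"id": 1, "tenant_id": "t-a"},
    {"id": 2, "tenant_id": "t-b"},
    {"id": 3, "tenant_id": "t-a"},
]
batches = partition_by_tenant(pending)
assert set(batches) == {"t-a", "t-b"}
assert [r["id"] for r in batches["t-a"]] == [1, 3]
```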
Migration runners that do not handle partial failure: When running migrations across 500 tenant schemas, some will fail (connection error, lock timeout, disk space). Your migration runner must be idempotent, must log per-tenant success/failure, and must be re-runnable for the failed subset.
```python
# Tenant-aware migration runner with partial failure handling
import logging
from typing import List

logger = logging.getLogger(__name__)


def run_migration_for_all_tenants(
    migration_fn,
    tenant_ids: List[str],
    dry_run: bool = False,
) -> dict:
    # tenant_connection, migration_already_applied and
    # record_migration_applied are the helpers assumed earlier in this post
    results = {"success": [], "failed": [], "skipped": []}
    for tenant_id in tenant_ids:
        try:
            if migration_already_applied(tenant_id, migration_fn.__name__):
                results["skipped"].append(tenant_id)
                continue
            if not dry_run:
                with tenant_connection(tenant_id) as conn:
                    migration_fn(conn)
                # Record only real runs, so a later real run is not skipped
                record_migration_applied(tenant_id, migration_fn.__name__)
            results["success"].append(tenant_id)
            logger.info(f"Migration applied: tenant={tenant_id} dry_run={dry_run}")
        except Exception as e:
            results["failed"].append({"tenant_id": tenant_id, "error": str(e)})
            logger.error(f"Migration failed: tenant={tenant_id} error={e}")
            # Do NOT re-raise — continue with remaining tenants
    return results
```

## Key Takeaways
- Shared-schema multi-tenancy is the right default for startups and mid-stage SaaS — the lowest cost, simplest operations, and easiest to scale — as long as row-level security is implemented correctly at the database level, not only in application code.
- Schema-per-tenant provides stronger logical isolation than shared schema and enables limited per-tenant schema customisation, but becomes operationally expensive past a few hundred tenants due to migration complexity and PostgreSQL schema management overhead.
- Database-per-tenant is the only model that satisfies data residency compliance, complete noisy-neighbour isolation, and independent scaling — at a cost that typically only makes sense for enterprise-tier or regulated-industry customers.
- Tiered tenancy — SMB in shared pool, growth in schema-per, enterprise in dedicated DB — is the practical production answer for SaaS products serving a wide market segment range.
- Migration between tiers requires runtime-configurable tenant routing from day one; tenant database configuration must live in a dynamic store, not in code or environment variables.
- The biggest operational pitfalls are missing composite indexes with tenant_id as the leading column, background jobs that bypass tenant scoping, and migration runners that lack per-tenant idempotency and partial-failure recovery.