# Multi-Tenant Data: Shared, Schema-Per, DB-Per

## The Choice That Determines Everything Downstream
The multi-tenancy data model you choose on day one shapes your query patterns, your security posture, your compliance overhead, your backup strategy, your migration complexity, and your infrastructure cost for the life of the product. It is not a decision you reverse cheaply.
I have seen teams choose wrong in both directions. Startups that picked database-per-tenant when they had 50 customers and burned $40K/month on RDS instances before Series A. Enterprise SaaS companies that chose a shared database and then spent eight months retrofitting row-level security when a Fortune 500 contract made data isolation a contractual obligation.
The decision framework is not complicated. The mistake is treating it as a simple cost question rather than a multi-axis tradeoff. This post gives you the model to think through all three approaches — shared, schema-per, and database-per — and the patterns for hybrid models and migration between tiers.
## The Three Models

### Shared Database, Shared Schema
All tenants live in the same tables. Rows are segregated by a tenant_id column. This is the starting point for almost every SaaS product, and it is the right starting point.
```sql
-- Shared schema: every table has tenant_id
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    status TEXT NOT NULL,
    total_cents BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Required: composite indexes with tenant_id first
CREATE INDEX idx_orders_tenant_status
    ON orders (tenant_id, status, created_at DESC);

-- Row-level security: database enforces isolation even if application code forgets
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- FORCE makes the policy apply to the table owner as well
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
```

```python
# Application must set tenant context on every connection
import contextlib
from typing import Generator

import psycopg2


@contextlib.contextmanager
def tenant_connection(tenant_id: str) -> Generator:
    conn = get_db_connection()  # assumed helper returning a psycopg2 connection
    try:
        with conn.cursor() as cur:
            # Set tenant context — RLS policy uses this. SET LOCAL is
            # transaction-scoped, so the setting cannot leak between
            # requests that share a pooled connection.
            cur.execute("SET LOCAL app.tenant_id = %s", (tenant_id,))
            yield conn
            conn.commit()  # without this, psycopg2 rolls back on close
    finally:
        conn.close()


# Usage
with tenant_connection("tenant-abc-123") as conn:
    orders = fetch_orders(conn)  # fetch_orders: application query helper
    # RLS ensures only tenant-abc-123's orders are visible
```

Strengths: lowest infrastructure cost, highest connection pool efficiency, single migration path, simplest backup and restore, easiest to scale horizontally with read replicas.
Weaknesses: tenant data is logically separated but physically co-located — a single missed tenant_id filter leaks data across tenants. Noisy-neighbour effects: a large tenant's queries degrade performance for everyone. Schema changes hit all tenants simultaneously. And data residency requirements cannot be met, because every tenant's data lives on the same instance.
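The "forgot the tenant_id filter" failure mode is easy to demonstrate in miniature. A hypothetical in-memory sketch (the names and data are illustrative, not from a real codebase) of the bug class RLS exists to catch:

```python
# Two tenants' rows co-located in one "table" (a list standing in for orders)
ORDERS = [
    {"id": 1, "tenant_id": "t-a", "status": "pending"},
    {"id": 2, "tenant_id": "t-b", "status": "pending"},
]

def fetch_orders_scoped(tenant_id: str) -> list[dict]:
    # Correct: every predicate includes tenant_id
    return [o for o in ORDERS if o["tenant_id"] == tenant_id]

def fetch_orders_buggy() -> list[dict]:
    # The bug class: the tenant filter was simply forgotten
    return [o for o in ORDERS if o["status"] == "pending"]

assert len(fetch_orders_scoped("t-a")) == 1  # isolated
assert len(fetch_orders_buggy()) == 2        # cross-tenant leak
```

With database-enforced RLS, the buggy variant returns one tenant's rows anyway; without it, the leak ships to production silently.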
### Schema-Per-Tenant
Each tenant gets a dedicated schema (namespace) within a shared database instance. Tables have the same structure across schemas; the database routes queries to the correct schema based on the active search path.
```sql
-- Create schema for a new tenant
CREATE SCHEMA tenant_abc123;

-- Create tables in that schema
CREATE TABLE tenant_abc123.orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    status TEXT NOT NULL,
    total_cents BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    -- No tenant_id needed — schema IS the tenant boundary
);
```

```python
# Application sets search_path to the tenant's schema
import re


@contextlib.contextmanager
def tenant_connection(tenant_id: str) -> Generator:
    schema = f"tenant_{tenant_id.replace('-', '_')}"
    # search_path cannot be parameterised, so the schema name is
    # interpolated; validate it first to rule out SQL injection.
    if not re.fullmatch(r"tenant_[a-z0-9_]+", schema):
        raise ValueError(f"unsafe schema name: {schema}")
    conn = get_db_connection()
    try:
        with conn.cursor() as cur:
            cur.execute(f"SET LOCAL search_path TO {schema}, public")
            yield conn
    finally:
        conn.close()
```

Strengths: stronger logical isolation than shared schema (no tenant_id leak surface), independent schema evolution per tenant is possible, easier to dump/restore a single tenant's data, tenant-specific indexes without affecting others.
Weaknesses: PostgreSQL has soft limits around 10,000 schemas per database before management overhead becomes painful. Schema migrations must run per-tenant — 1,000 tenants means 1,000 migration runs. Connection pooling complexity increases (PgBouncer in transaction mode cannot use schema-scoped connections cleanly). No data residency isolation — everything is still on the same instance.
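Because search_path ends up interpolated into SQL rather than bound as a parameter, deriving the schema name defensively is cheap insurance. A minimal sketch (the helper name and the length cap are illustrative choices, not from any library):

```python
import re

# Lowercase alphanumerics and underscores only, bounded length
SCHEMA_PATTERN = re.compile(r"^tenant_[a-z0-9_]{1,50}$")

def schema_for(tenant_id: str) -> str:
    """Derive a safe schema name from a tenant id, rejecting anything
    that could smuggle SQL into a SET search_path statement."""
    schema = f"tenant_{tenant_id.lower().replace('-', '_')}"
    if not SCHEMA_PATTERN.match(schema):
        raise ValueError(f"unsafe schema name derived from {tenant_id!r}")
    return schema

assert schema_for("Abc-123") == "tenant_abc_123"
# schema_for("x; DROP SCHEMA public") raises ValueError
```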
### Database-Per-Tenant
Each tenant has a fully dedicated database instance. Complete physical isolation.
```python
# Tenant database routing — connection per tenant
from functools import lru_cache

import sqlalchemy
from sqlalchemy.orm import Session


class TenantNotFoundError(KeyError):
    pass


TENANT_DB_CONFIGS = {
    "tenant-a": "postgresql://user:pass@db-tenant-a.internal/app",
    "tenant-b": "postgresql://user:pass@db-tenant-b.internal/app",
    # loaded from a secrets manager in practice
}


@lru_cache(maxsize=None)
def get_tenant_engine(tenant_id: str) -> sqlalchemy.Engine:
    dsn = TENANT_DB_CONFIGS.get(tenant_id)
    if not dsn:
        raise TenantNotFoundError(tenant_id)
    return sqlalchemy.create_engine(dsn, pool_size=5, max_overflow=10)


def get_tenant_session(tenant_id: str) -> Session:
    engine = get_tenant_engine(tenant_id)
    return Session(engine)
```

Strengths: complete isolation — data breach in one tenant's DB cannot affect others, independent scaling (give high-volume tenants a larger instance), data residency requirements satisfied (put the DB in the right region), independent backup schedules, no noisy-neighbour effects.
Weaknesses: expensive. At $50–200/month per managed DB instance, 500 tenants costs $25K–100K/month before you pay for application compute. Schema migrations require a migration runner that coordinates across all tenant databases. The monitoring surface multiplies with tenant count. Provisioning a new tenant requires infrastructure automation rather than a SQL statement.
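The cost claim is simple arithmetic, but it is worth making explicit because it grows linearly with tenant count. A toy model using the post's figures (the per-instance prices are the assumptions above, not a quote from any provider):

```python
def fleet_cost_per_month(
    tenants: int, per_db_low: int = 50, per_db_high: int = 200
) -> tuple[int, int]:
    """Monthly managed-database spend range for one DB per tenant."""
    return tenants * per_db_low, tenants * per_db_high

low, high = fleet_cost_per_month(500)
assert (low, high) == (25_000, 100_000)  # the $25K–100K/month figure
```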
## The Tradeoff Matrix
| Dimension | Shared Schema | Schema-Per | DB-Per |
|---|---|---|---|
| Infrastructure cost | Lowest | Low–Medium | Highest |
| Data isolation | Logical (RLS) | Logical (schema) | Physical |
| Noisy neighbour | Yes | Yes | No |
| Schema migration complexity | Single run | Per-tenant run | Per-tenant run + infra |
| Data residency compliance | No | No | Yes |
| Tenant-specific customisation | Hard | Possible | Easy |
| Max practical tenants | Millions | ~10,000 | ~1,000 (managed cost) |
| Onboarding new tenant | SQL INSERT | CREATE SCHEMA | Provision DB |
| Backup/restore single tenant | Hard | Medium | Easy |
## Hybrid Models: Tiered Tenancy
The most practical production answer is a tiered model. SMB customers share a pool; enterprise customers get isolation.
```python
# Routing layer — abstracts tier from application code
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TenantTier(Enum):
    SMB = "smb"
    GROWTH = "growth"
    ENTERPRISE = "enterprise"


@dataclass
class TenantConfig:
    tenant_id: str
    tier: TenantTier
    db_dsn: Optional[str] = None       # for enterprise
    schema_name: Optional[str] = None  # for growth
    # SMB uses the shared connection pool — no extra config


class TenantRouter:
    def __init__(self, config_service):
        self.config_service = config_service

    def get_connection(self, tenant_id: str):
        config = self.config_service.get_tenant(tenant_id)
        if config.tier == TenantTier.ENTERPRISE:
            return dedicated_connection(config.db_dsn)
        elif config.tier == TenantTier.GROWTH:
            return schema_connection(config.schema_name)
        else:
            return shared_connection_with_rls(tenant_id)
```

This model lets you start all tenants in the shared pool, upgrade individual tenants as they grow or as compliance requires, and charge a premium for the isolation tier — which usually covers the infrastructure cost increase.
## Migration Between Tiers
Migrating a tenant from shared schema to dedicated database is the most operationally painful thing in multi-tenancy. Plan for it from day one even if you do not execute it immediately.
The sequence for migrating a tenant from shared to dedicated (the exact details vary, but the shape is consistent):

1. Provision the dedicated database and bring its schema up to the current migration version.
2. Bulk-copy the tenant's historical rows from the shared tables.
3. Start replicating new writes (dual-writes or change data capture) so the copy stays current.
4. Verify the copy: row counts and checksums per table, repeatedly, until they hold steady.
5. Briefly pause or queue the tenant's writes to drain in-flight changes.
6. Flip routing for that one tenant to the dedicated database.
7. Monitor, then delete the tenant's rows from the shared tables.
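Verifying the copy before cutover deserves real tooling, not eyeballing. A sketch of the per-table comparison step, operating on row counts already gathered from each side (in practice you would pair this with per-table checksums, since counts alone miss corrupted rows; the function name is illustrative):

```python
def diff_counts(
    source: dict[str, int], target: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Return the tables whose row counts disagree between the shared
    database (source) and the dedicated database (target)."""
    return {
        table: (source[table], target.get(table, 0))
        for table in source
        if source[table] != target.get(table, 0)
    }

# Cutover requires an empty diff:
mismatches = diff_counts(
    {"orders": 1200, "invoices": 340},
    {"orders": 1200, "invoices": 339},
)
assert mismatches == {"invoices": (340, 339)}  # one row still missing
```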
The key insight: step 6 is a routing change, not a code change. Your routing layer must be able to switch a single tenant to a new database without a deployment. This means tenant configuration must live in a runtime-configurable store (database, feature flag system) — not in environment variables or code.
```python
# Tenant config in a runtime-configurable store
import json
from typing import Optional

import redis


class TenantConfigStore:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client

    def get_tenant_config(self, tenant_id: str) -> Optional[dict]:
        raw = self.redis.get(f"tenant:config:{tenant_id}")
        if raw:
            return json.loads(raw)
        return None

    def set_tenant_db(self, tenant_id: str, dsn: str, tier: str):
        """
        Zero-downtime tier migration: update config, traffic follows.
        Next request from this tenant uses the new database.
        """
        config = self.get_tenant_config(tenant_id) or {}
        config.update({"dsn": dsn, "tier": tier})
        self.redis.set(
            f"tenant:config:{tenant_id}",
            json.dumps(config),
            ex=300,  # 5-minute TTL — refresh from primary store
        )
```

## Pitfalls That Actually Hit Production
Missing composite indexes: In a shared schema, any query against a large table whose indexes do not lead with tenant_id ends up scanning rows across every tenant: wasted work for small tenants, catastrophic latency for large ones. Audit every index for tenant_id placement.
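The audit itself can be mechanical. A sketch of the core check, operating on index column lists you would pull from pg_indexes or information_schema (the catalog query and parsing are left out; the function name is illustrative):

```python
def index_is_tenant_safe(columns: list[str]) -> bool:
    """A multi-tenant composite index only serves per-tenant queries
    if tenant_id is the leading column."""
    return bool(columns) and columns[0] == "tenant_id"

assert index_is_tenant_safe(["tenant_id", "status", "created_at"])
assert not index_is_tenant_safe(["status", "tenant_id"])  # tenant_id buried
assert not index_is_tenant_safe([])
```

Wiring this into CI against a schema dump catches the regression before it ships rather than when a large tenant's dashboard times out.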
Background jobs that skip tenant scoping: Scheduled jobs that run SELECT * FROM orders WHERE status = 'pending' without tenant context often execute as the table owner or a superuser — roles that bypass RLS by default — and so expose every tenant's data. Always set tenant context, even in background jobs.
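One pattern that avoids the unscoped-job trap is to partition the global work queue by tenant first, then execute each batch inside that tenant's context (for example, under the tenant_connection helper shown earlier). A self-contained sketch of the partitioning step:

```python
from collections import defaultdict

def partition_by_tenant(rows: list[dict]) -> dict[str, list[dict]]:
    """Group pending work items by tenant so each batch can run under
    its own tenant context instead of one unscoped cross-tenant query."""
    batches: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        batches[row["tenant_id"]].append(row)
    return dict(batches)

pending = [
    {"id": 1, "tenant_id": "t-a"},
    {"id": 2, "tenant_id": "t-b"},
    {"id": 3, "tenant_id": "t-a"},
]
batches = partition_by_tenant(pending)
assert set(batches) == {"t-a", "t-b"}
assert [r["id"] for r in batches["t-a"]] == [1, 3]
```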
Migration runners that do not handle partial failure: When running migrations across 500 tenant schemas, some will fail (connection error, lock timeout, disk space). Your migration runner must be idempotent, must log per-tenant success/failure, and must be re-runnable for the failed subset.
```python
# Tenant-aware migration runner with partial failure handling
import logging
from typing import List

logger = logging.getLogger(__name__)


def run_migration_for_all_tenants(
    migration_fn,
    tenant_ids: List[str],
    dry_run: bool = False,
) -> dict:
    # tenant_connection, migration_already_applied and
    # record_migration_applied are the helpers assumed earlier in this post
    results = {"success": [], "failed": [], "skipped": []}
    for tenant_id in tenant_ids:
        try:
            if migration_already_applied(tenant_id, migration_fn.__name__):
                results["skipped"].append(tenant_id)
                continue
            if not dry_run:
                with tenant_connection(tenant_id) as conn:
                    migration_fn(conn)
                # Record only real runs, so a later real run is not skipped
                record_migration_applied(tenant_id, migration_fn.__name__)
            results["success"].append(tenant_id)
            logger.info(f"Migration applied: tenant={tenant_id} dry_run={dry_run}")
        except Exception as e:
            results["failed"].append({"tenant_id": tenant_id, "error": str(e)})
            logger.error(f"Migration failed: tenant={tenant_id} error={e}")
            # Do NOT re-raise — continue with remaining tenants
    return results
```

## Key Takeaways
- Shared-schema multi-tenancy is the right default for startups and mid-stage SaaS — the lowest cost, simplest operations, and easiest to scale — as long as row-level security is implemented correctly at the database level, not only in application code.
- Schema-per-tenant provides stronger logical isolation than shared schema and enables limited per-tenant schema customisation, but becomes operationally expensive past a few hundred tenants due to migration complexity and PostgreSQL schema management overhead.
- Database-per-tenant is the only model that satisfies data residency compliance, complete noisy-neighbour isolation, and independent scaling — at a cost that typically only makes sense for enterprise-tier or regulated-industry customers.
- Tiered tenancy — SMB in shared pool, growth in schema-per, enterprise in dedicated DB — is the practical production answer for SaaS products serving a wide market segment range.
- Migration between tiers requires runtime-configurable tenant routing from day one; tenant database configuration must live in a dynamic store, not in code or environment variables.
- The biggest operational pitfalls are missing composite indexes with tenant_id as the leading column, background jobs that bypass tenant scoping, and migration runners that lack per-tenant idempotency and partial-failure recovery.