Designing a Multi-Tenant ERP: Lessons from 15+ Company Deployments
Ahad Nawaz3 min read
What I learned shipping the same ERP into 15+ companies: tenant isolation, schema-per-company versus shared, audit trails, the migration story, and the boring decisions that keep the system from melting.
An ERP suite has been deployed across 15+ companies over the past two years. Inventory, sales, purchases, finance, HR, payroll. Same codebase, different industries, wildly different scales. Here are the choices that actually mattered.
The First Real Question: Schema Per Tenant or Shared
I evaluated three patterns:
- Database per tenant. Best isolation. Painful backups, migrations, and infra cost at scale.
- Schema per tenant in one database. Decent isolation. Migration tools must loop schemas.
- Shared schema with a tenant column. Cheapest. Riskiest if you mess up filtering.
I picked shared schema with strict tenant scoping. The reasons: cheaper to back up, one migration set, predictable performance. The risk gets mitigated by a hard rule: every Prisma query goes through a repository function that injects tenant_id.
// every model has tenant_id
model Invoice {
id String @id
tenantId String
amount Decimal
status InvoiceStatus
createdAt DateTime @default(now())
@@index([tenantId, status])
}
Every index starts with tenant_id. Postgres plans are tight because the planner treats tenant_id as a selective leading column.
Audit Trails Without Slowing Writes
Finance teams need to know who changed what, when, and what the old value was. Two patterns I have shipped:
Trigger-Based Shadow Tables
Postgres triggers write to a {table}_history mirror on every update. The application writes nothing extra. Cost is one extra row per change, one extra index per shadow table. Pro: cannot be skipped. Con: schema changes touch the trigger too.
App-Level Audit Log
The repository writes an audit_event row alongside the main write inside one transaction. Easier to evolve, but if a developer skips the repo function the audit is missing. I use this for non-financial entities where speed matters and the risk is low.
I run both in the same ERP, picking per domain.
Migrations: The Single Most Underestimated Pain
Fifteen tenants on the same schema means every migration runs against fifteen sets of real data. The boring rules that saved me:
- Expand then contract. Add the new column nullable. Backfill in a job. Make it not-null in a separate migration. Never break old code paths in step one.
- Backfills are batched. Update 1,000 rows, sleep, repeat. Long single transactions block writes and freeze the UI.
- Every migration is reversible in code. Even if you never run the down, writing it forces you to think through state.
Permissions: Role-Based, Capability-Driven
Different industries want different permissions. I model two layers:
- Roles are user-facing buckets, configurable per tenant:
Owner,Accountant,Warehouse Lead. - Capabilities are atomic permissions the code checks against, like
invoice.read,invoice.create,warehouse.transfer.
Roles map to capability sets in the database. Every server action calls requireCapability(ctx, "invoice.create") before mutating. The mapping changes without code, the code never asks "what role is this user."
The Reporting Problem
Operational queries hit the live OLTP database. Reporting queries (P&L over six months, sales pivots) used to lock tables and time out. I solved it in two stages:
- Read replicas for heavy reports. Reports read from a streaming replica. Stale by seconds, no impact on live transactions.
- Materialized summaries refreshed on event. Daily revenue, monthly inventory turnover, stock valuation get rolled up to summary tables. Reports hit those instead of recomputing.
What I Got Wrong
I let one large tenant push the inventory module toward their workflow. Six months later, three smaller tenants had to work around features they did not need. The lesson: configurable, not customized. If a feature is for one tenant, it lives behind a feature flag scoped to their tenant id. The codebase stays clean for everyone else.
What Comes With the Stack
Backend: NestJS, Prisma, Postgres, Redis for cache and queues. Frontend: Next.js with role-aware layouts. Background jobs: BullMQ workers for backfills, exports, and scheduled reports.
If you are building multi-tenant SaaS or ERP, the boring trio of shared schema + tenant scoping + capability checks gets you 90% of the way. Save your creativity for the product, not the architecture.
Comments
Sign in to leave a comment.