Database Design Patterns Every Developer Should Know
Database design is the foundation of any robust application. Understanding common patterns helps you make better architectural decisions and sidestep well-known pitfalls.
Normalization vs Denormalization
When to Normalize
Normalization reduces data redundancy and improves data integrity:
-- Normalized structure
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100) UNIQUE
);
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT REFERENCES users(id),
total DECIMAL(10,2),
created_at TIMESTAMP
);
CREATE TABLE order_items (
id INT PRIMARY KEY,
order_id INT REFERENCES orders(id),
product_id INT,
quantity INT,
price DECIMAL(10,2)
);
Benefits:
- Data consistency
- Reduced redundancy
- Easier updates
- Referential integrity
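With a normalized schema, reads reassemble related data with joins. A minimal in-memory sketch of that join, using hypothetical sample rows and plain JavaScript in place of SQL:

```javascript
// Hypothetical rows mirroring the users and orders tables above.
const users = [
  { id: 1, name: 'Ada', email: 'ada@example.com' },
  { id: 2, name: 'Grace', email: 'grace@example.com' },
];
const orders = [
  { id: 10, user_id: 1, total: 25.0 },
  { id: 11, user_id: 1, total: 40.0 },
  { id: 12, user_id: 2, total: 15.5 },
];

// Equivalent of:
//   SELECT o.*, u.name FROM orders o JOIN users u ON u.id = o.user_id
function ordersWithUser(orders, users) {
  const byId = new Map(users.map((u) => [u.id, u]));
  return orders.map((o) => ({ ...o, user_name: byId.get(o.user_id).name }));
}
```

The user's name lives in exactly one place, so renaming a user never touches the orders table.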
When to Denormalize
Denormalization improves read performance at the cost of write complexity:
-- Denormalized for read performance
CREATE TABLE orders (
id INT PRIMARY KEY,
user_name VARCHAR(100),
user_email VARCHAR(100),
total DECIMAL(10,2),
created_at TIMESTAMP,
-- Denormalized order items
items JSON -- [{"product_id": 1, "quantity": 2, "price": 19.99}]
);
Benefits:
- Faster reads
- Fewer joins
- Simpler queries
- Better for analytics
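With the denormalized layout, a single row already carries everything needed to render an order; the items column just needs decoding. A sketch, with the sample row being hypothetical:

```javascript
// One denormalized order row, as it might come back from the driver.
const row = {
  id: 10,
  user_name: 'Ada',
  user_email: 'ada@example.com',
  items: '[{"product_id": 1, "quantity": 2, "price": 19.99}]',
};

// No joins: decode the JSON column and derive what you need in one pass.
function renderOrder(row) {
  const items = JSON.parse(row.items);
  const computedTotal = items.reduce(
    (sum, it) => sum + it.quantity * it.price,
    0
  );
  return { ...row, items, computedTotal };
}
```

The trade-off shows up on writes: if the user renames themselves, every one of their order rows is now stale until you update it.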
Common Design Patterns
1. Audit Trail Pattern
Track all data changes for compliance and debugging:
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
version INT DEFAULT 0
);
CREATE TABLE user_audit (
id INT PRIMARY KEY,
user_id INT REFERENCES users(id),
action VARCHAR(10), -- INSERT, UPDATE, DELETE
old_data JSON,
new_data JSON,
changed_by INT REFERENCES users(id),
changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Trigger to automatically log changes
-- (log_user_changes() is a plpgsql function you define to write the
-- OLD/NEW rows into user_audit)
CREATE TRIGGER user_audit_trigger
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION log_user_changes();
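The shape of the row such a trigger function would insert can be sketched in application code. This is an illustration of the audit record only, not the actual plpgsql body:

```javascript
// Illustrative only: builds the row a log_user_changes() trigger might
// insert into user_audit. Field names follow the user_audit table above.
function buildAuditRow(action, oldRow, newRow, changedBy) {
  return {
    user_id: (newRow ?? oldRow).id,
    action, // 'INSERT', 'UPDATE', or 'DELETE'
    old_data: oldRow ? JSON.stringify(oldRow) : null,
    new_data: newRow ? JSON.stringify(newRow) : null,
    changed_by: changedBy,
    changed_at: new Date().toISOString(),
  };
}
```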
2. Soft Delete Pattern
Preserve data while marking records as deleted:
CREATE TABLE posts (
id INT PRIMARY KEY,
title VARCHAR(200),
content TEXT,
deleted_at TIMESTAMP NULL,
deleted_by INT REFERENCES users(id)
);
-- Query only active records
SELECT * FROM posts WHERE deleted_at IS NULL;
-- Soft delete
UPDATE posts SET deleted_at = NOW(), deleted_by = ? WHERE id = ?;
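A thin helper can keep the `deleted_at IS NULL` filter from being forgotten on individual queries. A sketch, reusing the table and column names from the example above:

```javascript
// Builds a SELECT that excludes soft-deleted rows by default.
// includeDeleted is an escape hatch for admin or restore tooling.
function activeQuery(table, where = '', includeDeleted = false) {
  const clauses = [];
  if (!includeDeleted) clauses.push('deleted_at IS NULL');
  if (where) clauses.push(where);
  const suffix = clauses.length ? ` WHERE ${clauses.join(' AND ')}` : '';
  return `SELECT * FROM ${table}${suffix}`;
}
```

Only pass trusted identifiers and parameterized fragments into such a helper; it builds SQL text, so user input must still go through bind parameters.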
3. Polymorphic Association Pattern
Handle relationships to multiple table types:
CREATE TABLE comments (
id INT PRIMARY KEY,
content TEXT,
commentable_type VARCHAR(50), -- 'post', 'video', 'product'
commentable_id INT,
user_id INT REFERENCES users(id),
created_at TIMESTAMP
);
-- Get comments for posts
SELECT * FROM comments
WHERE commentable_type = 'post' AND commentable_id = ?;
-- Alternative with JSON
CREATE TABLE comments (
id INT PRIMARY KEY,
content TEXT,
commentable JSON, -- {"type": "post", "id": 123}
user_id INT REFERENCES users(id)
);
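On the application side, the commentable_type column usually drives a lookup from type to parent table; an allow-list also prevents arbitrary strings from reaching the SQL. A sketch with hypothetical table names:

```javascript
// Allow-list mapping commentable_type values to their parent tables.
const COMMENTABLE_TABLES = {
  post: 'posts',
  video: 'videos',
  product: 'products',
};

// Builds the query to fetch the parent record of a comment.
function parentQuery(comment) {
  const table = COMMENTABLE_TABLES[comment.commentable_type];
  if (!table) {
    throw new Error(`Unknown commentable_type: ${comment.commentable_type}`);
  }
  return {
    text: `SELECT * FROM ${table} WHERE id = $1`,
    values: [comment.commentable_id],
  };
}
```

This is also the place where the pattern's main weakness lives: the database itself cannot enforce that commentable_id points at a real row, so the application has to.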
4. Event Sourcing Pattern
Store events instead of current state:
CREATE TABLE events (
id BIGINT PRIMARY KEY,
aggregate_type VARCHAR(50),
aggregate_id INT,
event_type VARCHAR(50),
event_data JSON,
event_version INT,
occurred_at TIMESTAMP
);
-- Rebuild current state from events
SELECT * FROM events
WHERE aggregate_type = 'user' AND aggregate_id = ?
ORDER BY occurred_at ASC;
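Rebuilding the current state is then a fold over those events, oldest first. A minimal sketch, with the event types being hypothetical:

```javascript
// Hypothetical user events, in the order the SELECT above returns them.
const events = [
  { event_type: 'user_created', event_data: { name: 'Ada', email: 'ada@old.example' } },
  { event_type: 'email_changed', event_data: { email: 'ada@new.example' } },
];

// Fold the event stream into the aggregate's current state.
function rebuildUser(events) {
  return events.reduce((state, e) => {
    switch (e.event_type) {
      case 'user_created':
        return { ...e.event_data };
      case 'email_changed':
        return { ...state, email: e.event_data.email };
      default:
        return state; // ignore event types this reducer doesn't know
    }
  }, null);
}
```

Because state is derived, you get history and point-in-time reconstruction for free; the cost is that every reader needs a reducer like this (or a cached snapshot).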
5. Materialized View Pattern
Pre-compute expensive queries:
CREATE TABLE user_stats (
user_id INT PRIMARY KEY,
post_count INT,
comment_count INT,
last_activity TIMESTAMP,
updated_at TIMESTAMP
);
-- Refresh the stats table (shown as a trigger function; for efficiency
-- a real trigger would restrict this to the affected user rather than
-- recomputing every row)
CREATE OR REPLACE FUNCTION refresh_user_stats()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO user_stats (user_id, post_count, comment_count, last_activity)
SELECT
u.id,
COUNT(DISTINCT p.id) as post_count,
COUNT(DISTINCT c.id) as comment_count,
GREATEST(MAX(p.created_at), MAX(c.created_at)) as last_activity
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
LEFT JOIN comments c ON u.id = c.user_id
GROUP BY u.id
ON CONFLICT (user_id) DO UPDATE SET
post_count = EXCLUDED.post_count,
comment_count = EXCLUDED.comment_count,
last_activity = EXCLUDED.last_activity,
updated_at = NOW();
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Indexing Strategies
Composite Indexes
Create indexes that match your query patterns:
-- For queries: WHERE user_id = ? AND status = ? ORDER BY created_at
CREATE INDEX idx_user_status_created
ON orders(user_id, status, created_at DESC);
-- For queries: WHERE category = ? AND price BETWEEN ? AND ?
CREATE INDEX idx_category_price
ON products(category, price);
-- Covering index (includes all needed columns)
CREATE INDEX idx_orders_covering
ON orders(user_id, status) INCLUDE (total, created_at);
Partial Indexes
Index only relevant rows:
-- Index only active users
CREATE INDEX idx_active_users
ON users(email) WHERE deleted_at IS NULL;
-- Index recent orders
-- (a partial index predicate must be immutable, so NOW() is not
-- allowed here; use a fixed cutoff and recreate the index periodically)
CREATE INDEX idx_recent_orders
ON orders(created_at) WHERE created_at > '2024-01-01';
Functional Indexes
Index computed values:
-- Case-insensitive search
CREATE INDEX idx_users_email_lower
ON users(LOWER(email));
-- Search by first character
CREATE INDEX idx_users_name_first
ON users(LEFT(name, 1));
Scaling Patterns
Read Replicas
Separate read and write operations:
// Node.js example with read replicas
const { Pool } = require('pg');
const writePool = new Pool({
host: 'primary-db.example.com',
database: 'app',
});
const readPool = new Pool({
host: 'replica-db.example.com',
database: 'app',
});
// Write operations
async function createUser(userData) {
const result = await writePool.query(
'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING *',
[userData.name, userData.email]
);
return result.rows[0];
}
// Read operations
async function getUserById(id) {
const result = await readPool.query(
'SELECT * FROM users WHERE id = $1',
[id]
);
return result.rows[0];
}
Sharding Strategy
Distribute data across multiple databases:
// Shard by user ID (simple modulo scheme; note that changing
// shardCount later means re-distributing existing data)
function getShard(userId) {
const shardCount = 4;
return userId % shardCount;
}
// Route to appropriate shard
async function getUserPosts(userId) {
const shard = getShard(userId);
const pool = shardPools[shard];
const result = await pool.query(
'SELECT * FROM posts WHERE user_id = $1',
[userId]
);
return result.rows;
}
CQRS Pattern
Separate read and write models:
// Write model (command)
class UserCommandService {
async createUser(userData) {
const user = await this.writeDb.save(userData);
// Emit event to update read model
await this.eventBus.emit('user_created', user);
return user;
}
}
// Read model (query)
class UserQueryService {
async getUser(id) {
return this.readDb.findById(id);
}
async searchUsers(criteria) {
return this.readDb.find(criteria);
}
}
// Event handler to update read model
class UserEventHandler {
async handleUserCreated(event) {
const readUser = {
id: event.id,
name: event.name,
email: event.email,
created_at: event.created_at
};
await this.readDb.save(readUser);
}
}
Data Migration Patterns
Blue-Green Deployment
Zero-downtime schema changes:
-- Step 1: Add new column
ALTER TABLE users ADD COLUMN new_email VARCHAR(100);
-- Step 2: Update application to write to both columns
-- (deploy new version)
-- Step 3: Backfill data
UPDATE users SET new_email = email WHERE new_email IS NULL;
-- Step 4: Switch reads to new column
-- (deploy update)
-- Step 5: Remove old column
ALTER TABLE users DROP COLUMN email;
ALTER TABLE users RENAME COLUMN new_email TO email;
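Step 2, the dual-write phase, is the part that lives in application code. A sketch of what that write might look like, with the query shape being illustrative:

```javascript
// During the migration window, every write lands in both columns so
// live traffic and the backfill converge on the same data.
function buildDualWrite(userId, email) {
  return {
    text: 'UPDATE users SET email = $1, new_email = $1 WHERE id = $2',
    values: [email, userId],
  };
}
```

Once step 4 has switched reads over and the backfill is verified, this dual write is removed along with the old column.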
Feature Flags
Control schema changes with flags:
class UserRepository {
async findByEmail(email) {
if (this.featureFlags.isEnabled('new_email_schema')) {
return this.db.query(
'SELECT * FROM users_new_schema WHERE email = $1',
[email]
);
} else {
return this.db.query(
'SELECT * FROM users WHERE email = $1',
[email]
);
}
}
}
Performance Optimization
Query Optimization
-- Use EXPLAIN to analyze queries
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.name
HAVING COUNT(o.id) > 5;
-- Common table expressions for readability
WITH user_orders AS (
SELECT
user_id,
COUNT(*) as order_count,
MAX(created_at) as last_order
FROM orders
GROUP BY user_id
)
SELECT u.name, u.email, o.order_count, o.last_order
FROM users u
JOIN user_orders o ON u.id = o.user_id
WHERE o.order_count > 10;
Connection Pooling
; PgBouncer configuration (pgbouncer.ini)
[databases]
app = host=localhost port=5432 dbname=app
[pgbouncer]
listen_port = 6432
listen_addr = 127.0.0.1
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
logfile = /var/log/pgbouncer/pgbouncer.log
admin_users = postgres
stats_users = stats, postgres
# Pool settings
pool_mode = transaction
max_client_conn = 100
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 5
max_db_connections = 50
max_user_connections = 50
Security Patterns
Row Level Security
-- Enable RLS on table
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Policy: Users can only see their own orders
-- (current_user_id() is assumed to be a helper you define, e.g. one
-- that reads the authenticated user's id from a session setting)
CREATE POLICY user_orders_policy ON orders
FOR ALL TO authenticated_users
USING (user_id = current_user_id());
-- Policy: Admins can see all orders
CREATE POLICY admin_orders_policy ON orders
FOR ALL TO admin_users
USING (true);
Data Encryption
// Application-level encryption (AES-256-GCM with a random IV per value)
const crypto = require('crypto');
class EncryptedField {
  constructor(secretKey) {
    this.algorithm = 'aes-256-gcm';
    this.secretKey = secretKey; // must be 32 bytes for aes-256
  }
  encrypt(text) {
    const iv = crypto.randomBytes(12); // 96-bit IV is standard for GCM
    const cipher = crypto.createCipheriv(this.algorithm, this.secretKey, iv);
    cipher.setAAD(Buffer.from('additional-data'));
    let encrypted = cipher.update(text, 'utf8', 'hex');
    encrypted += cipher.final('hex');
    const authTag = cipher.getAuthTag();
    return {
      encrypted,
      iv: iv.toString('hex'),
      authTag: authTag.toString('hex')
    };
  }
  decrypt(encryptedData) {
    const decipher = crypto.createDecipheriv(
      this.algorithm,
      this.secretKey,
      Buffer.from(encryptedData.iv, 'hex')
    );
    decipher.setAAD(Buffer.from('additional-data'));
    decipher.setAuthTag(Buffer.from(encryptedData.authTag, 'hex'));
    let decrypted = decipher.update(encryptedData.encrypted, 'hex', 'utf8');
    decrypted += decipher.final('utf8');
    return decrypted;
  }
}
Best Practices
Schema Design Guidelines
- Use appropriate data types - Don't use VARCHAR for everything
- Add constraints - Ensure data integrity at the database level
- Index strategically - Don't over-index
- Plan for growth - Consider future requirements
- Document your schema - Use comments and diagrams
Query Writing Guidelines
- Avoid SELECT * - Specify only needed columns
- Use parameterized queries - Prevent SQL injection
- Limit result sets - Use pagination
- Batch operations - Reduce round trips
- Monitor performance - Use query logs
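For the pagination point above, keyset (cursor) pagination tends to scale better than OFFSET, which re-scans every skipped row. A sketch of building such a query, with the column names being illustrative:

```javascript
// Keyset pagination: page by "rows older than the last row seen"
// using a (created_at, id) cursor instead of OFFSET.
function nextPageQuery(cursor, pageSize = 50) {
  if (cursor) {
    return {
      text: `SELECT id, title, created_at FROM posts
             WHERE (created_at, id) < ($1, $2)
             ORDER BY created_at DESC, id DESC LIMIT $3`,
      values: [cursor.created_at, cursor.id, pageSize],
    };
  }
  return {
    text: `SELECT id, title, created_at FROM posts
           ORDER BY created_at DESC, id DESC LIMIT $1`,
    values: [pageSize],
  };
}
```

The id tiebreaker keeps the ordering total, so rows sharing a created_at value are never skipped or repeated between pages.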
Conclusion
Database design is both an art and a science. Understanding these patterns helps you make informed decisions about data architecture. Remember that the best pattern depends on your specific requirements, scale, and constraints.
Start with simple, normalized designs and evolve them as your understanding of the domain grows. The key is to remain flexible and willing to refactor when patterns no longer serve your needs.
Good database design isn't about finding the perfect schema—it's about creating a foundation that supports your application's growth and evolution.