PostgreSQL Scales Further Than You Think — Here is How to Push It
Practical strategies for scaling PostgreSQL before reaching for distributed databases or microservice rewrites
There's a moment in every growing project where someone says, "We need to move off Postgres." Usually it's after a few slow queries show up in the logs, or the connection count starts climbing. The instinct is to reach for something exotic — CockroachDB, Vitess, a managed NoSQL thing with a slick landing page.
Hold on. PostgreSQL can handle far more than most teams ever ask of it. Before you introduce operational complexity that your team isn't ready for, let's talk about what you can actually do with the database you already have.
Connection pooling: the first bottleneck you'll hit
PostgreSQL forks a process per connection. That's fine for 50 connections. At 500, your server starts sweating. At 2000, it's on fire.
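A back-of-envelope calculation makes the problem concrete. The per-backend memory figure below is an assumption (real overhead varies with workload and work_mem); the shape of the curve is the point:

```python
# Rough memory cost of process-per-connection.
# ~10 MB per backend is an assumed ballpark figure, not a measured
# constant; work_mem allocations can push real usage much higher.

def backend_memory_gb(connections: int, mb_per_backend: float = 10.0) -> float:
    """Approximate RAM consumed by PostgreSQL backend processes."""
    return connections * mb_per_backend / 1024

for n in (50, 500, 2000):
    print(f"{n} connections ≈ {backend_memory_gb(n):.1f} GB")
```

At 2000 connections that's roughly 20 GB of RAM spent on process overhead alone, before any query does useful work.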
PgBouncer solves this. It sits in front of Postgres and multiplexes connections. Your app thinks it has 200 connections; Postgres only sees 20.
; pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
The key setting is pool_mode = transaction. Connections are returned to the pool after each transaction, not after each session. It's far more efficient, but it means session state (prepared statements, SET values, advisory locks) doesn't survive across transactions, so you can't rely on prepared statements spanning transactions unless you enable max_prepared_statements in PgBouncer 1.21+ or switch to pgcat.
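To make the multiplexing concrete, here's a toy model of transaction-mode pooling (a conceptual sketch, not PgBouncer's actual implementation): connections are checked out per transaction, so a handful of server connections serves many clients, and any session-scoped state is lost between checkouts.

```python
# Toy illustration of transaction pooling: many clients, few server
# connections. Each "transaction" checks a connection out and returns
# it immediately afterwards. This is a conceptual sketch, not how
# PgBouncer is implemented.
from contextlib import contextmanager
from queue import Queue

class TransactionPool:
    def __init__(self, size: int):
        self._conns = Queue()
        for i in range(size):
            self._conns.put(f"server-conn-{i}")

    @contextmanager
    def transaction(self):
        conn = self._conns.get()   # blocks if all server conns are busy
        try:
            yield conn
        finally:
            self._conns.put(conn)  # returned at COMMIT, not at disconnect

pool = TransactionPool(size=2)
used = []
for client in range(6):            # 6 clients, only 2 server connections
    with pool.transaction() as conn:
        used.append(conn)

print(sorted(set(used)))  # two distinct server connections served all six
```

Because the connection goes back to the pool at commit time, anything you attached to the session is gone the next time you run a query; that's exactly the prepared-statement caveat above.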
One thing people overlook: if you're running on Kubernetes and each pod opens its own connection pool, you can easily end up with thousands of connections even with pooling at the app level. Put PgBouncer between your pods and Postgres, not inside each pod.
Indexing: the 80% solution
Before anything else, look at your queries. Most "scaling problems" are actually "nobody added an index" problems.
-- Find the slowest queries
SELECT
calls,
round(total_exec_time::numeric, 2) AS total_ms,
round(mean_exec_time::numeric, 2) AS avg_ms,
query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Make sure pg_stat_statements is enabled (add it to shared_preload_libraries, restart, then run CREATE EXTENSION pg_stat_statements) — it's the single most valuable extension for performance work.
Once you know what's slow, check if indexes exist:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42 AND status = 'pending';
If you see a sequential scan on a large table, you probably need:
CREATE INDEX CONCURRENTLY idx_orders_customer_status
ON orders (customer_id, status);
CONCURRENTLY is important — it builds the index without blocking writes (a plain CREATE INDEX blocks INSERT, UPDATE, and DELETE for the duration). Skip it and your app might stall. Two caveats: it can't run inside a transaction block, and a failed build leaves behind an invalid index you have to drop.
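If you want to watch the planner flip from a scan to an index lookup without a Postgres instance handy, the stdlib's SQLite shows the same behavior (SQLite is just a stand-in here to make the sketch runnable; use the real EXPLAIN against Postgres for actual tuning):

```python
import sqlite3

# Demonstrates the seq-scan vs. index-lookup distinction using SQLite's
# EXPLAIN QUERY PLAN. SQLite's planner differs from PostgreSQL's, but
# the before/after contrast is the same thing EXPLAIN reveals.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, status TEXT)")

def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the readable step
    return " ".join(row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT * FROM orders WHERE customer_id = 42 AND status = 'pending'"
before = plan(q)  # full table scan: "SCAN orders"

db.execute("CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)")
after = plan(q)   # "SEARCH orders USING INDEX idx_orders_customer_status ..."

print(before)
print(after)
```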
Partial indexes are underused and powerful. If 90% of your queries only care about active records:
CREATE INDEX idx_orders_active ON orders (customer_id)
WHERE status != 'archived';
Smaller index, faster lookups, less disk I/O. One catch: the planner only uses a partial index for queries whose WHERE clause implies the index predicate, so your hot queries need to actually filter out archived rows.
Partitioning: when tables get unwieldy
Once a table crosses tens of millions of rows, even good indexes start to slow down. Partition it.
PostgreSQL's declarative partitioning (available since v10, mature since v12) is straightforward:
CREATE TABLE events (
id bigserial,
created_at timestamptz NOT NULL,
event_type text NOT NULL,
payload jsonb
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_01 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
The query planner automatically prunes partitions it doesn't need. A query filtering on created_at will only scan the relevant month's partition.
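Pruning is easy to picture: each row's created_at maps to exactly one partition, and a time-range filter only touches the partitions that overlap it. A small sketch of that mapping (PostgreSQL does this internally; the names mirror the DDL above):

```python
from datetime import datetime

# Conceptual sketch of range-partition routing and pruning:
# a timestamp maps to one monthly partition, and a time-range
# filter needs only the partitions overlapping that range.

def partition_for(ts: datetime) -> str:
    """Which monthly partition a row with this created_at lands in."""
    return f"events_{ts.year}_{ts.month:02d}"

def partitions_to_scan(start: datetime, end: datetime) -> list[str]:
    """Monthly partitions a filter on [start, end] has to touch."""
    parts, y, m = [], start.year, start.month
    while (y, m) <= (end.year, end.month):
        parts.append(f"events_{y}_{m:02d}")
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return parts

print(partition_for(datetime(2026, 2, 14)))
print(partitions_to_scan(datetime(2026, 1, 15), datetime(2026, 2, 10)))
# The March partition is never touched by a January-February query.
```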
Automate partition creation. Don't create them by hand each month — use pg_partman or a simple cron job:
#!/bin/bash
# Run monthly to create next month's partition
NEXT_MONTH=$(date -d "+1 month" +%Y-%m-01)
MONTH_AFTER=$(date -d "+2 months" +%Y-%m-01)
TABLE="events_$(date -d '+1 month' +%Y_%m)"
psql -d myapp -c "
CREATE TABLE IF NOT EXISTS $TABLE PARTITION OF events
FOR VALUES FROM ('$NEXT_MONTH') TO ('$MONTH_AFTER');
"
A practical note: partitioning makes DELETE obsolete for old data. Instead of deleting millions of rows (which is slow and generates tons of WAL), just drop the old partition:
DROP TABLE events_2025_01;
Instant. No vacuum needed.
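A retention job then reduces to name arithmetic. This sketch (the events_YYYY_MM naming follows the convention above; the helper itself is hypothetical) picks which partitions have aged out:

```python
from datetime import date

# Pick monthly partitions older than a retention window to drop.
# Assumes the events_YYYY_MM naming convention used in this article.

def partitions_to_drop(existing: list[str], today: date, keep_months: int) -> list[str]:
    def month_index(name: str) -> int:
        # "events_2025_01" -> absolute month number for easy comparison
        y, m = name.removeprefix("events_").split("_")
        return int(y) * 12 + int(m)

    cutoff = today.year * 12 + today.month - keep_months
    return [p for p in existing if month_index(p) < cutoff]

parts = ["events_2025_01", "events_2025_11", "events_2025_12", "events_2026_01"]
print(partitions_to_drop(parts, today=date(2026, 1, 15), keep_months=3))
# Each returned name becomes one fast DROP TABLE.
```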
Read replicas: split the load
Your app probably reads way more than it writes. Streaming replication in PostgreSQL is rock solid and simple to set up.
The architecture: one primary handles writes, one or more replicas handle reads. Your app routes queries accordingly.
At the application level, this can be as simple as two connection strings:
import psycopg2
write_conn = psycopg2.connect("host=primary dbname=myapp")
read_conn = psycopg2.connect("host=replica1 dbname=myapp")
def get_orders(customer_id):
# Reads go to replica
with read_conn.cursor() as cur:
cur.execute("SELECT * FROM orders WHERE customer_id = %s", (customer_id,))
return cur.fetchall()
def place_order(customer_id, items):
# Writes go to primary
with write_conn.cursor() as cur:
cur.execute(
"INSERT INTO orders (customer_id, items) VALUES (%s, %s)",
(customer_id, items)
)
write_conn.commit()
The caveat is replication lag. Writes to the primary take a few milliseconds to appear on replicas. If a user creates a record and immediately reads it back, they might not see it. Solutions:
- Read-your-own-writes: route reads to the primary for a short window after a write
- Synchronous replication: eliminates lag but adds latency to writes — usually not worth it
- Just accept it: for dashboards, reports, search results, a few ms of lag rarely matters
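The first option, read-your-own-writes, is a small piece of routing logic. A sketch (the 2-second pin window is an assumption; size it to your observed replication lag):

```python
import time

# Read-your-own-writes routing: after a user writes, pin that user's
# reads to the primary until replicas have likely caught up.
# The pin window below is an assumed value, not a recommendation.

class Router:
    def __init__(self, pin_seconds: float = 2.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock                       # injectable for testing
        self._last_write: dict[str, float] = {}

    def record_write(self, user_id: str) -> None:
        self._last_write[user_id] = self.clock()

    def target(self, user_id: str) -> str:
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"   # recent write: read it back from the primary
        return "replica"       # otherwise offload to a replica

now = [100.0]                                    # fake clock for the demo
router = Router(pin_seconds=2.0, clock=lambda: now[0])
router.record_write("alice")
print(router.target("alice"))  # just wrote: routed to primary
print(router.target("bob"))    # no recent write: routed to replica
now[0] += 5
print(router.target("alice"))  # window expired: back to replica
```

In production you'd key the map by session or user ID from your request context and evict old entries; the routing decision itself stays this simple.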
TOAST and large values
If you're storing large JSON blobs or text, PostgreSQL automatically compresses and stores them out-of-line using TOAST (The Oversized-Attribute Storage Technique). This is mostly transparent, but it means wide rows don't slow down scans of narrow columns.
However, if you're doing SELECT * on tables with large JSONB columns, you're decompressing data you might not need. Be explicit:
-- Don't do this
SELECT * FROM events WHERE created_at > now() - interval '1 hour';
-- Do this
SELECT id, event_type, created_at FROM events WHERE created_at > now() - interval '1 hour';
Materialized views for expensive aggregations
Some queries are inherently expensive. Monthly revenue reports, leaderboard calculations, aggregations across millions of rows. Don't run them on every request.
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT
date_trunc('month', created_at) AS month,
sum(amount) AS total_revenue,
count(*) AS order_count
FROM orders
WHERE status = 'completed'
GROUP BY 1;
-- Refresh it periodically (e.g., every hour via cron)
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;
CONCURRENTLY lets the view stay readable while refreshing. You need a unique index on the materialized view for this to work:
CREATE UNIQUE INDEX idx_monthly_revenue_month ON monthly_revenue (month);
This is simpler and more reliable than maintaining a separate cache layer for most use cases.
Vacuum and autovacuum tuning
PostgreSQL's MVCC means dead rows accumulate. Autovacuum cleans them up, but the defaults are conservative. On write-heavy tables, you may need to be more aggressive:
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.01, -- default 0.2
autovacuum_analyze_scale_factor = 0.005, -- default 0.1
autovacuum_vacuum_cost_delay = 2 -- default 2ms (was 20ms in older versions)
);
For large tables, the default "vacuum when 20% of rows are dead" means a 100-million-row table can accumulate 20 million dead rows before vacuum kicks in. Lowering the scale factor fixes this.
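The trigger condition is worth writing down: autovacuum vacuums a table when dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. That makes the effect of the scale factor easy to sanity-check:

```python
# PostgreSQL's autovacuum trigger condition:
#   n_dead_tup > autovacuum_vacuum_threshold
#                + autovacuum_vacuum_scale_factor * reltuples
# (threshold defaults to 50, scale factor to 0.2)

def vacuum_trigger(reltuples: int, scale_factor: float, threshold: int = 50) -> int:
    """Dead-tuple count at which autovacuum kicks in for a table."""
    return int(threshold + scale_factor * reltuples)

rows = 100_000_000
print(f"default (0.20): {vacuum_trigger(rows, 0.2):,} dead rows")
print(f"tuned   (0.01): {vacuum_trigger(rows, 0.01):,} dead rows")
```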
Monitor it:
SELECT
relname,
n_dead_tup,
n_live_tup,
round(n_dead_tup::numeric / greatest(n_live_tup, 1) * 100, 2) AS dead_pct,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
What's actually overrated
Sharding at the application level. If you're not at the scale of Notion or Instagram (both PostgreSQL users, by the way), manual sharding adds enormous complexity. You now need to handle cross-shard queries, migrations, rebalancing — and your team of five isn't ready for that.
Switching to NoSQL for performance. I've seen teams move from PostgreSQL to MongoDB because "it's faster," then spend months rebuilding the query capabilities they lost. PostgreSQL with proper indexing is fast enough for most workloads.
Citus/distributed PostgreSQL too early. Citus is great when you genuinely need horizontal write scaling. But it changes how you model data (everything needs a distribution key), and some queries become impossible or slow. Try everything else first.
The practical scaling ladder
Here's roughly the order I'd recommend:
1. Enable pg_stat_statements and fix your queries. This alone solves most problems.
2. Add proper indexes (including partial and covering indexes).
3. Set up PgBouncer for connection pooling.
4. Tune work_mem, shared_buffers, and effective_cache_size for your hardware. Use PGTune as a starting point.
5. Add read replicas to offload read traffic.
6. Partition large tables that grow unboundedly.
7. Use materialized views for expensive aggregations.
8. Only then consider Citus, sharding, or alternative databases.
Most applications will never get past step 5. And that's fine — it means PostgreSQL is doing its job. The database that's boring and reliable is exactly the one you want running your production workload.