How to Upgrade PostgreSQL Without Losing Your Data

A practical guide to upgrading PostgreSQL major versions with minimal downtime, covering pg_upgrade, logical replication, and the gotchas nobody warns you about.

#PostgreSQL #Linux #Docker

There's a special kind of dread that comes with upgrading PostgreSQL. Your data is right there, humming along happily on version 14, and someone on the team drops the bomb: "We need to get to 16." And suddenly you're reading docs at 11 PM, wondering if pg_upgrade is going to eat your tables.

I've done this dance enough times to have opinions. Here's what actually works, what to watch out for, and the approach I'd pick depending on how much downtime you can afford.

Why Major Version Upgrades Are a Big Deal

Minor version upgrades (14.1 → 14.2) are painless — the on-disk format stays the same, you just swap binaries and restart. Major version upgrades (14 → 16) are a different beast entirely. The internal data storage format changes between majors, so you can't just point the new binary at the old data directory and hope for the best.
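The distinction is mechanical enough to script. A minimal sketch (version strings are illustrative) that classifies an upgrade by comparing the leading version component, since PostgreSQL 10+ numbers releases MAJOR.MINOR:

```shell
# Classify an upgrade as major or minor by comparing the leading
# version component (PostgreSQL >= 10 numbers releases MAJOR.MINOR).
old_version="14.2"   # illustrative
new_version="16.1"   # illustrative

old_major=${old_version%%.*}
new_major=${new_version%%.*}

if [ "$old_major" = "$new_major" ]; then
  echo "minor upgrade: swap binaries and restart"
else
  echo "major upgrade: on-disk format changes, plan a migration"
fi
```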

You have three real options:

  1. pg_dump / pg_restore — the classic, slow, reliable approach
  2. pg_upgrade — the faster in-place(ish) upgrade
  3. Logical replication — the near-zero-downtime option

Let's break each one down.

Option 1: pg_dump and pg_restore

This is the "nobody ever got fired for choosing IBM" approach. Dump everything out, bring up the new version, load it back in.

# Dump the entire cluster
pg_dumpall -U postgres -h localhost -p 5432 > full_dump.sql

# Stop old server, start new one on the same port
sudo systemctl stop postgresql@14-main
sudo systemctl start postgresql@16-main

# Restore
psql -U postgres -f full_dump.sql

Or if you want parallelism and compression (you do), dump per-database in the directory format — parallel dump (-j) only works with -Fd, and the output goes to a directory rather than stdout:

pg_dump -U postgres -Fd -j 4 -f myapp.dump -d myapp
pg_restore -U postgres -j 4 -d myapp myapp.dump

The good: Dead simple. Works every time. You get a clean, freshly-organized database.

The bad: Your database is fully offline for the entire dump+restore cycle. For a 500 GB database, that can be hours. If you're running a SaaS product, this is probably not going to fly.

When to use it: Small databases (under ~50 GB), dev/staging environments, or situations where you can schedule a generous maintenance window.

Option 2: pg_upgrade (The Practical Middle Ground)

pg_upgrade is the tool most people should reach for first. It upgrades in place by transferring the data files directly — either by copying them or, better yet, by hard-linking them.

Here's the workflow:

# Install the new version alongside the old one
sudo apt install postgresql-16

# Stop both clusters
sudo systemctl stop postgresql@14-main
sudo systemctl stop postgresql@16-main

# Run pg_upgrade in check mode first (always do this)
sudo -u postgres /usr/lib/postgresql/16/bin/pg_upgrade \
  --old-datadir=/var/lib/postgresql/14/main \
  --new-datadir=/var/lib/postgresql/16/main \
  --old-bindir=/usr/lib/postgresql/14/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --check

# If checks pass, do it for real with --link for speed
sudo -u postgres /usr/lib/postgresql/16/bin/pg_upgrade \
  --old-datadir=/var/lib/postgresql/14/main \
  --new-datadir=/var/lib/postgresql/16/main \
  --old-bindir=/usr/lib/postgresql/14/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --link

The --link flag is key — instead of copying data files, it creates hard links. This turns what could be a 2-hour copy operation into a matter of seconds. The tradeoff is that you can't go back to the old cluster once you start the new one (the files are shared), so take a filesystem snapshot or backup first.
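To see why --link is both fast and one-way, here's a filesystem-level sketch (plain files standing in for data files; stat -c assumes GNU coreutils, which fits the Debian/Ubuntu setup above):

```shell
# A hard link is a second directory entry for the same inode:
# creating it copies nothing, which is why --link is near-instant.
tmpdir=$(mktemp -d)
echo "table data" > "$tmpdir/old_cluster_file"
ln "$tmpdir/old_cluster_file" "$tmpdir/new_cluster_file"

# Same inode number: the two names are the same file on disk
stat -c '%i' "$tmpdir/old_cluster_file"
stat -c '%i' "$tmpdir/new_cluster_file"

# Writing through one name changes the other -- once the new cluster
# modifies shared files, the old cluster's copy is gone too.
echo "rewritten by new cluster" > "$tmpdir/new_cluster_file"
cat "$tmpdir/old_cluster_file"   # prints: rewritten by new cluster

rm -r "$tmpdir"
```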

After the upgrade:

# Start the new cluster
sudo systemctl start postgresql@16-main

# Run the recommended analyze
sudo -u postgres /usr/lib/postgresql/16/bin/vacuumdb \
  --all --analyze-in-stages

That --analyze-in-stages step is important. Without it, the query planner has no statistics and your first few hours of queries will have terrible plans. It runs three passes with increasing accuracy so you get reasonable performance quickly.
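Under the hood, the three stages are roughly equivalent to running this in each database (a sketch of the documented behavior):

```sql
-- Stage 1: minimal statistics, fastest -- queries get usable plans quickly
SET default_statistics_target = 1;
ANALYZE;

-- Stage 2: rough statistics
SET default_statistics_target = 10;
ANALYZE;

-- Stage 3: full statistics at the server's default target
RESET default_statistics_target;
ANALYZE;
```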

The good: Fast (minutes, not hours). Reliable. Built into PostgreSQL itself.

The bad: Still requires downtime — both clusters need to be stopped. Usually 5–15 minutes for the upgrade itself, but plan for 30+ if you're cautious.

Watch out for:

  • Extensions that aren't available for the new version will block the upgrade. The --check flag catches this.
  • Custom postgresql.conf settings don't carry over. Diff your configs manually.
  • If you're using tablespaces, the paths need to be set up correctly on both sides.
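For the config diff, something like this works (a sketch with throwaway files standing in for the real /etc/postgresql/{14,16}/main/postgresql.conf; point the grep at your actual paths):

```shell
# Compare only the effective (non-comment) settings of two configs.
# Throwaway example files stand in for real postgresql.conf paths.
tmpdir=$(mktemp -d)
printf 'shared_buffers = 4GB\nmax_connections = 200\n' > "$tmpdir/14.conf"
printf 'shared_buffers = 128MB\nmax_connections = 100\n' > "$tmpdir/16.conf"

# Strip comments and blank lines, sort, then diff what's actually set
grep -Ev '^[[:space:]]*(#|$)' "$tmpdir/14.conf" | sort > "$tmpdir/14.effective"
grep -Ev '^[[:space:]]*(#|$)' "$tmpdir/16.conf" | sort > "$tmpdir/16.effective"
diff "$tmpdir/14.effective" "$tmpdir/16.effective" || true

rm -r "$tmpdir"
```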

Option 3: Logical Replication (Near-Zero Downtime)

If you genuinely can't afford more than a few seconds of downtime, logical replication is how you get there. The idea is simple: run old and new versions side by side, replicate data in real time, then cut over.

On the old server (publisher):

-- Make sure wal_level is set to 'logical' (requires restart if changed)
ALTER SYSTEM SET wal_level = 'logical';

-- Create the role the subscriber will connect as; it also needs
-- SELECT on the published tables for the initial data copy
CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD 'secret';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replicator;

-- Create a publication for all tables
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

On the new server (subscriber):

# First, copy the schema (logical replication doesn't handle DDL)
pg_dump -U postgres -s -d myapp | psql -U postgres -d myapp -h new-server

-- Then create a subscription pointing to the old server
CREATE SUBSCRIPTION upgrade_sub
  CONNECTION 'host=old-server dbname=myapp user=replicator password=secret'
  PUBLICATION upgrade_pub;

Once the initial data sync completes and the replication lag drops to zero, you cut over:

# Verify replication is caught up
psql -U postgres -c "SELECT * FROM pg_stat_subscription;"

# Point your application to the new server
# (update connection strings, DNS, pgbouncer config, etc.)

# On the new server, drop the subscription
psql -U postgres -c "DROP SUBSCRIPTION upgrade_sub;"
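"Caught up" is easiest to verify from the old server's side, where pg_stat_replication shows how far each subscriber lags in bytes (a sketch; zero, or near-zero and shrinking, is what you want before flipping traffic):

```sql
-- On the old server: byte lag between what we've written and what
-- each logical replication subscriber has replayed
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
FROM pg_stat_replication;
```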

The good: Your app stays up throughout. You can validate the new server with real traffic before cutting over.

The bad: It's significantly more complex to set up. You need to handle:

  • Sequences (they don't replicate — you need to manually sync them before cutover)
  • Large objects (not supported by logical replication)
  • DDL changes during the migration window (you need to apply them to both sides)
  • Schema differences and extension compatibility

Syncing sequences is the thing that bites people most often:

-- On the old server, grab current sequence values
SELECT sequencename, last_value
FROM pg_sequences
WHERE schemaname = 'public';

-- On the new server, set them (with some headroom)
SELECT setval('users_id_seq', 158432 + 1000);
SELECT setval('orders_id_seq', 891204 + 1000);

Adding a buffer prevents ID collisions if a few writes sneak in during cutover.
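Rather than hand-typing a setval per sequence, you can have the old server generate the statements for you (a sketch; the +1000 headroom is the same arbitrary buffer as above):

```sql
-- Emit one ready-to-run setval statement per sequence in public;
-- sequences that were never used have a NULL last_value, so skip them
SELECT format('SELECT setval(%L, %s);',
              schemaname || '.' || sequencename,
              last_value + 1000)
FROM pg_sequences
WHERE schemaname = 'public'
  AND last_value IS NOT NULL;
```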

If You're Running PostgreSQL in Containers

The concepts are the same, but the mechanics shift a bit. With Docker or Kubernetes, you're typically swapping image tags rather than installing packages side by side.

A common pattern with Docker Compose:

services:
  postgres-old:
    image: postgres:14
    volumes:
      - pgdata:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  postgres-new:
    image: postgres:16
    volumes:
      - pgdata-new:/var/lib/postgresql/data
    ports:
      - "5433:5432"

You can then use pg_dump from one container to the other, or set up logical replication between them. For pg_upgrade, you'd typically need both sets of binaries in the same container, which means building a custom image — it works but it's fiddly.

In Kubernetes with operators like CloudNativePG or Zalando's postgres-operator, major upgrades usually involve spinning up a new cluster on the target version and using either pg_dump or logical replication to migrate. The operator handles the replica management, but the actual data migration is still on you.

My Honest Take

For most teams, pg_upgrade with --link is the sweet spot. It's fast, it's battle-tested, and the downtime is short enough that a maintenance window at 3 AM covers it. I've used it to upgrade databases in the hundred-gigabyte range with under 20 minutes of total downtime.

Logical replication is powerful but it's also where I've seen the most "oh no" moments — forgotten sequences, schema drift, extensions that behave differently. Use it when you genuinely need near-zero downtime and you have the engineering bandwidth to test it thoroughly.

And whatever you do:

  1. Test the upgrade on a copy first. Always. Restore a backup to a throwaway server and run the full upgrade procedure. This catches 95% of surprises.
  2. Take a backup right before you start. Even if you took one yesterday.
  3. Don't skip the analyze step. Bad query plans after an upgrade are a classic "everything is on fire but nothing is technically broken" scenario.
  4. Read the release notes for every major version you're skipping. If you're jumping from 13 to 16, read the notes for 14, 15, and 16. Breaking changes stack up.

PostgreSQL upgrades aren't glamorous work, but they don't have to be scary either. Pick the right tool for your situation, test it first, and you'll be fine.