Skip to main content

Command Palette

Search for a command to run...

Zero‑Downtime PostgreSQL Migration Using Logical Replication

Updated
4 min read
Zero‑Downtime PostgreSQL Migration Using Logical Replication

Migrating a production database is often a high-stakes operation. Traditional "dump and restore" methods require significant maintenance windows, leading to service downtime. To solve this, PostgreSQL offers Logical Replication, a native mechanism that allows data to stream from a source to a destination in real-time.

How Logical Replication Works

At its core, logical replication leverages the Write-Ahead Log (WAL). While physical replication copies disk blocks, logical replication decodes the WAL into row-level changes (INSERTs, UPDATEs, and DELETEs).

  1. The Publisher: On the source database, you define a Publication. This serves as a set of change sets for specific tables.

  2. The Subscriber: On the destination database, you create a Subscription. The subscriber connects to the publisher, pulls an initial snapshot of the data, and then continuously "replays" ongoing changes.

  3. The Replication Slot: The publisher uses a replication slot to ensure that WAL files are not deleted until the subscriber has confirmed receipt. This prevents data loss during transient network failures.

Step 1: Configuring the Source (Publisher)

Before starting, ensure your wal_level is set to logical in your postgresql.conf. For managed db this can be done by setting some flag from your cloud provider database management page.

To begin, connect to your source database and create a publication. You can choose to replicate the entire database or specific tables:

-- create publication for all tables
CREATE PUBLICATION <publication_name> FOR ALL TABLE;

-- create publication only for some tables
CREATE PUBLICATION <publication_name> FOR TABLE tableA, tableB, tableC;

-- Digital Ocean Managed Postgres Instance
SELECT aiven_extras.pg_create_publication_for_all_tables('<publication_name>', 'INSERT,UPDATE,DELETE,TRUNCATE')

-- Verify the publication
SELECT * FROM pg_publication;

Step 2: Preparing the Destination (Subscriber)

Important: Logical replication does not replicate schema definitions (DDL). You must ensure the table structures exist on the destination database before starting the subscription.

Once the schema is ready, connect to the destination and initiate the subscription:

CREATE SUBSCRIPTION <subscription_name_here> 
    CONNECTION 'dbname=<database_name_here> host=localhost port=5432 sslmode=disabled user=postgres password=<password_here>' 
    PUBLICATION <publication_name> WITH (slot_name = '<slot_name_here>', create_slot = true, copy_data = true);

Managed Service Nuances

If you are using managed PostgreSQL services, the standard CREATE SUBSCRIPTION command may be restricted due to permissions:

SELECT * FROM aiven_extras.pg_create_subscription(
    '<subscription_name_here>', 
    'dbname=<database_name_here> host=localhost port=5432 sslmode=disabled user=postgres password=<password_here>', 
    '<publication_name>', 
    '<slot_name_here>', 
    TRUE, TRUE
);
  • Google Cloud SQL: Ensure the cloudsql.logical_decoding flag is set to on.

Step 3: Monitoring the Migration

The migration happens in two phases: the initial sync (copying existing data) and streaming (applying delta changes). You can monitor the progress with these queries:

SELECT 
    n.nspname || '.' || c.relname AS table_name,
    sr.srsubstate AS state,
    CASE sr.srsubstate
        WHEN 'i' THEN 'initialize'
        WHEN 'd' THEN 'data copying'
        WHEN 's' THEN 'synchronized' 
        WHEN 'f' THEN 'finishedcopy'
        WHEN 'r' THEN 'ready'
    END AS state_description
FROM pg_subscription_rel sr
JOIN pg_class c ON c.oid = sr.srrelid
JOIN pg_namespace n ON n.oid = c.relnamespace;

-- to see the ongoing connections
SELECT * FROM pg_stat_activity;

Monitor Replication Lag

To ensure the destination is caught up before you perform the final cutover, check the lag:

-- for checking current slot progress
SELECT
  slot_name,
  plugin,
  slot_type,
  active,
  confirmed_flush_lsn,
  pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS slot_lag_bytes
FROM
  pg_replication_slots
WHERE
  slot_type = 'logical';

Step 4: Final Cutover and Cleanup

Once the lag is near zero, point your application services to the new destination database. After the migration is confirmed successful, clean up the replication resources to free up system overhead.

  1. Drop the Subscription:

     -- Standard
     DROP SUBSCRIPTION <subscription_name>;
    
     -- for digital ocean managed postgres instance
     SELECT * FROM aiven_extras.pg_drop_subscription('<subscription_name>');
    
     -- If you somehow dropped the publication before removing subscription drop command will get stuck trying to remove slot on publisher
     ALTER SUBSCRIPTION mysub DISABLE;
     ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
    
  2. Drop the Publication (on Source):

     -- Finally drop the publication
     DROP publication <publication_name>;
    
  3. Remove the Slot (if it persists):

     -- slots should be automatically deleted but in case you want to manually delete one
     SELECT pg_drop_replication_slot('<slot_name>');
    

Sequences are Not Replicated!

Logical replication only copies data changes. It does not keep sequences (used for auto-incrementing IDs) in sync. You can use the following query to get the current sequence values

-- Get all the sequences and their respective value
SELECT t.sequence_name, nextval(t.sequence_name::text) - 1 FROM (SELECT sequence_schema, sequence_name
FROM information_schema.sequences) AS t;

-- Set the value
SELECT setval('<sequence_name>', <value_here>);

Conclusion

Logical replication is an elegant solution for moving data with minimal impact on users. By following these steps—Publication, Subscription, and Monitoring—you can migrate gigabytes or terabytes of data while keeping your application online.