Every PostgreSQL migration eventually hits the same fork in the road. The database is small enough to dump and restore in a maintenance window, or it isn't. Once you cross into terabyte territory, or the business can't afford more than a few minutes of downtime, the simple approach stops being an option and you have to think harder. Here are the three paths that actually work, and when each one makes sense.
1. Dump and restore, the reliable baseline
For anything under a couple hundred GB, pg_dump piped into pg_restore is still the cleanest way to move a database. It carries the full schema, data, indexes, and sequences, and on decent hardware a mid-size database finishes in well under an hour. The one flag people forget is parallelism.
# dump in the directory format so restore can run in parallel pg_dump -Fd -j 8 -f /dump/appdb sourcedb # restore with the same worker count on the target pg_restore -Fd -j 8 -d appdb /dump/appdb
The catch shows up with one dominant table. Parallelism here is per-table, so a single 2 TB table becomes one long serial stream while the other seven workers sit idle. That is the moment dump and restore starts feeling slow, and the reason the next two options exist.
2. Native logical replication for a live cutover
When downtime is the real constraint, PostgreSQL's built-in logical replication lets the target keep pace with the source while the application stays online. You publish on the old server, subscribe on the new one, and Postgres copies the initial snapshot and then streams every change until you decide to switch over.
-- on the source (wal_level must be 'logical') CREATE PUBLICATION migration_pub FOR ALL TABLES; -- on the target CREATE SUBSCRIPTION migration_sub CONNECTION 'host=old-db port=5432 dbname=appdb user=repl password=...' PUBLICATION migration_pub;
Two things save you real pain on big datasets. First, tables without a primary key need a REPLICA IDENTITY FULL or a unique index, otherwise updates and deletes won't replicate. Second, that same large table problem returns. Logical replication syncs each table with its own worker, so your biggest table still copies as a single stream. A common workaround is to pre-load the huge table with pg_restore, then attach it to the subscription with copy_data = false so replication only handles the incremental changes.
3. Managed DMS when you want the plumbing handled
Both AWS and Google Cloud offer a managed migration service that wraps logical replication for PostgreSQL, including from on-premises sources. Google's Database Migration Service supports CDC via pglogical and can open multiple parallel subscriptions across tables during the initial load and the ongoing CDC phase. AWS DMS does the same using logical replication slots with the test_decoding or pglogical plugin. Both handle slot management and give you a controlled cutover step.
The shared limitation is worth knowing before you commit: for tables without a primary key, both services replicate the initial snapshot and INSERT operations only. UPDATE and DELETE changes during CDC are dropped and need to be handled separately. Beyond that, parallelism is still table-level on both platforms, so the single dominant large table remains the bottleneck either way.
How to actually choose
Small database and a maintenance window you can spare? Dump and restore, done. Need the source online with minimal downtime and comfortable managing slots yourself? Native logical replication. Want the same low-downtime result without babysitting the replication setup? Reach for the managed DMS on your target cloud. In all three cases the single dominant table is what decides your timeline, so measure it first and plan to pre-seed it on its own rather than hoping a migration tool will parallelize what the engine simply won't.
No comments:
Post a Comment