The AWS Migration Checklist for Moving Without Downtime
Updated June 2026 · 10 min read · by Brian
Most teams approach a cloud move backwards. They obsess over the server lift and treat the data and the cutover as afterthoughts, when those are exactly the two steps that put your business at risk. Moving a stateless web server to EC2 is mechanical. Moving a production database that customers are actively writing to, and then redirecting live traffic without dropping a transaction, is where migrations succeed or fail. This AWS migration checklist is built around that reality. It walks through assessment, strategy selection, data migration, and a cutover plan that keeps you online, with the security, networking, and cost work that turns a successful move into a stable one.
Start With an Honest Assessment and Inventory
You cannot migrate what you have not mapped. Before anyone touches AWS, build a real inventory of every application, server, database, scheduled job, integration, and dependency in your current environment. The goal is not a pretty diagram. It is to surface the things that will bite you mid-migration: the undocumented cron job that emails the finance team, the hardcoded IP address buried in a config file, the legacy app nobody wants to own.
For each workload, capture what it does, what it talks to, how much data it holds, its performance baseline, and its tolerance for downtime. That last point matters more than people expect. A reporting database can be offline for an hour at 2am. An order-processing system cannot. Knowing the real recovery time objective and recovery point objective for each system tells you how much cutover engineering each one actually justifies.
- Inventory every server, database, application, and scheduled job, including the ones nobody documented
- Map dependencies and data flows so you know what breaks when a component moves
- Record a performance baseline (CPU, memory, IOPS, throughput) to size AWS resources accurately
- Define the downtime tolerance, RPO, and RTO for each workload individually
- Flag licensing, compliance, and data residency constraints before they become blockers
- Identify hardcoded IPs, hostnames, and credentials that will not survive a move
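The last item on that list lends itself to a quick automated pass. What follows is a minimal sketch, not a complete scanner: it looks for IPv4 literals in a block of configuration text and reports the line each one sits on. The function name and the sample config are hypothetical, and a real scan would also need to cover hostnames, IPv6, and embedded credentials.

```python
import ipaddress
import re

# Candidate dotted-quad pattern; ipaddress validates each match afterwards.
_IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def find_hardcoded_ipv4(text: str) -> list[tuple[int, str]]:
    """Return (line_number, address) pairs for every valid IPv4 literal in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in _IPV4_CANDIDATE.finditer(line):
            candidate = match.group(0)
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # out-of-range octets such as 300.1.1.1
            hits.append((line_no, candidate))
    return hits


# Hypothetical config file contents, for illustration only.
sample_config = """\
db_host = 10.20.30.40
smtp_relay = mail.internal.example
app_version = 1.2.3
fallback = 300.1.1.1
"""

print(find_hardcoded_ipv4(sample_config))
```

Expect false positives from four-part version strings such as 1.2.3.4, so treat the output as a review list rather than a verdict.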
Pick a Migration Strategy: The 'R's in Plain English
Not every workload should move the same way. AWS describes seven migration strategies, often called the 'R's, and choosing the right one per application keeps the project realistic instead of turning every move into a rewrite. The five below cover most first migrations; the other two are relocate and repurchase.
Rehost, often called lift and shift, means moving a workload to AWS roughly as-is, typically onto EC2 with the same operating system and software. It is the fastest path and the lowest risk to application behavior, because almost nothing changes except where the machine runs. Replatform means making targeted improvements during the move without re-architecting, the classic example being shifting a self-managed database onto Amazon RDS so AWS handles patching, backups, and failover. Refactor means rewriting parts of the application to use cloud-native services, such as breaking a monolith into containers or serverless functions. Refactoring delivers the most long-term value and carries the most cost and risk, so it should be a deliberate choice, not the default. For most first migrations, rehost the bulk, replatform the databases that benefit clearly, and defer refactoring until you are stable on AWS.
- Rehost (lift and shift): move as-is to EC2, fastest and lowest behavioral risk
- Replatform: targeted upgrades like moving a database to RDS without re-architecting
- Refactor: rewrite into cloud-native services for long-term value, highest cost and risk
- Retire workloads that are no longer used instead of paying to migrate them
- Retain anything that should stay put for now, and plan a hybrid connection to it
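One way to keep the per-workload decision consistent across a large inventory is to write the rules down as code. The sketch below is an illustrative first-pass heuristic, not AWS guidance: the `Workload` fields and the order of the checks are assumptions, chosen to reflect the advice above that refactoring should be an explicit decision and rehosting the default.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    RETIRE = "retire"
    RETAIN = "retain"
    REHOST = "rehost"
    REPLATFORM = "replatform"
    REFACTOR = "refactor"


@dataclass(frozen=True)
class Workload:
    # All fields are hypothetical inventory attributes.
    name: str
    in_use: bool = True
    must_stay_put: bool = False            # e.g. a licensing or residency constraint
    is_self_managed_database: bool = False
    refactor_approved: bool = False        # an explicit, deliberate decision


def suggest_strategy(w: Workload) -> Strategy:
    """First-pass suggestion only; a person still reviews every result."""
    if not w.in_use:
        return Strategy.RETIRE
    if w.must_stay_put:
        return Strategy.RETAIN
    if w.refactor_approved:
        return Strategy.REFACTOR
    if w.is_self_managed_database:
        return Strategy.REPLATFORM
    return Strategy.REHOST  # the default for everything else


for workload in (
    Workload("web-frontend"),
    Workload("orders-db", is_self_managed_database=True),
    Workload("old-intranet", in_use=False),
):
    print(workload.name, "->", suggest_strategy(workload).value)
```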
Plan the Data Migration Carefully
Data is the riskiest part of any migration, and it is where most of the engineering effort belongs. The core problem is that your source database keeps changing while you are copying it. A simple export and import works fine for a static dataset, but for a live production system it forces a bad choice: freeze writes for the whole copy and accept a long outage, or keep writing and end up with a stale target. That is why serious migrations separate the initial bulk load from ongoing change replication.
The pattern that avoids downtime is a full load followed by continuous replication. You copy the existing data once, then stream every subsequent change from source to target until the two are in sync and stay in sync. AWS Database Migration Service supports exactly this for many engines, including heterogeneous moves where the source and target are different database platforms. Whatever tool you use, the non-negotiable step is validation: confirm that row counts, checksums, and spot-checked records match before you trust the target with live traffic. Migrating the data is only half the job; proving it arrived intact is the other half.
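To make the validation step concrete, here is a minimal sketch of a row count and checksum comparison. It assumes both tables fit in memory as sequences of rows with a unique key in one column, which is only realistic for small tables or sampled ranges; the function names are hypothetical. It also hashes the Python `repr` of each field, so a real cross-engine comparison would first need to normalize types such as decimals and timestamps.

```python
import hashlib
from typing import Iterable, Sequence

Row = Sequence[object]


def row_digest(row: Row) -> str:
    """Hash one row from the repr() of its fields, joined by a unit separator."""
    payload = "\x1f".join(repr(field) for field in row)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compare_tables(source: Iterable[Row], target: Iterable[Row], key_index: int = 0) -> dict:
    """Compare two row sets by key: counts, missing keys, and checksum mismatches."""
    src = {row[key_index]: row_digest(row) for row in source}
    tgt = {row[key_index]: row_digest(row) for row in target}
    return {
        "source_rows": len(src),
        "target_rows": len(tgt),
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical data: row 3 never arrived and row 2 was altered in transit.
source_rows = [(1, "alice", 100), (2, "bob", 250), (3, "carol", 75)]
target_rows = [(1, "alice", 100), (2, "bob", 205)]

report = compare_tables(source_rows, target_rows)
print(report)
```

Matching counts alone would have caught the missing row but not the altered one, which is why the checksum pass earns its place.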
- Estimate data volume and transfer time honestly, including large objects and historical archives
- Do a full initial load, then enable continuous replication to capture ongoing changes
- Use AWS DMS or native replication tools, especially for cross-engine migrations
- Validate with row counts, checksums, and record-level spot checks before cutover
- Test the restore path, not just the backup, so you can recover if something goes wrong
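For the first item, a back-of-the-envelope estimate is better than a guess. This sketch assumes decimal units (1 GB = 8,000 megabits) and an `efficiency` factor for protocol overhead and link sharing; the 0.7 default is an illustrative assumption, not a measured figure, and should be replaced by what a test transfer actually achieves.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Hours needed to move data_gb gigabytes over a link_mbps megabit/s link."""
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_gb must be >= 0, link_mbps > 0, and 0 < efficiency <= 1")
    megabits = data_gb * 8_000                      # 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * efficiency)   # effective throughput
    return seconds / 3600


# Hypothetical example: a 2 TB database over a 500 Mbps link.
print(round(transfer_hours(2000, 500), 1))  # 12.7
```

If the estimate runs to days rather than hours, that is a signal to plan the full load well ahead of cutover and let replication close the gap.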
Build a Cutover Plan That Avoids Downtime
Cutover is the moment traffic moves from the old environment to AWS, and it is where downtime hides. The principle that prevents an outage is simple: get the target fully ready and synchronized before you redirect a single user. With continuous replication keeping the AWS database current, the actual switch becomes a brief, controlled redirection rather than a frantic copy-and-pray.
Run the old and new environments in parallel for a period so you can compare behavior and catch problems while the original is still serving traffic. When you are confident, shift traffic gradually rather than all at once. DNS-based shifting with a low time-to-live lets you redirect users to AWS and roll back quickly if needed, though you must account for DNS caching: some resolvers and clients hold records longer than the TTL, so a fraction of traffic will lag behind any change. A load balancer or weighted routing gives you finer control, letting you send a small percentage of traffic to the new environment first and watch it before committing the rest. Above all, write the rollback plan before cutover day and rehearse it. A migration without a tested rollback is a bet, not a plan.
- Keep the AWS target continuously synchronized so cutover is a switch, not a copy
- Run old and new in parallel and compare behavior before committing
- Lower DNS TTL ahead of time so traffic shifts and rollbacks propagate fast
- Shift traffic gradually using weighted routing or a load balancer, watching metrics at each step
- Write and rehearse the rollback plan before cutover day, with clear go/no-go criteria
- Schedule the switch for a low-traffic window and freeze unrelated changes around it
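A gradual shift is easier to rehearse when each step is generated rather than typed by hand. The sketch below only builds a dictionary and never calls AWS. The dictionary is assumed to follow the shape of the `ChangeBatch` argument that Route 53's `ChangeResourceRecordSets` API accepts for weighted records, so check it against the current API reference before using it. The hostnames, set identifiers, and step percentages are hypothetical.

```python
def weighted_change_batch(
    record_name: str,
    old_target: str,
    new_target: str,
    new_percent: int,
    ttl: int = 60,
) -> dict:
    """Build a two-record weighted change: new_percent to AWS, the rest to legacy."""
    if not 0 <= new_percent <= 100:
        raise ValueError("new_percent must be between 0 and 100")

    def record(set_id: str, target: str, weight: int) -> dict:
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,  # kept low so shifts and rollbacks propagate fast
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"Shift {new_percent}% of traffic to the new environment",
        "Changes": [
            record("legacy", old_target, 100 - new_percent),
            record("aws", new_target, new_percent),
        ],
    }


# Hypothetical shift plan: watch metrics after each step before taking the next.
for percent in (5, 25, 50, 100):
    batch = weighted_change_batch(
        "app.example.com.", "old.example.net.", "new.example.net.", percent
    )
    print(batch["Comment"])
```

Rollback in this model is the same call with `new_percent=0`, which is one reason to rehearse it with the same tooling you use to move forward.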
Get Security, IAM, and Networking Right
Cloud security failures are rarely exotic. They are usually an over-permissive IAM policy, an open security group, or a public storage bucket that should have been private. Build least-privilege access from the start, because retrofitting it after launch is painful and often gets skipped. Give each service and person only the permissions they need, use roles instead of long-lived access keys wherever possible, and turn on logging so you can answer who did what later.
Networking deserves the same care. Design your VPC, subnets, and security groups deliberately rather than accepting defaults, isolate databases in private subnets with no direct internet exposure, and encrypt data both at rest and in transit. If you are running hybrid during the migration, the connection between your existing environment and AWS needs to be secure and reliable, since your replication stream and possibly live traffic depend on it.
- Apply least-privilege IAM from day one, using roles over long-lived access keys
- Enable CloudTrail and relevant logging so actions are auditable
- Design VPC subnets and security groups intentionally; keep databases in private subnets
- Encrypt data at rest and in transit, and manage keys deliberately
- Secure the hybrid connection that carries replication and any live traffic during migration
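The open security group is the easiest of these mistakes to check for mechanically. The sketch below works on plain dictionaries that are assumed to be shaped like the entries EC2's `DescribeSecurityGroups` call returns (`GroupId`, `IpPermissions`, `IpRanges`, and so on), and makes no AWS calls itself. The port list and the sample groups are illustrative assumptions.

```python
# Ports that should rarely, if ever, be reachable from the whole internet.
SENSITIVE_PORTS = {22: "SSH", 1433: "SQL Server", 3306: "MySQL", 3389: "RDP", 5432: "PostgreSQL"}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_ingress(security_groups: list[dict]) -> list[tuple[str, int, str]]:
    """Return (group_id, port, service) for sensitive ports open to any address."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
            if not OPEN_CIDRS.intersection(cidrs):
                continue  # rule is restricted to specific networks
            protocol = perm.get("IpProtocol")
            if protocol == "-1":  # "all traffic" rules carry no port range
                exposed = sorted(SENSITIVE_PORTS)
            elif protocol == "tcp":
                low, high = perm.get("FromPort"), perm.get("ToPort")
                if low is None or high is None:
                    continue
                exposed = [p for p in sorted(SENSITIVE_PORTS) if low <= p <= high]
            else:
                continue
            for port in exposed:
                findings.append((group.get("GroupId", "unknown"), port, SENSITIVE_PORTS[port]))
    return findings


def _rule(port: int, cidr: str) -> dict:
    return {"IpProtocol": "tcp", "FromPort": port, "ToPort": port, "IpRanges": [{"CidrIp": cidr}]}


# Hypothetical groups: a public web tier, an exposed database, a private database.
sample_groups = [
    {"GroupId": "sg-web", "IpPermissions": [_rule(443, "0.0.0.0/0")]},
    {"GroupId": "sg-db", "IpPermissions": [_rule(5432, "0.0.0.0/0")]},
    {"GroupId": "sg-db-private", "IpPermissions": [_rule(5432, "10.0.0.0/16")]},
]

print(find_open_ingress(sample_groups))
```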
Control Cost After You Land
For EC2 instances and most other provisioned resources, AWS bills for what you provision, not what you use, so a lift-and-shift that mirrors oversized on-premises servers will quietly overspend every month. The fix is not to under-provision before cutover, when stability matters most, but to right-size deliberately once you have real usage data. Watch the workloads for a few weeks, then adjust instance sizes and storage to match actual demand.
Once usage is predictable, commit to it. Savings Plans and Reserved Instances trade a usage commitment for a meaningful discount over on-demand pricing. Set up billing alerts and cost monitoring early so a runaway resource surfaces in days, not at the end of the month. And clean up the migration debris: the temporary instances, extra snapshots, and replication infrastructure that were essential during the move and pure waste afterward.
- Right-size instances and storage after observing real usage, not before cutover
- Use Savings Plans or Reserved Instances once demand is predictable
- Set billing alerts and budgets so overspend surfaces early
- Delete temporary migration resources, idle instances, and orphaned snapshots
- Tag resources consistently so cost is attributable to teams and workloads
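The reason to wait for predictable usage before committing comes down to simple arithmetic. The sketch below uses a deliberately simplified model: on-demand cost scales with the hours you actually run, while a commitment is paid for every hour at a discounted rate. Real Savings Plans and Reserved Instances have more moving parts, and the rate and discount here are made-up numbers, not AWS prices.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def monthly_costs(on_demand_rate: float, utilization: float, discount: float) -> tuple[float, float]:
    """Return (on_demand_cost, committed_cost) for one instance over a month."""
    on_demand = on_demand_rate * HOURS_PER_MONTH * utilization
    committed = on_demand_rate * (1 - discount) * HOURS_PER_MONTH  # paid whether used or not
    return on_demand, committed


def commitment_pays_off(utilization: float, discount: float) -> bool:
    """In this model the commitment wins once utilization exceeds 1 - discount."""
    return utilization > 1 - discount


# Hypothetical figures: $0.20/hour on demand, 30% commitment discount.
steady = monthly_costs(0.20, utilization=0.9, discount=0.30)
bursty = monthly_costs(0.20, utilization=0.5, discount=0.30)
print([round(c, 2) for c in steady], commitment_pays_off(0.9, 0.30))
print([round(c, 2) for c in bursty], commitment_pays_off(0.5, 0.30))
```

With these numbers, a workload running 90% of the time is cheaper under the commitment, and one running half the time is cheaper on demand, which is why the usage data has to come first.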
Validate Before You Call It Done
A migration is not finished when the server boots in AWS. It is finished when you have proven the workload performs correctly under real conditions and you have a way to know when it does not. Walk through the application end to end the way a user would, confirm integrations and scheduled jobs still fire, and check performance against the baseline you captured during assessment. Slower-than-before is a defect, not a quirk to accept.
Set up monitoring and alerting before you decommission anything, so you are watching the new environment rather than assuming it is fine. Keep the old environment available, in a state you can fall back to, for a defined window after cutover. Only when the new environment has run cleanly through a full business cycle should you tear the old one down. That discipline is the difference between a migration that looks done and one that actually is.
- Run end-to-end functional tests as a real user would, including integrations and batch jobs
- Compare performance against the pre-migration baseline and treat regressions as defects
- Stand up monitoring, dashboards, and alerts before decommissioning the source
- Keep the old environment recoverable for a defined window after cutover
- Decommission the source only after a clean full business cycle on AWS
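Treating regressions as defects is easier when the comparison against the baseline is automated. This is a minimal sketch under stated assumptions: metrics are plain name-to-number mappings, lower is better unless the metric is listed as higher-is-better, and a 10% worsening is the threshold. The metric names, values, and threshold are all hypothetical.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.10,
    higher_is_better: frozenset[str] = frozenset({"throughput_rps"}),
) -> dict[str, float]:
    """Return metrics that worsened by more than tolerance, as a fraction of baseline."""
    regressions = {}
    for metric, before in baseline.items():
        if before <= 0:
            raise ValueError(f"baseline for {metric} must be positive")
        if metric not in current:
            regressions[metric] = float("inf")  # not measured counts as a failure
            continue
        change = (current[metric] - before) / before
        worsening = -change if metric in higher_is_better else change
        if worsening > tolerance:
            regressions[metric] = round(worsening, 3)
    return regressions


# Hypothetical pre-migration baseline and post-migration measurements.
baseline = {"p95_latency_ms": 120.0, "throughput_rps": 800.0, "batch_job_minutes": 30.0}
after_cutover = {"p95_latency_ms": 150.0, "throughput_rps": 790.0, "batch_job_minutes": 31.0}

print(find_regressions(baseline, after_cutover))  # {'p95_latency_ms': 0.25}
```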
Frequently asked
- How long does an AWS migration take?
- It depends entirely on scope and complexity, so be wary of anyone quoting a fixed timeline before seeing your inventory. A handful of stateless rehosted servers can move in days. A large estate with several production databases, tight downtime constraints, and refactoring work can run for months. The assessment phase exists precisely to turn that uncertainty into a credible schedule.
- Can I really migrate with zero downtime?
- For most workloads you can get to near-zero downtime, often a brief switchover measured in seconds to minutes, by using continuous data replication and gradual traffic shifting so the target is fully ready before users move. Truly zero downtime for a stateful system is hard and expensive, so it is worth being precise about what each workload actually requires rather than over-engineering everything.
- Should I lift and shift first or refactor during the migration?
- For most first-time migrations, rehost the bulk of the estate and replatform the databases that clearly benefit, then defer refactoring until you are stable on AWS. Refactoring mid-migration combines two hard projects into one and multiplies the risk. Move first, optimize second, once you understand how the workloads behave in the cloud.
- What is the riskiest part of an AWS migration?
- Data migration and cutover, not the server lift. Copying a live, changing database without losing transactions and then redirecting production traffic without an outage is where migrations actually go wrong. That is why most of the engineering effort and rehearsal should concentrate on those two steps, and why a tested rollback plan is non-negotiable.
- How do I keep AWS costs under control after migrating?
- Right-size after you have real usage data rather than mirroring oversized on-premises hardware, then commit to predictable demand with Savings Plans or Reserved Instances. Set billing alerts early, tag resources so cost is attributable, and delete the temporary instances and snapshots left over from the migration itself.