AWS Migration Considerations Series: Managing Finances
Evaluating the long-term financial impact of cloud migration: cost savings vs. migration effort.
7 minutes
2nd of July, 2024
One of the biggest driving factors in a cloud migration is cost savings. Many organizations build a business case that has to demonstrate a financial benefit. However, the effort of the migration itself is an overhead that must be weighed against that benefit over a 3- or 5-year period.
The On-Premises Cost Complexity
An existing colocation data center often has costs that are never truly accounted for or attributed to each workload. For example, a typical data center deployment will have routing and switching gear deployed to connect the hardware. The annual maintenance fees on this hardware are often a hidden cost. A storage area network (SAN) that gets replaced after five years is the cost of standing still: a maintenance overhead that does not move your organization forward, yet the lack of that maintenance is a risk.
For the same product, on-premises costs generally trend up over time. Inflation is applied (and accepted) to many of these costs.
Then, generational changes happen. Tape drives get replaced with newer ones, which write a different format of tape but cannot read back more than one generation older. If these generational changes occur more often than every seven years, then multiple generations of tape drive need to remain in active operation to cater for potential restores from old backup media.
The Double-Bubble Costs
Any migration from old infrastructure or services to new, such as a SAN or server migration, is never cost-free. These refresh projects involve project teams and data center deployments that all consume dollars before any benefit is realized.
The overlap in expenditure is often called the double bubble: you are paying for the new infrastructure while not yet using it. That overlap could last days to months. The go-live swap-over may involve downtime, a redeployment, or a massive data-copying effort before the new system takes over. Meanwhile, costs for the existing infrastructure remain while it consumes power, maintenance effort, and rack space.
Often, the prompt for a replacement is purely time. A vendor's support fees usually increase as the equipment ages, and after the fifth year, they are often more expensive than replacing the associated hardware.
If we look at a fictitious total cost of operation for a SAN, its replacement equivalent, the project team, and the other costs to implement a swap-over, then over time the costs look like the graph below.
The project team starts planning, selecting, and ordering equipment before any hardware costs are incurred. Then, the new SAN costs kick in, even though it is not yet live: it is delivered, racked and stacked, powered, burnt in, given initial maintenance, failure-tested, and so on. Next comes a period of data migration in one of several patterns.
Lastly, the project team decommissions, unracks, and disposes of the old SAN. The duration and magnitude of the double bubble may vary. However, all those costs have only got you back to where you started, with no significant competitive advantage for the spend.
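To make the shape of that spend concrete, here is a minimal sketch in Python of how the overlapping costs stack up during a refresh project. Every figure and the timeline below are assumptions for illustration only, not real pricing.

```python
# Illustrative sketch only: all figures are assumptions, not real pricing.
# Models the "double bubble": months where old and new SAN costs overlap.

OLD_SAN_MONTHLY = 10_000       # maintenance, power, rack space for the old SAN
NEW_SAN_MONTHLY = 8_000        # the replacement SAN, paid for from delivery
PROJECT_TEAM_MONTHLY = 15_000  # planning, migration, and decommissioning effort

def migration_cost(project_months: int, overlap_months: int) -> int:
    """Total spend during a refresh project.

    project_months: total duration the project team is engaged.
    overlap_months: months where both SANs are powered and racked.
    """
    team = PROJECT_TEAM_MONTHLY * project_months
    double_bubble = (OLD_SAN_MONTHLY + NEW_SAN_MONTHLY) * overlap_months
    return team + double_bubble

# A 9-month project with a 4-month overlap before the old SAN is decommissioned.
print(migration_cost(project_months=9, overlap_months=4))  # 207000
```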
If we were using cloud-based services, the cost would only be incurred as we start consuming services, not all up-front.
We also see cloud costs more closely matched to the workload's requirements. SANs are often replaced at 75-85% of capacity (given the lead time to procure and the growth rate of storage consumption), with a target run-time utilization of around 40-50%.
Cloud, however, is very happy to run hot at 85% utilization, as the time to provision more is typically instantaneous and, in some cases, automated (see Amazon RDS Storage Auto Scaling, introduced in 2019).
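As an illustration, the ceiling for RDS Storage Auto Scaling can be set with a single API call. This is a hypothetical sketch using boto3; the instance identifier and storage limit are placeholders.

```python
# Hypothetical example: enable RDS Storage Auto Scaling on an existing
# instance by setting a maximum storage ceiling.
import boto3

rds = boto3.client("rds")

rds.modify_db_instance(
    DBInstanceIdentifier="my-database",  # placeholder instance name
    MaxAllocatedStorage=1000,            # allow automatic growth up to 1,000 GiB
    ApplyImmediately=True,
)
```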
Given that the cloud provider manages the physical layer, you do not have to simultaneously replace major operational components that affect or provide service to your entire estate. It's fine-grained. With no large replacement, there is no large project team to do this; these activities become business-as-usual (BAU) work for an operations team.
Costs over Time
On premises, we see that costs trend upward over time: power, colocation space, and wages all track inflation. Yet many costs in the cloud have trended downwards: bandwidth, storage, and compute.
In 2013, the bandwidth cost from some of the largest telecommunication companies in Australia was in the order of dollars per gigabyte transferred. AWS's Sydney region launched at US$0.19/GB at the top tier for data egress. In January 2021, that top-tier price sat at US$0.114/GB, a 40% reduction.
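For the curious, the quoted reduction is simple arithmetic; a quick check in Python using the prices above:

```python
# Worked check of the quoted egress price reduction.
old_price = 0.190   # US$/GB, top data-egress tier at Sydney launch
new_price = 0.114   # US$/GB, January 2021
reduction = (old_price - new_price) / old_price
print(f"{reduction:.0%}")  # 40%
```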
The price drops happened automatically and immediately for all customers without needing action. The lower price just came into effect—no migration project, no renumbering of IP addresses, just cheaper.
The same applies to object storage in S3: costs decrease over time with no customer action required.
Similar results are seen on compute, but with some minor customer effort. Not only does the cost of existing compute sometimes reduce over time, but newer instance families launch with a lower raw cost and increased performance, with the caveat that the customer may need to update the virtual machine's operating system and migrate to the newer instance family over time.
The same is true of block storage: in 2020, gp3 SSD-based block storage launched at 20% cheaper than gp2, but again, customer action is required to adopt the new volume type.
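That action is typically a small, online change. Below is a hypothetical boto3 sketch of converting a single volume in place; the volume ID is a placeholder, and a real migration would loop over the fleet and consider IOPS and throughput settings.

```python
# Hypothetical example: convert an existing gp2 EBS volume to gp3 in place.
import boto3

ec2 = boto3.client("ec2")

ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",  # placeholder volume ID
    VolumeType="gp3",                  # defaults apply for IOPS/throughput unless set
)
```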
Cloud Cost Non-Tangibles
One of the hardest costs to account for is the efficiency improvement of using the cloud well, with well-trained people and a supportive company culture. This is where the history of large capital investment projects conflicts with business agility and the small, incremental changes implemented by teams.
Overestimating requirements was a natural phenomenon in on-premises deployments. The amount of SAN storage purchased now had to last at least the next three years, so you would start paying for it upfront on day one.
The amount of compute deployed would have to meet your busiest day and any surprises beyond that. While over-egging an estimate so that it runs at 20% utilization means there is plenty of headroom, it also means 80% waste and overspending.
Key ingredients in this pattern were long lead times to correct under-provisioning and the cost review and sign-off procedure required within organizations.
With modern DevOps teams taking full lifecycle responsibility for their data, we see a growing trend of giving those teams the cost controls and mechanisms to provision resources in real time, within limits. Those teams understand the required infrastructure better than a procurement team does. Some intelligent KPIs can motivate a team to keep cost-effectiveness a continual focus.
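One example of such a mechanism, assuming a team is given its own monthly spending limit, is an AWS Budgets alert created programmatically. The account ID, budget amount, threshold, and email address below are all placeholders.

```python
# Hypothetical example: a monthly cost budget for a team, alerting at 80% of spend.
import boto3

budgets = boto3.client("budgets")

budgets.create_budget(
    AccountId="123456789012",  # placeholder account ID
    Budget={
        "BudgetName": "team-monthly-cap",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "team@example.com"}
            ],
        }
    ],
)
```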
Managed versus Self-Managed Components
One such consideration is the mix of self-managed versus managed solutions, such as databases. The managed database offerings from AWS come with a slight cost uplift for the same compute and storage, even for those based on free, open-source engines.
However, the ability to uniformly administer a larger fleet with less effort reduces administrator (staff) overhead costs. The reduced effort of configuring and monitoring replication and snapshots is often worth the slight cost increase alone.
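As a sketch of what that looks like in practice, replication and snapshots on RDS reduce to a couple of parameters at creation time. This is a hypothetical boto3 example; all identifiers, sizes, and credentials are placeholders.

```python
# Hypothetical example: a managed PostgreSQL instance where replication
# (Multi-AZ) and snapshots (automated backups) are parameters, not admin tasks.
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="app-db",      # placeholder name
    Engine="postgres",
    DBInstanceClass="db.t3.medium",
    AllocatedStorage=100,               # GiB
    MasterUsername="dbadmin",
    MasterUserPassword="change-me",     # use a secrets store in practice
    MultiAZ=True,                       # synchronous standby replica
    BackupRetentionPeriod=7,            # daily automated snapshots, kept 7 days
)
```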
This pattern repeats throughout a cloud deployment: the more managed services you can leverage, the simpler your problem of maintaining a workload becomes. It is worth paying attention to the residual management that still exists: these solutions do not replace 100% of what your operations teams currently do, but they do make significant inroads.
Staffing Ratios
Looking at staffing overheads, it is worth considering the organization's team structure. Consider a DBA who manages five databases of a particular type, such as Oracle. This DBA works with a Systems Administrator who configures the server fleet, a Storage Administrator who provides the SAN storage, and a Network Administrator who integrates everything. Often, a single DevOps Engineer can replace this entire team and administer many more RDS databases.
With the ability to self-service reliably and a deep understanding of the workload, that one person should be empowered to use all means at their disposal to ensure the workload functions optimally.
Akkodis has been an AWS Consulting Partner since 2013. Learn more about our AWS Practice and services.
By James Bromberger, VP Cloud Computing, Akkodis Australia