Key Points
- AWS cost optimization needs to start with ownership and visibility: Teams cannot reduce costs safely or explain spending without clear tagging, owners, and baselines.
- Eliminate waste before committing to long-term discounts: Shut down unused resources and remove unneeded services before using Savings Plans or Reserved Instances.
- Cost savings should never come at the expense of reliability: Changes to storage, scaling, or data transfer must be tested and monitored to avoid outages and SLA impact.
- Ongoing governance prevents costs from creeping back: Budgets, alerts, reviews, and documented decisions keep optimization effective as environments and usage change over time.
AWS cost optimization best practices are generally well known, and AWS provides a number of tools that help you implement them. However, recklessly applying cost-saving measures can negatively impact the reliability of the services you provide, leading to user dissatisfaction and repercussions for your IT business.
This guide includes advice to help tech teams and managed services providers (MSPs) balance AWS cost optimization with service reliability, with a focus on understanding the value you deliver, establishing ownership of components, and maintaining ongoing AWS cloud governance.
What you need to follow AWS cost optimization recommendations
To effectively control your AWS spend and reduce costs without impacting the service excellence that sustains and grows your business, you’ll need:
- A system of tags assigned to deployed AWS resources that categorizes them, establishes ownership, and enables oversight
- Baseline telemetry for utilization and traffic patterns, as well as any existing service level objectives (SLOs)
- Access to billing data, service metrics, and AWS Cost Explorer
- Regular meetings with stakeholders to review budgets, anomalies, and proposed architectural changes
It is reported that over 30% of IT leaders waste 50% of their cloud spend, making it critical to regularly review your cloud usage and ensure you are not paying for unused services or wasting resources maintaining them.
AWS cost optimization best practices
By applying the methods provided in this guide, you’ll see the following broad benefits for each best practice implemented:
| AWS cost optimization best practice | Purpose | Value delivered |
| --- | --- | --- |
| Mandatory resource tagging | Provides organization, ownership, and accountability | Clear owners and budgets |
| Rightsizing before committing to Savings Plans | Gives you an accurate baseline for cost optimization | Lets you realize potential savings without the risk of committing to long-term plans prematurely |
| Planning your storage lifecycle | Reduces overall data storage and the volume of data stored on more expensive tiers | More predictable storage costs as your requirements scale |
| Traffic locality and egress control | Keeps chatty traffic within the same availability zone or VPC | Reduced costs and fewer surprise bills from busy services |
| Guardrails and approvals/reviews | Prevents costly or reliability-impacting changes from shipping | Savings measures don’t need to be rolled back because they impacted reliability |
Note that these best practices are broadly applicable, but every organization has different legal, business, and customer requirements. You should prioritize service delivery before cost optimization and make sure any best practices and advice you follow are compatible with your business goals and compliance obligations.
Let’s take a closer look:
Use tagging to establish cost ownership
Perform a manual tech stack review of your AWS resources. This helps you identify overlap that can be eliminated to reduce waste without affecting performance or reliability, and it gives you the information you need to tag each resource with an owner, environment, the service it supports, and its cost center or department. Then, map services to budgets and establish ownership of the costs associated with each service.
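As a minimal sketch of what this looks like in practice, the boto3 snippet below applies a hypothetical tag schema (the tag keys, values, and instance ID are placeholders) to an EC2 instance. Note that tags must also be activated as cost allocation tags in the Billing console before they appear in Cost Explorer reports.

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical tag schema and placeholder instance ID;
# align the keys with your own cost-allocation tagging standard.
ec2.create_tags(
    Resources=["i-0123456789abcdef0"],
    Tags=[
        {"Key": "owner", "Value": "platform-team"},
        {"Key": "environment", "Value": "production"},
        {"Key": "service", "Value": "customer-portal"},
        {"Key": "cost-center", "Value": "it-ops"},
    ],
)
```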
Eliminate waste by rightsizing
AWS waste often comes in the form of idle or low-utilization instances and databases, unnecessary load balancers and elastic IPs, and unused EBS volumes and snapshots (for example, stale backups).
Much of this happens unseen as usage patterns change, so ongoing monitoring using Amazon CloudWatch is key to identifying resources that are no longer necessary or are over-provisioned. Once identified, these resources can be deprovisioned, rightsized, or powered down – without impacting the reliability of the associated service you provide to your users.
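As a rough illustration of this kind of monitoring, the following boto3 sketch flags running EC2 instances whose average daily CPU utilization stayed below 5% for two weeks; the threshold and lookback window are arbitrary assumptions you should tune to your own workloads.

```python
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
cw = boto3.client("cloudwatch")

end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

# Pagination omitted for brevity; use a paginator for large fleets.
resp = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)
for reservation in resp["Reservations"]:
    for inst in reservation["Instances"]:
        stats = cw.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": inst["InstanceId"]}],
            StartTime=start,
            EndTime=end,
            Period=86400,            # one datapoint per day
            Statistics=["Average"],
        )
        points = stats["Datapoints"]
        if points and max(p["Average"] for p in points) < 5.0:
            print(f"{inst['InstanceId']}: under 5% CPU for 14 days - review")
```

CPU alone is not proof an instance is idle; check network, disk, and memory metrics before deprovisioning anything.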
Further waste can be eliminated by identifying predictable usage patterns and optimizing for them: for example, you may power down non-production instances outside work hours when they are not expected to be used.
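A simple version of this pattern, assuming a hypothetical environment tag, is a scheduled job (a cron task or Lambda function, for example) that stops tagged non-production instances in the evening:

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical "environment" tag values; adjust to your tagging scheme.
# Run this on an evening schedule; a mirror-image script calls
# start_instances() in the morning.
resp = ec2.describe_instances(
    Filters=[
        {"Name": "tag:environment", "Values": ["dev", "staging"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)
ids = [i["InstanceId"] for r in resp["Reservations"] for i in r["Instances"]]
if ids:
    ec2.stop_instances(InstanceIds=ids)
```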
Optimize backup lifecycles and storage
Amazon S3 object storage provides affordable storage for vast quantities of data, but it can still accumulate unnecessary costs that make a significant impact on IT budgets.
Choosing the right storage tier and leveraging the S3 lifecycle to automatically move data to cheaper archival tiers and delete stale or redundant copies can drastically reduce your long-term cloud spend.
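For example, a lifecycle configuration along the following lines (the bucket name, prefix, and retention windows are illustrative assumptions) transitions backups to cheaper tiers over time and expires them after a year:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket, prefix, and retention windows; tune them to your
# own compliance and recovery requirements before applying.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    },
)
```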
When selecting the storage class for workloads, don’t just pick the default: carefully assess your requirements and pick the most cost-effective option that meets performance and reliability requirements for that specific job, including how snapshots and retention are configured. Within each job, compress and deduplicate your data.
S3 Intelligent-Tiering is an AWS storage class that can assist with this for S3-hosted data: it monitors access patterns and automatically moves objects to the most cost-effective access tier that meets performance expectations.
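If you opt in to Intelligent-Tiering’s archive tiers, a configuration like the sketch below (the bucket name and day thresholds are assumptions) moves objects that haven’t been accessed for 90 and 180 days into the archive and deep-archive access tiers. Objects must be stored in the INTELLIGENT_TIERING storage class, via upload settings or a lifecycle rule, for this to apply.

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket and thresholds; archive tiers are opt-in and add
# retrieval latency, so confirm they fit your access patterns first.
s3.put_bucket_intelligent_tiering_configuration(
    Bucket="example-data-bucket",
    Id="archive-cold-objects",
    IntelligentTieringConfiguration={
        "Id": "archive-cold-objects",
        "Status": "Enabled",
        "Tierings": [
            {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
            {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    },
)
```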
Reduce data transfer and egress fees
Data transfer and egress fees are often overlooked when planning AWS budgets. Chatty services should be kept within the same availability zone or VPC to reduce these costs. Egress fees can be reduced by optimizing your workloads to lower the volume or frequency of data transferred to and from AWS over the internet, and further reduced by using caching and CDNs for public-facing services.
Cross-region data transfer should be minimized, keeping in mind that cross-region replication may be necessary for RPO/RTO or compliance reasons.
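One way to see where transfer charges are coming from is to query Cost Explorer grouped by usage type and filter for data-transfer items, as in this rough boto3 sketch (the 30-day window and string match are assumptions; the Cost Explorer API is served from us-east-1):

```python
import boto3
from datetime import date, timedelta

ce = boto3.client("ce", region_name="us-east-1")

end = date.today()
start = end - timedelta(days=30)

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
)
for result in resp["ResultsByTime"]:
    for group in result["Groups"]:
        usage_type = group["Keys"][0]
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        # Usage types such as "USE1-DataTransfer-Out-Bytes" identify egress.
        if "DataTransfer" in usage_type and amount > 0:
            print(f"{usage_type}: ${amount:.2f}")
```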
Tune your cloud architecture for efficient scaling
The usefulness of AWS Cost Explorer cannot be overstated here: once your services are running reliably, you can use it to find out exactly which AWS components you are being charged for, set baselines, and begin optimizing. Combined with tagging, the reports generated in Cost Explorer help make any reliability or performance tradeoffs explicit and show where the biggest savings can be made.
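As a sketch of what that looks like programmatically, the query below breaks down the last 30 days of spend by a hypothetical "service" cost allocation tag (the tag must be activated for cost allocation before it appears here):

```python
import boto3
from datetime import date, timedelta

ce = boto3.client("ce", region_name="us-east-1")

end = date.today()
start = end - timedelta(days=30)

# "service" is a hypothetical cost-allocation tag key.
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "service"}],
)
for result in resp["ResultsByTime"]:
    for group in result["Groups"]:
        tag_value = group["Keys"][0]      # e.g. "service$customer-portal"
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{tag_value}: ${amount:.2f}")
```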
For services with unpredictable, spiky usage patterns, autoscaling can help reduce costs. Serverless, event-driven functions can also further reduce costs as you only pay for them while they are being used.
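For example, a target-tracking policy like the one below (the group name and 50% CPU target are assumptions) lets an EC2 Auto Scaling group add capacity during spikes and shed it when demand drops:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Hypothetical Auto Scaling group and target; target tracking scales the
# group to keep average CPU utilization near the target value.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="target-cpu-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```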
Load testing will ensure that reliability is not affected by architectural changes.
Commit to pricing models carefully
AWS Savings Plans and Reserved Instances can provide huge savings over on-demand instances; however, to get the best savings, you need to commit to them long-term. This means you are locked into a certain amount (and type) of compute, or a specific instance size, for the duration of the plan. Purchase a plan only once steady usage patterns have been established; this prevents long-term over- or under-provisioning that would force you to give up either cost optimization or reliability.
When Savings Plans or Reserved Instances expire, review whether they need to be adjusted before re-committing.
Spot Instances are another AWS feature that can greatly reduce costs: these use ‘spare’ compute in the AWS platform and can be up to 90% cheaper than on-demand instances. However, if AWS needs the capacity back, your instance is shut down with only two minutes’ notice, making Spot suitable only for stateless, fault-tolerant workloads or workloads with built-in checkpointing that do not rely on instances persisting.
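Requesting Spot capacity can be as simple as adding market options to a normal instance launch, as in this sketch (the AMI ID and instance type are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Placeholder AMI and instance type. Spot capacity can be reclaimed by
# AWS with a two-minute warning, so only run interruption-tolerant work.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
)
```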
Govern and lower AWS cost of ownership with budgets, guardrails, and evidence
AWS lets you set alerts for costs, usage, utilization, anomalies, and more. Ensure these alerts are routed to the owner of the associated service so that they can assess whether the service’s reliability is being impacted by cost-saving measures or whether new optimizations have been identified.
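A minimal example of such an alert, assuming placeholder account details, budget amount, and recipient, is an AWS Budgets notification that emails a service owner at 80% of a monthly cost budget:

```python
import boto3

budgets = boto3.client("budgets")

# Placeholder account ID, limit, and address; route the notification to
# the owner of the tagged service so they can act on it.
budgets.create_budget(
    AccountId="123456789012",
    Budget={
        "BudgetName": "monthly-aws-spend",
        "BudgetLimit": {"Amount": "5000", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "owner@example.com"}
            ],
        }
    ],
)
```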
Implement policies that deter the use of costly defaults, and regularly review cost data to identify potential AWS cost optimizations. Automation can collect this information, format it, generate recommendations, and publish it for review.
IT cost optimization and AWS cloud governance
Keeping AWS costs in check requires constant oversight that can become impractical and easily fall by the wayside as other, more urgent IT management and support tasks come up. If AWS cost management becomes a blind spot, bills can quickly rise, and if this is not addressed promptly, inefficient cost baselines and expectations can become entrenched.
If stakeholders discover that AWS cloud governance duties were neglected and that money and resources were spent buying and maintaining unused AWS infrastructure, they will demand accountability or may seek alternative service providers. This makes it critical to include AWS oversight in your IT governance plan.
How to keep AWS costs down with ongoing automation powered by NinjaOne
NinjaOne unifies all of your cross-cloud and on-premises resources under one management platform. Remote monitoring and management, endpoint management, on-site and SaaS backup, and documentation all come together in a single web interface with built-in automation.
This can assist with AWS cost optimization and ongoing AWS cloud governance, giving you tools to script the ingestion of cost and utilization data, flag idle resources, and propose changes by raising and escalating tickets. Reports can be automatically generated that estimate savings and integrate change records, which are then stored in secure cloud-based documentation for ready access and review.
