Q1 is the best time to run a comprehensive cloud cost audit: budgets are set, teams are staffed, and waste eliminated now stays eliminated for the rest of the year. This checklist covers all 47 optimization actions we recommend completing before Q2 kicks off, organized by category with estimated savings ranges for each.

We've built this from real analysis of hundreds of AWS accounts at Cloud Hero AI. Every item on this list has generated real savings for real teams. Not every item applies to every environment, but if you go through all 47 and find 20–30 that apply to you, you'll capture the majority of your optimization opportunity.

  • 47 optimization actions in this checklist
  • 25–40% typical bill reduction after completing the full checklist
  • 30 days to implement and see full savings

How to use this checklist: Work through each section systematically. Mark items that apply to your environment and create Jira/Linear tickets for each. Prioritize by estimated savings — the items with the highest dollar impact should be tackled first. Complete each item before moving to the next within a section, as some optimizations depend on others (e.g., rightsize before purchasing Reserved Instances).

Compute (12 Items) Potential: 15–30% of total bill

  • 1. Identify and terminate all stopped EC2 instances — Stopped instances don't charge for compute but EBS volumes keep billing. Audit with aws ec2 describe-instances --filters Name=instance-state-name,Values=stopped. For each: take a final AMI, terminate the instance. Savings: $50–$500/month per cluster of stopped instances.
  • 2. Run EC2 rightsizing analysis in Cost Explorer — Navigate to Cost Explorer → Recommendations → Rightsizing. Review all flagged instances. Any instance with <20% average CPU utilization is a candidate for one size down. Savings: 30–50% on downsized instances.
  • 3. Evaluate Graviton migration for top 5 instance families — AWS Graviton3 (m7g, c7g, r7g) delivers 20% better price-performance than x86 equivalents. Identify your top EC2 instance families by spend and test Graviton for each. Savings: 15–20% per migrated instance family.
  • 4. Implement Auto Scaling for web/API tiers — Any EC2 fleet serving variable traffic should be behind an Auto Scaling Group. Manual scaling to peak capacity 24/7 wastes 30–60% of compute on off-peak hours. Savings: 20–40% on applicable instance groups.
  • 5. Enable Delete-on-Termination in EC2 block device mappings — Root volumes are deleted on termination by default, but additional data volumes are not. Set DeleteOnTermination=true in launch template block device mappings for any volume that shouldn't outlive its instance. Prevents future orphaned volume accumulation at the source. Savings: Prevents future waste; typically $50–$300/month ongoing.
  • 6. Review EC2 instance families for generation currency — Running m4 or m5 instances? The m7g (Graviton3) or m6i/m7i are faster and cheaper. Old generation instances can cost 10–20% more than current generation equivalents. Savings: 10–20% per upgraded instance.
  • 7. Audit Lambda function memory configurations — Lambda charges per GB-second. Many Lambda functions have more memory allocated than needed. Use AWS Lambda Power Tuning (open source) to find the optimal memory configuration. Savings: 10–40% on Lambda spend for over-provisioned functions.
  • 8. Evaluate Fargate Spot for non-critical ECS tasks — Fargate Spot is up to 70% cheaper than standard Fargate. Background jobs, batch processing, and dev/staging ECS tasks can typically run on Spot with minor configuration changes. Savings: up to 70% on eligible Fargate tasks.
  • 9. Schedule non-production EC2 instances to stop nights and weekends — Dev and staging environments don't need to run 24/7. A simple scheduler that stops instances at 7pm and starts them at 8am on weekdays runs them only 55 of 168 hours per week, cutting those instances' compute cost by about 67%. Savings: ~65–70% on non-production compute; tighter schedules save more.
  • 10. Review Spot Instance usage for batch and stateless workloads — On-demand instances for batch processing, data pipelines, or stateless API workers are overpaying by 60–80%. Evaluate Spot for these workloads with appropriate fault tolerance. Savings: 60–80% on eligible workloads.
  • 11. Identify and terminate zombie EC2 instances with no traffic — EC2 instances with zero inbound traffic for 30+ days via CloudWatch NetworkIn metric are zombie candidates. These are running, not stopped, but serving no purpose. Savings: 100% of those instances' hourly cost.
  • 12. Check for oversized launch template defaults — If your Auto Scaling Groups have a launch template with an oversized instance type, every new instance is over-provisioned by default. Audit ASG launch templates and update them to reflect your rightsizing analysis. Savings: 20–50% on ASG-launched instances.
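The scheduler savings in item 9 are plain hours-on versus hours-in-the-week arithmetic. A minimal sketch (illustrative only; actual EC2 billing is per-second with a 60-second minimum, and EBS volumes keep billing while instances are stopped):

```python
# Fraction of 24/7 compute cost saved by running instances only on a schedule.

def schedule_savings(hours_per_day: float, days_per_week: int) -> float:
    """Return the fraction of always-on cost saved by the given schedule."""
    hours_on = hours_per_day * days_per_week
    return 1 - hours_on / (24 * 7)

# 8am-7pm on weekdays = 11 hours/day, 5 days/week = 55 of 168 hours
print(f"{schedule_savings(11, 5):.1%}")  # 67.3%
```

An 8am–7pm weekday schedule eliminates roughly two-thirds of the 24/7 compute cost; a tighter 9-to-5 schedule saves more.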

Storage (8 Items) Potential: 5–12% of total bill

  • 13. Find and delete all unattached EBS volumes — Run aws ec2 describe-volumes --filters Name=status,Values=available. Snapshot any you're unsure about, delete the rest. Savings: $0.08–$0.125/GB/month per deleted volume.
  • 14. Migrate all gp2 EBS volumes to gp3 — gp3 is 20% cheaper than gp2 at the same size and provides better baseline performance. Zero-downtime migration via the AWS console or CLI. Savings: 20% on all migrated EBS storage.
  • 15. Implement EBS Snapshot Data Lifecycle Manager policies — Set automated retention policies for all EBS snapshots. Most teams need 7–30 days for operational recovery and 90 days for compliance. Anything beyond without a policy wastes money. Savings: $50–$2,000+/month depending on snapshot accumulation.
  • 16. Archive old snapshots to EBS Snapshot Archive — Snapshots you must retain (compliance) but rarely access can be moved to EBS Snapshot Archive at 75% cost reduction. Savings: 75% on archived snapshots vs standard snapshot storage.
  • 17. Implement S3 lifecycle policies on all buckets — Transition objects to Standard-IA after 30 days, Glacier Instant Retrieval after 90 days. For compliance archives: Glacier Deep Archive after 365 days. Savings: 50–95% on transitioned objects vs S3 Standard.
  • 18. Enable S3 Intelligent-Tiering for new buckets — For buckets with unpredictable access patterns, Intelligent-Tiering automatically moves objects between tiers. The $0.0025 per 1,000 objects monitoring fee is charged per object, so it pays for itself quickly on buckets dominated by larger objects (objects under 128 KB aren't monitored and stay at frequent-access rates). Savings: 15–40% on eligible object storage.
  • 19. Review and remove S3 multipart upload incomplete parts — Failed or incomplete multipart uploads accumulate silently. Enable S3 lifecycle rules to abort incomplete multipart uploads after 7 days. Savings: Variable; can eliminate hidden storage costs entirely.
  • 20. Identify S3 buckets with versioning and old versions — S3 versioning keeps every previous version of every object. Old versions accumulate indefinitely without lifecycle policies. Add lifecycle rules to delete old versions after 30–90 days. Savings: 20–80% on over-versioned buckets.
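To see why the lifecycle transitions in items 17–18 matter, compare per-class storage costs. A sketch using illustrative us-east-1 list prices (check the S3 pricing page for your region; retrieval and request fees are ignored here):

```python
# Rough monthly storage cost per S3 storage class, $/GB-month.
# Illustrative us-east-1 list prices; verify against the S3 pricing page.
PRICE_PER_GB = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER_IR": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}

def monthly_cost(gb: float, storage_class: str) -> float:
    return gb * PRICE_PER_GB[storage_class]

# A 10 TB bucket entirely in Standard vs. mostly transitioned by lifecycle rules:
standard = monthly_cost(10_000, "STANDARD")
tiered = (monthly_cost(1_000, "STANDARD")
          + monthly_cost(3_000, "STANDARD_IA")
          + monthly_cost(6_000, "GLACIER_IR"))
print(f"${standard:.2f} -> ${tiered:.2f}")  # $230.00 -> $84.50
```

Even this moderate tiering mix cuts the bucket's storage cost by roughly 63%, inside the 50–95% range quoted in item 17.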

Database (7 Items) Potential: 8–20% of total bill

  • 21. Disable Multi-AZ on dev and staging RDS instances — Multi-AZ doubles RDS cost. Dev and staging databases almost never need it. Check all RDS instances and disable Multi-AZ for non-production. Savings: 50% on eligible RDS instances (dev/staging only).
  • 22. Rightsize RDS instance classes — Pull CloudWatch metrics for CPU and FreeableMemory over 30 days for each RDS instance. Any database consistently at <20% CPU and >50% FreeableMemory is oversized. Savings: 30–60% per downsized RDS instance.
  • 23. Migrate RDS gp2 storage to gp3 — Same 20% savings as EC2 EBS, now applied to RDS storage. Zero-downtime migration available via console modify with "Apply immediately". Savings: 20% on all RDS storage.
  • 24. Review RDS automated backup retention periods — The default is 7 days for instances created via the console (1 day via the CLI/API), but many databases have it set to 35 days. Every additional day of automated backups increases snapshot storage costs. Audit and reduce to your actual recovery point objective. Savings: Snapshot storage reduction proportional to retention reduction.
  • 25. Identify idle RDS instances with no connections — An RDS instance with zero database connections in 30 days is effectively unused. Use CloudWatch's DatabaseConnections metric to find them. Savings: 100% of that instance's hourly rate.
  • 26. Evaluate Aurora Serverless v2 for variable database workloads — Applications with highly variable database load (peaks 10x average) are strong candidates for Aurora Serverless v2, which scales down to near-zero during idle periods. Savings: 40–70% vs provisioned Aurora for spiky workloads.
  • 27. Review ElastiCache cluster sizing and multi-AZ — Apply the same rightsizing and multi-AZ discipline to ElastiCache as to RDS. ElastiCache dev/staging clusters rarely need multi-AZ or production-scale node types. Savings: 30–60% on non-production ElastiCache.
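The rightsizing rule in item 22 is easy to encode once you have the metric averages. A minimal sketch, assuming you've already pulled 30-day averages of CPUUtilization and FreeableMemory from CloudWatch (the function name and thresholds are just this checklist's rule of thumb, not an AWS API):

```python
# Item-22 rule: flag an RDS instance as a downsize candidate when 30-day
# average CPU is under 20% and freeable memory stays above 50% of the
# instance's RAM. Inputs are assumed to come from CloudWatch.

def downsize_candidate(avg_cpu_pct: float, avg_freeable_bytes: float,
                       instance_ram_bytes: float) -> bool:
    return avg_cpu_pct < 20 and avg_freeable_bytes / instance_ram_bytes > 0.5

GIB = 1024 ** 3
# A 32 GiB instance at 12% avg CPU with 20 GiB freeable memory is oversized:
print(downsize_candidate(12, 20 * GIB, 32 * GIB))   # True
# The same instance at 55% avg CPU is not:
print(downsize_candidate(55, 20 * GIB, 32 * GIB))   # False
```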

Networking (6 Items) Potential: 3–8% of total bill

  • 28. Release all unassociated Elastic IP addresses — Run aws ec2 describe-addresses --query 'Addresses[?AssociationId==null]'. Release every unassociated EIP. Savings: $3.60/month per released EIP.
  • 29. Delete idle NAT Gateways — Check CloudWatch BytesOutToDestination for each NAT Gateway. Zero or minimal traffic over 14 days = idle gateway. Savings: $32.40/month per deleted NAT Gateway plus data processing charges.
  • 30. Deploy VPC Endpoints for S3, DynamoDB, and ECR — Traffic to S3 and DynamoDB through NAT Gateways is charged at $0.045/GB processing. VPC endpoints route this traffic within the AWS network at no charge. Savings: Highly variable; $100–$5,000+/month for data-intensive workloads.
  • 31. Review cross-AZ data transfer patterns — Microservices calling each other across Availability Zones pay $0.01/GB each way. Use CloudWatch metrics to identify high cross-AZ talkers and evaluate co-locating them. Savings: Variable; significant for data-intensive cross-AZ flows.
  • 32. Delete unused load balancers — Find ALBs and NLBs with no healthy targets (or with zero RequestCount for 14 days). Delete idle load balancers. Savings: $16–$18/month base + data processing per deleted LB.
  • 33. Review CloudFront distribution costs and cache hit rates — CloudFront distributions with low cache hit rates (under 80%) mean you're paying for origin data transfer that could be cached. Tune cache behaviors to improve hit rates. Savings: 20–50% on origin transfer costs for improved caching.
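Items 29–30 come down to NAT Gateway arithmetic: a fixed hourly charge plus a per-GB processing fee that a gateway VPC endpoint eliminates for S3 and DynamoDB traffic. A back-of-envelope sketch with illustrative us-east-1 prices:

```python
# Monthly cost of routing traffic through a NAT Gateway. Illustrative
# us-east-1 list prices; gateway VPC endpoints for S3 and DynamoDB carry
# no hourly or per-GB charge, so endpoint routing removes the per-GB term.
NAT_HOURLY = 0.045        # $/hour per NAT Gateway
NAT_PER_GB = 0.045        # $/GB processed
HOURS_PER_MONTH = 720

def nat_monthly_cost(gb_processed: float) -> float:
    return NAT_HOURLY * HOURS_PER_MONTH + NAT_PER_GB * gb_processed

# 2 TB/month of S3 traffic through the NAT Gateway:
print(f"${nat_monthly_cost(2000):.2f}")  # $122.40
```

Of that $122.40, $90 is data processing that a free S3 gateway endpoint would avoid entirely; the $32.40 hourly base only goes away if the gateway itself is idle and deleted.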

Reserved/Committed Spend (6 Items) Potential: 20–40% on covered spend

  • 34. Purchase a 1-year Compute Savings Plan for baseline EC2 spend — After completing rightsizing (items 2–3), buy a Compute Savings Plan covering 70–80% of your baseline EC2 spend. Do this AFTER rightsizing so you're committing to the right amount. Savings: 33–40% on covered spend.
  • 35. Buy RDS Reserved Instances for stable production databases — Any RDS instance that's been running stably for 6+ months at its current size is a RI candidate. Purchase 1-year No Upfront to start. Savings: 40% vs on-demand for 1-year No Upfront RIs.
  • 36. Review Savings Plan utilization in Cost Explorer — Navigate to Savings Plans → Utilization. Any existing Savings Plan below 80% utilization is over-committed. Understand why and adjust future purchases accordingly. Impact: Prevents future waste from over-committed Savings Plans.
  • 37. Set Savings Plan expiration calendar reminders — Savings Plans and RIs expire at end of term. Without renewal, you revert to on-demand pricing overnight. Set 60-day and 30-day reminders before every expiration. Impact: Prevents costly lapses in coverage.
  • 38. Evaluate ElastiCache and OpenSearch Reserved Instances — Apply the same RI discipline to non-EC2 services. ElastiCache reserved nodes save up to 40% on 1-year No Upfront terms and up to 55% on 3-year terms. Savings: 40–55% on covered service spend.
  • 39. Consider RI Marketplace for over-committed Standard RIs — If you have Standard RIs that are consistently under-utilized (below 50%), list them on the AWS RI Marketplace. You'll typically recover 80–95% of remaining value. Impact: Recovers stranded commitment value.
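The item-34 sizing logic can be sanity-checked with simple arithmetic: commit a Savings Plan to roughly 75% of your stable baseline, leave the variable remainder on demand, and estimate the blended saving. The 33% discount below is an assumed 1-year No Upfront rate; your actual rate depends on term, payment option, and instance mix:

```python
# Blended savings from partial Savings Plan coverage. The discount rate is
# an assumption for illustration, not a quoted AWS price.

def blended_savings(monthly_on_demand: float, coverage: float = 0.75,
                    sp_discount: float = 0.33) -> tuple[float, float]:
    covered = monthly_on_demand * coverage
    commitment = covered * (1 - sp_discount)        # what you actually pay for covered usage
    new_total = commitment + monthly_on_demand * (1 - coverage)
    return new_total, 1 - new_total / monthly_on_demand

total, pct = blended_savings(40_000)
print(f"${total:,.0f}/month, {pct:.1%} saved")  # $30,100/month, 24.8% saved
```

Note that committing to 100% of a spiky baseline would push utilization below 100% and erode the saving, which is why item 36's utilization review matters.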

Governance & Tagging (5 Items) Potential: Enables ongoing savings

  • 40. Audit tag coverage across all resources — Use AWS Resource Groups Tag Editor to find all untagged resources. Set minimum required tags: Environment, Owner, CostCenter, and Project. Resources without tags are invisible to cost allocation and chargeback. Impact: Enables accurate cost attribution and accountability.
  • 41. Create AWS Budgets for every account and key service — Set budgets for: total account spend, EC2 spend, RDS spend, and data transfer. Configure alerts at 80% and 100% of budget. This is the simplest early-warning system for cost anomalies. Impact: Catch overspends before they compound.
  • 42. Enable Cost Anomaly Detection monitors — In Cost Explorer, create anomaly monitors for your top 3–5 services by spend. Configure SNS/email alerts for anomalies above $50. Anomaly detection catches cost spikes you'd miss with static budget alerts. Impact: Reduces MTTR for cost anomalies from days to hours.
  • 43. Implement AWS Service Control Policies (SCPs) for cost guardrails — In AWS Organizations, use SCPs to prevent provisioning of expensive instance types without approval, block resource creation in unapproved regions, and require tags on all resource creation. Impact: Prevents future waste at the policy level.
  • 44. Set up monthly cost review cadence — Schedule a 30-minute monthly cost review with your engineering team. Review: top 5 services by spend change, rightsizing recommendations, and RI/SP coverage. Costs only stay optimized if someone is looking at them regularly. Impact: Ongoing — prevents drift back to wasteful patterns.
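The alert thresholds in item 41 are configured in AWS Budgets itself; this sketch just makes the 80%/100% logic explicit so it's clear what you're setting up:

```python
# The threshold logic behind item 41's budget alerts: warn at 80% of budget,
# alert on breach at 100%. AWS Budgets evaluates this for you; the sketch
# only illustrates the rule.

def budget_alerts(actual_spend: float, budget: float) -> list[str]:
    alerts = []
    if actual_spend >= budget:
        alerts.append("BREACH: spend at or over budget")
    elif actual_spend >= 0.8 * budget:
        alerts.append("WARNING: spend past 80% of budget")
    return alerts

print(budget_alerts(8_500, 10_000))   # ['WARNING: spend past 80% of budget']
print(budget_alerts(10_200, 10_000))  # ['BREACH: spend at or over budget']
```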

Tooling (3 Items) Potential: Enables 2–5x faster optimization

  • 45. Enable AWS Cost Explorer and configure saved views — If you haven't enabled Cost Explorer yet, do it today. Create saved views for: EC2 spend by instance type, storage by service, and spend by environment tag. See our complete Cost Explorer guide for setup instructions. Impact: Baseline visibility required for all other optimizations.
  • 46. Deploy Kubecost for Kubernetes cost visibility — If you're running EKS or other Kubernetes workloads, install Kubecost (free community edition) to get namespace and workload-level cost attribution. Combined with VPA recommendations (see our K8s cost guide), this typically surfaces 30–40% savings opportunities in K8s environments. Savings: 30–40% on Kubernetes spend via rightsizing.
  • 47. Connect a dedicated FinOps platform for automated scanning — For teams spending $10K+/month, a dedicated FinOps tool pays for itself immediately. Cloud Hero AI's Hero Savings connects to your AWS account via read-only IAM role and surfaces prioritized savings opportunities with one-click remediation — in minutes, not days. Savings: 3–10x ROI on tool cost from first-month findings.

Let Hero Savings Run This Checklist For You

Instead of manually working through all 47 items, connect your AWS account to Hero Savings and let our AI surface your specific opportunities — with exact dollar values, prioritized by impact, and one-click remediation where available.

Get My Free Savings Report →

Frequently Asked Questions

How long does it take to complete this entire checklist?
For a single AWS account with one engineer focused on it: 2–4 weeks to complete all applicable items, depending on complexity. The quick wins (EIPs, orphaned volumes, gp2→gp3) take a few hours each. Rightsizing requires 1–2 weeks of monitoring before acting. Savings Plan purchases require analysis but can be done in a day once you have the data. Governance items (tagging, SCPs) are the most time-intensive if starting from scratch. Use a project management tool to track progress and don't try to do everything in a single sprint.
Which items on this checklist give the fastest ROI?
In order of speed + impact: (1) Delete orphaned EBS volumes — takes 1 hour, shows up on next bill; (2) Disable Multi-AZ on dev/staging RDS — 30 minutes, immediately halves those RDS costs; (3) Release unused Elastic IPs — 15 minutes; (4) Buy a Compute Savings Plan — 1 hour of analysis, 33–40% savings on covered spend immediately. These four actions alone typically save $2,000–$20,000/month for teams spending $50K+/month on AWS.
Should I rightsize before or after buying Reserved Instances?
Always rightsize first, then commit. If you buy Reserved Instances for your current (oversized) fleet and then downsize, you're stuck with RI commitments that don't match your new instance sizes. The correct order: (1) identify and implement rightsizing opportunities, (2) let the rightsized fleet run for 2–4 weeks to confirm stability, (3) analyze your stable baseline usage, (4) purchase Savings Plans or Reserved Instances based on that baseline. This sequence ensures you commit to the right amount at the right size.
What if I'm not sure whether a resource is safe to delete?
The rule of zero activity: if a resource has had zero meaningful activity (zero network traffic, zero API calls, zero CloudWatch metrics activity) for 30+ days, it's almost certainly safe to delete. Before deleting: (1) check if it's referenced in DNS, load balancer configs, or application settings; (2) snapshot or archive it if there's any doubt; (3) tag it as "marked-for-deletion" and wait 7 days before deleting — if anyone objects, you still have the resource. This "soft delete" approach eliminates the risk of deleting something important while still eliminating the cost.
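One way to implement the soft-delete workflow above is a deletion-date tag plus a check that the grace period has passed. A minimal sketch; the tag key and date format are conventions invented here, not an AWS standard:

```python
# Soft delete: tag the resource with a deletion date 7 days out, and only
# delete once that date has passed. Tag key/value shape matches the
# {"Key": ..., "Value": ...} structure AWS tagging APIs use.
from datetime import date, timedelta

def deletion_tag(today: date, grace_days: int = 7) -> dict:
    return {"Key": "marked-for-deletion",
            "Value": (today + timedelta(days=grace_days)).isoformat()}

def safe_to_delete(tag: dict, today: date) -> bool:
    return today >= date.fromisoformat(tag["Value"])

tag = deletion_tag(date(2025, 1, 6))
print(tag["Value"])                            # 2025-01-13
print(safe_to_delete(tag, date(2025, 1, 10)))  # False
print(safe_to_delete(tag, date(2025, 1, 13)))  # True
```

A scheduled job (or your FinOps tool) can then sweep for resources whose tag date has passed and delete them, giving anyone a full week to object.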
How do I convince my team to prioritize cloud cost optimization?
Frame it in terms that resonate with your audience. For engineers: "we could hire two more engineers with the money we're wasting on idle EC2 instances." For finance: "our cloud bill has grown 45% year-over-year while our usage has grown 15% — that gap is pure waste we can recover." For executives: "we're spending $X/month on cloud and our analysis shows $Y/month is waste — here's the 30-day plan to recover it." Attaching specific dollar amounts to specific resources (not just percentages) is the most persuasive approach. Run a quick scan with Hero Savings and bring a real dollar figure to the conversation.