‘Shift-left’ reduces cloud cost by 50% for a Fortune 500 insurance firm

Cloud adoption continues to accelerate, with 36% of enterprises spending more than $12 million per year on public clouds, and businesses are looking for ways to optimize that spend. According to the Flexera 2021 State of the Cloud Report, organizations waste 30% of their cloud spend, making cost control the top challenge in public cloud adoption.

Source: IDC Cloud Survey 2020.

Gartner predicts that through 2024, nearly all legacy applications migrated to public cloud infrastructure as a service (IaaS) will require optimization to become more cost-effective. This blog details how we helped a Fortune Global 500 insurance brokerage and risk management company halve their cloud cost by optimizing their spend.

A few months after implementing an AWS-based data lake for a Fortune 500 firm, we discovered that their in-house IT team was unable to control cloud consumption costs. They were using AWS Cost Explorer, which monitored cloud costs for different teams in silos, without optimizing usage.

This realization prompted us to explore why costs spiraled as data volumes grew. To optimize cost, we adopted a “shift-left” approach, a practice widely used for maintaining code quality in which checks move earlier in the development cycle. Our focus was on giving the Dev teams, the primary consumers of cloud services and resources, more visibility into the costs of their own usage. This, in turn, would make them more accountable and enable them to monitor costs effectively.

To optimize cloud cost, the Impetus DevOps team:

  1. Defined a tagging policy and guidelines for AWS resources, with tag keys such as application, business unit, and environment
  2. Updated the automation scripts to ensure all created resources were tagged
  3. Created custom rules to scan all resources and report untagged resources to management
  4. Created a tagging strategy for various services to filter resources for cost analysis.

    The pie chart below highlights the cost distribution between tagged and untagged resources. Because most of the cost was incurred by tagged resources, we could identify the teams and applications responsible for it, which helped avoid resource wastage.

    Cost distribution of tagged vs non-tagged resources
  5. Created alerts for services reaching or exceeding a defined threshold
  6. Enabled automated termination of long-running EMR clusters
  7. Enabled termination of idle EMR clusters on weekends
  8. Compared historical price patterns and peaks with existing usage to effectively plan an auto-scaling mechanism
  9. Reported idle/stopped resources and security vulnerabilities
  10. Created a tagging solution for all services and components
  11. Created dashboards for billing and custom dashboards for tagged resources
  12. Designed a strategy to gather information from all accounts in a single billing dashboard on AWS. This helped to compare consumption across Dev, Stage, Prod, and Labs accounts. The report was further analyzed to optimize Dev spending and formulate stricter policies on resource provisioning and termination
  13. Implemented lifecycle rules on Amazon S3 buckets to automatically move old data to lower storage tiers, resulting in reduced storage costs

    The following graph highlights the cost reduction after implementing S3 policies.

    Cost reduction using S3 lifecycle management
  14. Used spot instances for EMR core nodes, which are up to 50% cheaper than on-demand instances
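
As a rough illustration of the tagging checks in steps 1 to 4, the sketch below flags resources that lack required tags and splits cost between compliant and non-compliant resources. It is not the team's actual tooling: the `Resource` record, the tag keys in `REQUIRED_TAGS`, and the sample inventory are hypothetical, and in practice the records would be pulled from AWS (for example, through the Resource Groups Tagging API) rather than hard-coded.

```python
from dataclasses import dataclass, field

# Hypothetical required tag keys, modeled on the policy described above.
REQUIRED_TAGS = ("application", "business_unit", "environment")


@dataclass
class Resource:
    """Simplified resource record (assumed shape, not an AWS data type)."""
    resource_id: str
    monthly_cost: float
    tags: dict = field(default_factory=dict)


def missing_tags(resource, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or empty on a resource."""
    return [key for key in required if not (resource.tags.get(key) or "").strip()]


def tagging_report(resources, required=REQUIRED_TAGS):
    """List non-compliant resources and total the cost of each group."""
    report = {"non_compliant": {}, "compliant_cost": 0.0, "non_compliant_cost": 0.0}
    for res in resources:
        missing = missing_tags(res, required)
        if missing:
            report["non_compliant"][res.resource_id] = missing
            report["non_compliant_cost"] += res.monthly_cost
        else:
            report["compliant_cost"] += res.monthly_cost
    return report


# Illustrative inventory; IDs, costs, and tag values are made up.
inventory = [
    Resource("emr-cluster-1", 900.0,
             {"application": "datalake", "business_unit": "risk", "environment": "dev"}),
    Resource("s3-bucket-raw", 250.0, {"application": "datalake"}),
    Resource("ec2-adhoc", 100.0),
]
report = tagging_report(inventory)
for resource_id, missing in report["non_compliant"].items():
    print(f"{resource_id}: missing tags {missing}")
```

A report like this can be sent to management on a schedule, and the two cost totals give the same tagged versus untagged split that a pie chart would show.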

The customized cost analysis solution, with built-in controls, gave Dev teams specific access to continuously monitor and track their cloud costs and take corrective action based on their own consumption. The solution enabled the team to:

  1. Connect with AWS, Azure, and GCP accounts to collect and analyze cloud costs in one place
  2. Get quick insights categorized by region, services, tags, and more
  3. Simplify workflows to manage the resource tags and maintain tag hygiene for accurate cost analysis
  4. Get rule-based alerts for any spikes and recommendations to improve cost-efficiency
  5. Reduce multi-cloud wastage by analyzing spend against budgets and forecasting costs and usage
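
To make the rule-based alerts and budget checks in the last two points more concrete, here is a minimal sketch of one possible rule set. It is an assumption, not the solution's actual logic: the function names, the 7-day window, the 1.5x threshold, and the linear month-end projection are all illustrative choices.

```python
from statistics import mean


def detect_spikes(daily_costs, window=7, threshold=1.5):
    """Flag days whose cost exceeds `threshold` times the mean of the
    previous `window` days. Returns (day_index, cost, baseline) tuples."""
    alerts = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            alerts.append((i, daily_costs[i], baseline))
    return alerts


def budget_status(month_to_date, days_elapsed, days_in_month, budget):
    """Project month-end spend linearly and compare it with the budget."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    projected = month_to_date / days_elapsed * days_in_month
    return {"projected": projected, "over_budget": projected > budget}


# Illustrative numbers only: a steady 100 per day with one 180 spike.
costs = [100] * 8 + [180, 100]
for day, cost, baseline in detect_spikes(costs):
    print(f"day {day}: cost {cost} vs. baseline {baseline:.0f}")

print(budget_status(month_to_date=1080, days_elapsed=10, days_in_month=30, budget=3000))
```

A trailing-average rule like this is simple to explain to Dev teams, which matters when they are the ones expected to act on the alert; a production setup would likely tune the window and threshold per service.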

The solution helped the Fortune 500 insurance firm reduce their monthly cloud expenditure from $120,000 to $60,000.

Dashboard of the customized cost analysis solution

Monitoring cloud costs continuously and optimizing resource utilization are critical for reducing cloud spend and realizing the benefits of the cloud. While every cloud provider offers reports and dashboards to track resource consumption and costs, correlating the data, identifying provisioning inefficiencies, controlling virtual sprawl, and analyzing cloud spend across multiple providers can be challenging. The implemented solution (a custom cost explorer) simplifies hybrid and multi-cloud cost analysis by consolidating expense data from different cloud platforms and accounts. It helps businesses optimize their cloud expenditure and predict future costs.

Author
Mustufa Batterywala
Senior DevOps Architect