Optimizing AWS Costs for a Leading E-commerce Enablement Platform
Case Study


Introduction

Our client is a fast-growing SaaS platform that powers warehouse, order, and inventory management for over 15,000 brands and sellers. As a key technology enabler in the e-commerce ecosystem, the client offers a cloud-based solution that handles real-time inventory synchronization, multi-channel fulfillment, and post-purchase experience automation.

To support this scale and complexity, the client runs a modular architecture on AWS, serving both domestic and international users with strict uptime, latency, and security requirements. However, as the business scaled aggressively, the cloud infrastructure evolved reactively, resulting in ballooning costs, opaque usage patterns, and underutilized resources.

Challenge

Situation

The client’s AWS bill had grown significantly over the past 12 months. While revenue and platform usage had increased, cost spikes were outpacing business growth. Internal DevOps teams were stretched thin, and engineering efforts were focused on product delivery, not cost governance.

Multiple environments (production, staging, UAT) were always-on, RDS resources were provisioned for peak load but rarely optimized, and media assets were being stored indefinitely in S3 without a lifecycle policy. Despite using standard AWS tools, there was no CMS-level correlation between Drupal usage and AWS billing data, making optimization difficult.

With multiple teams working across development, testing, and live customer environments, the platform was becoming heavier to manage and costlier to run. That’s when they partnered with Valuebound to get control over their cloud spend, without slowing innovation.

The client approached Valuebound with a clear ask: reduce AWS costs without disrupting platform performance or engineering velocity. In their words, the problem was simple: “How do we bring our AWS bill down, without breaking things?”

Challenges

Challenge 1: High Cost from Idle Resources

Their development, testing, and demo environments were running all the time, even when no one was using them. These non-customer environments were using nearly a third of their monthly infrastructure budget. 

Staging, dev, QA, and demo environments were consuming compute and storage 24/7. These environments were not gated by schedules or automation, resulting in nearly 30% of EC2 and EBS spend delivering zero business value during off-hours.

Challenge 2: Paying for Capacity They Rarely Used

Their database systems were set up to handle worst-case traffic, not real-world usage. Most of the time, those resources were sitting underused, but still being fully billed.

The production RDS instance was running with provisioned IOPS and high memory allocation, despite average CPU utilization staying below 25%. Performance spikes were caused by unoptimized Drupal Views and admin-side data exports, resulting in misattributed compute needs and unnecessary cost scaling.

Challenge 3: Media and Files Were Piling Up Forever

Over the years, the platform had accumulated a large volume of images, logs, and documents. None of it had been set to expire, archive, or move to cheaper storage. This was silently inflating their bill.

The client’s S3 buckets hosted gigabytes of media, logs, and product files with no expiry policies.

Temporary assets and outdated logs remained in Standard Storage, and frequent PUT/GET calls from dynamic URLs led to unnecessary CloudFront invalidation costs.


Challenge 4: No Visibility into What Was Costing Them


While they had general usage reports, there was no clear view of what features, teams, or customer actions were driving cloud costs. It made financial planning hard and cost accountability impossible.

The internal team had dashboards tracking EC2 and RDS usage, but no insight into which parts of the Drupal app were driving those costs. High-traffic pages, admin-side exports, and API usage all remained invisible from a cost perspective, making it hard to diagnose spikes or forecast infrastructure needs.

Solution

Resolution 1: Smarter Scheduling

We introduced automation that turned these environments off during nights and weekends, and made them easy to start back up when needed. 

We began by mapping all non-prod environments and tagging them by role, owner, and usage frequency. Using AWS Instance Scheduler, we implemented auto-shutdown scripts that powered down EC2, RDS, and EBS resources during nights and weekends. For demo environments used by sales, we enabled on-demand provisioning via a Slack-triggered Lambda workflow, reducing idle infrastructure significantly.
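As a sketch of the shutdown logic: in production this was handled by AWS Instance Scheduler, and the "auto-stop" tag convention and the 8 pm to 8 am weekday window below are illustrative assumptions, not the client's actual configuration.

```python
# Sketch of the off-hours shutdown logic. The "auto-stop" tag and the
# 8 pm-8 am weekday window are illustrative assumptions.
from datetime import datetime

OFF_START, OFF_END = 20, 8  # assumed off-hours window (8 pm to 8 am)

def in_off_hours(now: datetime) -> bool:
    """True during nights and weekends, when non-prod environments
    deliver no business value and can be safely powered down."""
    if now.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return True
    return now.hour >= OFF_START or now.hour < OFF_END

def stop_tagged_instances(ec2, tag_key: str = "auto-stop") -> list:
    """Stop every running EC2 instance carrying the scheduling tag.
    `ec2` is a boto3 EC2 client, passed in (and never called at module
    scope) so this sketch stays runnable without AWS credentials."""
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": ["true"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]
    ids = [i["InstanceId"] for r in reservations for i in r["Instances"]]
    if ids:
        ec2.stop_instances(InstanceIds=ids)
    return ids
```

A scheduled Lambda or EventBridge rule would call `stop_tagged_instances` whenever `in_off_hours` is true; the Slack-triggered demo workflow is essentially the reverse call using `start_instances`.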

This simple governance layer brought immediate visibility and eliminated silent cost creep from underused resources. A single change delivered immediate savings, without disrupting any internal workflows or engineering velocity.

Resolution 2: Performance Tuning and Right-Sizing

We helped the client better understand when and why their systems needed more power, and when they didn’t.

We used CloudWatch and slow query logs to analyze peak usage periods. From within Drupal, we mapped high-load Views and redesigned their query structures by introducing index-level optimizations, result caching, and conditional lazy loading where appropriate.

Following this, we safely downgraded the RDS instance class and eliminated provisioned IOPS, switching to gp3 SSDs with burstable performance. We also introduced monitoring around batch job schedules to prevent clustering of I/O-heavy operations.
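The storage switch can be expressed in boto3 terms as a `modify_db_instance` call. The instance identifier and target class below are placeholders, not the client's actual values.

```python
# Sketch of the RDS right-sizing change as boto3 modify_db_instance
# arguments. "StorageType": "gp3" drops provisioned IOPS in favour of
# gp3's baseline-plus-burst performance.
def gp3_downsize_kwargs(db_id: str, target_class: str) -> dict:
    return {
        "DBInstanceIdentifier": db_id,
        "DBInstanceClass": target_class,  # smaller class matched to real load
        "StorageType": "gp3",             # replaces io1 / provisioned IOPS
        "ApplyImmediately": False,        # defer to the maintenance window
    }

def apply_downsize(rds, db_id: str, target_class: str):
    """`rds` is a boto3 RDS client; it is not invoked here, so the
    sketch runs without AWS credentials."""
    return rds.modify_db_instance(**gp3_downsize_kwargs(db_id, target_class))
```

Deferring the change to the maintenance window (`ApplyImmediately=False`) avoids any mid-day restart of the production database.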

With this insight, we reconfigured things to match actual usage, cutting excess cost while keeping everything fast and stable.

The result: lower baseline cost without affecting platform responsiveness.


Resolution 3: Storage Cleanup and Automation


We analyzed what content was still being accessed and what wasn’t. Then, we introduced automated rules to archive older files and clean up what was no longer useful. 

We conducted an age-frequency audit of all S3 objects. Based on access patterns, we implemented intelligent lifecycle rules that automatically transitioned media files to infrequent access (IA) storage after 60 days and archived logs after 90 days.
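The lifecycle rules described above might look like this in boto3 terms; the `media/` and `logs/` prefixes are assumptions about the bucket layout, made for illustration.

```python
# Sketch of the lifecycle rules: media to Standard-IA after 60 days,
# logs to Glacier after 90. Prefixes are illustrative assumptions.
def lifecycle_configuration() -> dict:
    return {
        "Rules": [
            {
                "ID": "media-to-ia",
                "Filter": {"Prefix": "media/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 60, "StorageClass": "STANDARD_IA"}],
            },
            {
                "ID": "archive-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            },
        ]
    }

def apply_lifecycle(s3, bucket: str):
    """`s3` is a boto3 S3 client; the call mirrors
    put_bucket_lifecycle_configuration and is not run at module scope."""
    return s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=lifecycle_configuration()
    )
```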

For dynamically generated assets, we deployed a CDN revalidation strategy that reduced unnecessary CloudFront invalidations and aligned TTL values with actual content update cycles.
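One way to sketch the TTL alignment: derive a Cache-Control header from the expected content update cycle, so objects expire naturally instead of requiring CloudFront invalidations. The half-cycle refresh factor here is an illustrative choice, not a rule from the engagement.

```python
# Sketch of TTL alignment: size max-age to the content's update cycle
# so cached objects expire on their own. The 0.5 factor is illustrative.
def cache_control_for(update_cycle_hours: float) -> str:
    """Return a Cache-Control header whose max-age is half the expected
    update cycle, keeping content fresh without manual invalidations."""
    max_age = int(update_cycle_hours * 3600 * 0.5)
    return f"public, max-age={max_age}"
```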

We also introduced image compression standards for editorial and product uploads, reducing file sizes by over 35% across the board. This cut storage costs and sped up delivery across the platform.


Resolution 4: Business-Level Cost Tracking


We implemented a cost dashboard that tied cloud usage directly to specific features and teams. 

We created a CMS-integrated cost observability layer by mapping Drupal endpoints and Views to AWS usage metrics. Using CloudWatch logs and custom tagging, we tracked which modules and content types were triggering the most compute load and bandwidth.

This was visualized in a Grafana dashboard showing cost-per-feature, which helped both tech and business teams understand which workloads were revenue-generating, and which ones were wasteful.
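A minimal sketch of the cost-per-feature rollup, assuming resources carry a hypothetical `feature` cost-allocation tag queried through Cost Explorer's `get_cost_and_usage` with a `TAG` grouping; the parser follows that API's `TagKey$TagValue` group-key format.

```python
# Sketch of a cost-per-feature rollup. The "feature" cost-allocation
# tag is a hypothetical naming convention for illustration; the input
# shape matches Cost Explorer's get_cost_and_usage response.
def cost_per_feature(ce_response: dict) -> dict:
    """Collapse a get_cost_and_usage response into {feature: USD}."""
    totals: dict = {}
    for period in ce_response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            # Group keys arrive as "feature$checkout"; empty value
            # means the resource was untagged.
            feature = group["Keys"][0].split("$", 1)[-1] or "untagged"
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[feature] = totals.get(feature, 0.0) + amount
    return totals
```

A rollup like this is what a Grafana panel can consume directly to chart cost-per-feature over time.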

This layer of clarity turned optimization from reactive firefighting into continuous, informed decision-making. Now, leadership could see exactly what parts of the platform delivered value, and what was simply costing more than it returned.

Benefits

Within 6 weeks of engagement, the Valuebound optimization program delivered tangible, measurable results:

  • 41% reduction in total AWS spend, without any compromise in uptime or performance
  • 83% drop in idle resource usage across staging, UAT, and demo environments
  • 40% smaller RDS footprint after removal of overprovisioning and inefficient queries
  • 18% decrease in S3 and CloudFront costs due to lifecycle rules and media governance
  • Faster content delivery with no increase in infrastructure spend, thanks to caching and compression strategies


More importantly, the client now has a transparent cost governance structure, where every AWS dollar spent is tied to business or platform value. The client runs leaner, faster, and with full clarity on how their infrastructure budget supports business growth.

Wrap up

From Unchecked Spend to Strategic Efficiency

This engagement proved that cutting cloud costs doesn’t have to mean cutting corners. By focusing on the real drivers of waste (idle systems, overbuilt infrastructure, unmanaged storage, and lack of cost visibility), Valuebound helped the client turn AWS from a growing liability into a controlled, efficient asset.

The savings weren’t just in dollars. They gained predictability, internal accountability, and a stronger foundation to scale further, without fear of cost overruns or technical slowdowns.

For any enterprise running Drupal on AWS, this case reinforces a clear message: the smartest cloud strategy isn’t just about what you build, but how efficiently you run it. And that’s exactly where Valuebound leads.

Services Provided
DevOps and Deployment
Solution Designing
Strategy and Consulting Services
DevOps