
Can you auto-scale systems on demand without worrying about expenses? If not, here's why:
Manual changes turn each server into "a work of art".
Stateful systems lead to non-zero downtime for upgrades, backups and restoration, and rule out high availability.
Scaling becomes time-consuming and costly.
Every business in today's digitally transformed world aims to become the bestseller in its domain by delivering quality apps and services. However, none of them runs without a budget.
Organisations often face the crucial decision of whether to deploy their server systems on the cloud or on-premise. Each option comes with its own set of advantages and challenges, particularly when it comes to building immutable systems, configuration management, financial operations (FinOps), hardening, and patching. Here's what should be done:
Building Immutable Systems:
If the recent pandemic has taught us anything, it's that anything that mutates becomes untraceable and unmanageable. Compute servers must be treated as commodity items that can be created and destroyed on demand. They must not store any critical data.
To standardise the configuration across all environments, ensure that compute servers are built from standard, version-controlled machine images. Keep separate disk partitions for the root operating system and for the apps running on it. This will allow for zero-downtime upgrades and config changes.
Once standardised in size, volume and configuration, these servers can be scaled out and in as demand rises and falls. This also prevents over- or under-utilised infra resources.
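As a rough illustration of scaling standardised servers with demand, here is a minimal Python sketch of a proportional scaling rule. The function name, the 60% CPU target and the instance limits are assumptions made for this example, not values from any particular cloud provider.

```python
import math


def desired_instance_count(
    current_instances: int,
    avg_cpu_percent: float,
    target_cpu_percent: float = 60.0,  # assumed target, tune for your workload
    min_instances: int = 2,            # assumed floor to keep redundancy
    max_instances: int = 20,           # assumed ceiling to cap spend
) -> int:
    """Return how many identical servers are needed to bring average CPU
    utilisation back to the target, clamped to the allowed range."""
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    # Proportional rule: if the fleet runs at 90% against a 60% target,
    # it needs 1.5x as many instances.
    raw = math.ceil(current_instances * avg_cpu_percent / target_cpu_percent)
    return max(min_instances, min(max_instances, raw))
```

Because every server comes from the same image and size, the only decision left is how many to run, and the minimum and maximum bounds keep both under-provisioning and overspending in check.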
With a standard system package you can build as many virtual machines as you want with the same config. This gets rid of the most common excuse we have heard:
“it works on my machine but not in prod”.
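To make the idea of a standard, version-controlled image concrete, here is a small Python sketch. The `ImageSpec` class, its fields and the `has_drifted` helper are hypothetical names invented for this example; real image pipelines will look different.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen: the spec cannot be mutated after creation
class ImageSpec:
    """Hypothetical description of a machine image kept in version control."""
    name: str
    version: str
    os_release: str
    packages: Tuple[str, ...]

    def fingerprint(self) -> str:
        """Stable hash of the spec; identical specs give identical hashes."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def has_drifted(spec: ImageSpec, running_fingerprint: str) -> bool:
    """True if a running server no longer matches the approved image spec."""
    return spec.fingerprint() != running_fingerprint
```

If every environment builds from the same spec, the fingerprint recorded on a dev machine and on a prod machine should match, and any mismatch points at a manual change.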
Cloud | On-Premise |
Cloud platforms like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) provide robust tools and services for building immutable systems. Infrastructure as Code (IaC) tools such as AWS CloudFormation, Azure Resource Manager (ARM), and Terraform enable the creation of infrastructure that is version-controlled, easily reproducible, and can be treated as disposable entities. | On-premise, on the other hand, typically requires more manual intervention for achieving immutability. Configuration management tools like Puppet, Chef, or Ansible can automate the provisioning and configuration of servers, but maintaining immutability often involves additional efforts and discipline. |
But wait, we are not done yet. As with constructing a house, we have only built the rooms; it still needs the interior that makes it a home to live in. This interior design of systems is called “Configuration Management”.
Configuration Management:
We need to customise the software on the systems, for example with OS upgrades:
Let’s say you have a CentOS 8.1 system image and 50 virtual machines created from it to run apps and services. After some months, a new security patch needs to be applied. It’s much more efficient to use version-controlled config management to apply the patch on top of the existing systems, keeping track of what changed in the state data.
Likewise, activities such as CPU and memory upgrades or disk extensions can be done through the config management system without recreating the whole server, avoiding downtime and saving time and effort. The patching process can also be triggered on demand if required.
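The patching flow described above can be sketched in Python as an idempotent operation that records every change. This is only an illustration of the principle: `ServerState`, `apply_patch` and the package versions used with it are hypothetical and do not represent the API of any real config management tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServerState:
    """Hypothetical record of what is installed on one server."""
    hostname: str
    packages: Dict[str, str]
    history: List[str] = field(default_factory=list)


def apply_patch(server: ServerState, package: str, version: str) -> bool:
    """Bring one package to the desired version.

    Idempotent: running it again on an already patched server changes
    nothing. Returns True only when a change was actually made.
    """
    if server.packages.get(package) == version:
        return False
    previous = server.packages.get(package, "absent")
    server.packages[package] = version
    # Keep a trace of what changed, mirroring the tracked state data.
    server.history.append(f"{package}: {previous} -> {version}")
    return True


# Usage: patch a fleet of 50 servers built from the same image.
fleet = [ServerState(f"vm-{i:02d}", {"openssl": "1.1.1c"}) for i in range(50)]
changed = sum(apply_patch(vm, "openssl", "1.1.1k") for vm in fleet)
```

Running the same patch a second time reports zero changes, which is what makes it safe to trigger patching on demand.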
Cloud | On-Premise |
Cloud environments offer centralised management consoles and APIs that facilitate streamlined configuration management. Services like AWS Systems Manager and Azure Automation allow for efficient configuration, patching, and compliance management across a fleet of cloud resources. | On-premise configuration management can be more complex, requiring the setup of dedicated configuration management servers and agents on each server. While tools like Puppet and Chef are popular choices, they require additional infrastructure and maintenance overhead. |
FinOps:
Compute systems are the first layer at which you will start noticing operational expenses, since compute is the first chargeable service you encounter when building cloud infra today (the cost of a basic network setup on cloud is negligible).
Choose a subscription, prepaid or commitment billing model instead of the default pay-as-you-go plan when provisioning compute instances, even if it's only for a month or so of testing. Trust me, you will see the difference within a month.
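To see how the two billing models compare, here is a back-of-the-envelope Python sketch. The hourly rates in the usage lines are made-up numbers for illustration only, and the assumption that a commitment is billed for all 730 hours in a month regardless of usage is a simplification; check your provider's actual pricing and terms.

```python
def commitment_savings(
    on_demand_rate: float,
    committed_rate: float,
    hours_used: float,
    hours_in_month: float = 730.0,  # assumed average month length in hours
    instances: int = 1,
) -> float:
    """Monthly saving from a commitment plan versus pay-as-you-go.

    Pay-as-you-go bills only the hours used; the commitment is assumed to
    bill every hour of the month. A negative result means the commitment
    costs more than paying on demand.
    """
    on_demand_cost = on_demand_rate * hours_used * instances
    committed_cost = committed_rate * hours_in_month * instances
    return on_demand_cost - committed_cost


def break_even_utilisation(on_demand_rate: float, committed_rate: float) -> float:
    """Fraction of the month an instance must run for the commitment to pay off."""
    return committed_rate / on_demand_rate


# Hypothetical rates: 0.10 per hour on demand, 0.06 per hour committed.
always_on = commitment_savings(0.10, 0.06, hours_used=730.0)
part_time = commitment_savings(0.10, 0.06, hours_used=300.0)
threshold = break_even_utilisation(0.10, 0.06)
```

With these illustrative rates, an instance that runs all month comes out cheaper on the commitment, while one that runs well below the break-even fraction does not, so it is worth running the numbers for each workload.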
Cloud | On-Premise |
Clouds usually provide so many machine types and sizes that you may be spoilt for choice. For optimised operations, define a standard instance family with T-shirt sizing and create machines based on the performance they demand. For instance, you can classify workloads as CPU-intensive, memory-intensive, I/O-intensive and so on. Select only one family for each requirement. That way both FinOps and operations stay manageable. | On-premise infrastructure requires significant upfront capital expenditure (CapEx) for purchasing hardware and ongoing operational expenses (OpEx) for maintenance, electricity, and cooling. |
Cloud platforms offer granular billing and cost management features, allowing organisations to monitor, optimise, and control their spending in real-time. Services like AWS Cost Explorer and Azure Cost Management provide insights into resource usage and cost allocation, enabling better decision-making and cost optimisation strategies. | While it offers predictable costs over the long term, it lacks the flexibility and scalability of cloud-based pay-as-you-go models. |
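The T-shirt sizing approach from the table above could be captured in a small catalogue like the Python sketch below. The family names and the vCPU and memory figures are placeholders invented for this example and do not correspond to any provider's real instance types.

```python
from typing import Dict, Tuple

# One family per workload class, each with S/M/L sizes given as
# (vCPUs, memory in GiB). All names and numbers are placeholders.
CATALOGUE: Dict[str, Dict] = {
    "cpu": {
        "family": "compute-family",
        "sizes": {"S": (2, 4), "M": (4, 8), "L": (8, 16)},
    },
    "memory": {
        "family": "memory-family",
        "sizes": {"S": (2, 16), "M": (4, 32), "L": (8, 64)},
    },
    "io": {
        "family": "storage-family",
        "sizes": {"S": (2, 8), "M": (4, 16), "L": (8, 32)},
    },
}


def pick_instance(workload: str, vcpus_needed: int, memory_gib_needed: int) -> str:
    """Return the smallest standard size in the workload's family that fits."""
    try:
        entry = CATALOGUE[workload]
    except KeyError:
        raise ValueError(f"unknown workload class: {workload}") from None
    sizes: Dict[str, Tuple[int, int]] = entry["sizes"]
    # Sizes are listed smallest first, so the first fit is the cheapest fit.
    for size, (vcpus, memory_gib) in sizes.items():
        if vcpus >= vcpus_needed and memory_gib >= memory_gib_needed:
            return f"{entry['family']}.{size}"
    raise ValueError("no standard size fits; review the request or the catalogue")
```

Restricting choices to a short catalogue like this keeps cost reports comparable across teams and makes commitment purchases easier to plan.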
Hardening & Patching:
Images, OSes and platforms must come from a trusted vendor or an approved repository and must be hardened (no third-party downloads). All servers, appliances and devices must enforce a complex password policy. They must be scanned for vulnerabilities every month.
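Two of these rules, the complex password policy and the monthly scan, are easy to express as checks. The Python sketch below is illustrative only: the 14-character minimum, the required character classes and the 30-day window are assumptions chosen for the example, not a standard.

```python
import string
from datetime import date, timedelta


def is_complex_password(password: str, min_length: int = 14) -> bool:
    """Check an assumed policy: minimum length plus at least one lowercase
    letter, one uppercase letter, one digit and one punctuation character."""
    return (
        len(password) >= min_length
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )


def scan_overdue(last_scan: date, today: date, max_age_days: int = 30) -> bool:
    """True if the last vulnerability scan is older than the allowed window."""
    return (today - last_scan) > timedelta(days=max_age_days)
```

Checks like these can run as part of the image build or a scheduled compliance job, so that a server failing the policy is flagged before it hosts anything public-facing.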
Cloud | On-Premise |
Cloud providers manage the underlying infrastructure, including hardware security and network protection. They also offer managed services for security patching and compliance, reducing the burden on organisations to maintain and update their systems. | On-premise systems require organisations to take full responsibility for hardening and patching, including securing network configurations, applying OS patches, and managing firewall rules. While tools like Microsoft SCCM and Red Hat Satellite can automate patch management, they require additional setup and maintenance. |
Summarising Server Management:
Cloud platforms offer more native support and tooling for building immutable infrastructure, making it easier to implement and manage compared to on-premise solutions.
They provide more integrated and user-friendly tools for configuration management, simplifying the process and reducing operational overhead.
The winner in FinOps depends on the scenario. Cloud offers greater flexibility and scalability with pay-as-you-go pricing, but on-premise infrastructure may be more cost-effective for predictable workloads with steady demand over time.
Cloud providers offer more comprehensive and automated solutions for hardening and patching, relieving organisations of much of the operational burden associated with maintaining on-premise systems.
For secure, scalable and sustainable software development you need:
Compute systems that are easy to scale out on demand during peak hours.
Servers that are secure enough to host public-facing applications.
Guardrails that prevent overspending or underspending on infrastructure.
In conclusion, the choice between deploying server systems on the cloud or on-premise depends on various factors, including budget, scalability requirements, and the level of control and customisation needed. While the cloud offers unparalleled scalability, flexibility, and managed services for tasks like configuration management, patching, and cost optimisation, on-premise infrastructure may be preferable for organisations with strict compliance requirements, predictable workloads, or legacy systems that cannot easily migrate to the cloud.
Ultimately, organisations must carefully evaluate their specific needs and priorities to determine the most suitable deployment model for their server systems. Whether in the cloud or on-premise, the goal remains the same: to build resilient, secure, and efficient infrastructure that supports the organisation's business objectives.
If you like this article, I am sure you will find 10-Factor Infrastructure even more useful. It compiles all these tried and tested methodologies, design patterns & best practices into a complete framework for building secure, scalable and resilient modern infrastructure.
If you like this article do like 👍 and share ♻ it in your network and follow Kamalika Majumder for more.

Thanks & Regards
Kamalika Majumder