
Zero Downtime Deployment is a deployment model in which end users experience no downtime while the service, platform or system they rely on is being changed.
This includes strategies such as blue-green, rolling and canary deployment, applied to platform upgrades, infrastructure scaling, application deployment or any other change event.
To achieve this, high availability for data and persistent storage using network file storage with geo-replication across datacenters is a must.
Here's a story from one of my past projects where zero downtime deployment played a key role in digital transformation of a traditional bank.
As part of its #DigitalTransformation journey, a bank in Indonesia wanted to launch a digital banking product that would give its customers a 24/7 banking solution.
They required a platform that could seamlessly take the load of several thousands of customers who were going to register across different parts of two major cities in Indonesia.
The objective was to provide an AlwaysOn PaaS that was highly stable, had a robust monitoring and alerting system, and was easy to scale out on demand as utilisation increased.
The biggest challenge was to onboard the concept of AlwaysOn systems & services that would be live 24/7.
The compute platform had been set up manually and, over time, had accumulated system configuration changes that were not version controlled. Making any change therefore risked disrupting the active development cycles and, in turn, the production release.
One of the major initiatives was to stabilise the underlying compute layer on which the application was running. We redesigned the entire deployment model for the compute platform and built a solution that would spin up a containerisation cluster stretched across two datacenters within 20 minutes.
Why Zero Downtime Deployment?
Digital banks that promise to operate 24/7 cannot afford downtime at any time.
SaaS providers that promise a 99.99% SLA to their customers get only the following downtime/unavailability windows, which is practically zero downtime:
Daily: 8.6s
Weekly: 1m 0.48s
Monthly: 4m 21s
Quarterly: 13m 2.4s
Yearly: 52m 9.8s
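The arithmetic behind such a budget is simply the period length multiplied by the fraction of time the SLA allows the service to be unavailable. Here is a minimal Python sketch of that calculation. The function names and the period lengths (a 24-hour day, a 7-day week, a 365-day year) are my own assumptions for illustration; a different calendar convention shifts the longer windows by a few seconds, so with a 365-day year the yearly budget comes to about 52m 33.6s.

```python
def downtime_budget_seconds(sla_percent: float, period_seconds: float) -> float:
    """Return the maximum downtime, in seconds, that an SLA allows in a period."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_seconds * (1 - sla_percent / 100)


def format_duration(seconds: float) -> str:
    """Format a number of seconds as 'Xm Y.Ys' for readability."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)}m {rest:.1f}s"


DAY = 24 * 60 * 60  # seconds in a day

# Assumed period lengths (hypothetical convention, not taken from any SLA).
PERIODS = {"Daily": DAY, "Weekly": 7 * DAY, "Yearly": 365 * DAY}

for name, length in PERIODS.items():
    budget = downtime_budget_seconds(99.99, length)
    print(f"{name}: {format_duration(budget)}")
```

Running the same function with 99.9 or 99.999 shows how quickly the window shrinks with every extra nine.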
For e-commerce, highly performant and available services can decide whether a business survives the market competition.
At the same time, keeping all software updated is a must for security and scalability.
In essence, users do not experience any disruption in service, maintaining a seamless experience. This approach has become increasingly critical in industries where uptime is synonymous with customer satisfaction, revenue generation, and brand reputation.
Let’s understand these deployment strategies & infrastructure designs that are necessary for building the foundation of Always On Digital Services:
Deployment Strategies for Zero Downtime:
Blue-Green Deployment:
This involves maintaining two identical environments, namely the 'Blue' and 'Green' environments.
While one environment serves live user traffic, the other remains idle. During deployment, traffic is seamlessly redirected to the idle environment.
This approach minimises downtime, as the transition between environments is swift and transparent to end-users.
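To make the switch concrete, here is a minimal Python sketch of the idea, not the setup we used at the bank. The class BlueGreenRouter, the two handler functions and the is_healthy check are all hypothetical names invented for illustration; in practice the router would be a load balancer or DNS entry and each environment a full cluster.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class BlueGreenRouter:
    """Sends every request to whichever environment is currently live."""

    def __init__(self, environments: Dict[str, Handler], live: str):
        if live not in environments:
            raise ValueError(f"unknown environment: {live}")
        self.environments = environments
        self.live = live

    def handle(self, request: str) -> str:
        # All user traffic goes to the live environment only.
        return self.environments[self.live](request)

    def switch_to(self, target: str, health_check: Callable[[Handler], bool]) -> bool:
        """Cut traffic over to `target` only if it passes the health check."""
        if target not in self.environments:
            raise ValueError(f"unknown environment: {target}")
        if not health_check(self.environments[target]):
            return False  # keep serving from the current live environment
        self.live = target
        return True


def blue(request: str) -> str:
    return f"v1 handled {request}"


def green(request: str) -> str:
    return f"v2 handled {request}"


def is_healthy(environment: Handler) -> bool:
    """Hypothetical smoke test: the environment must answer a ping."""
    try:
        return "handled" in environment("ping")
    except Exception:
        return False


router = BlueGreenRouter({"blue": blue, "green": green}, live="blue")
print(router.handle("login"))         # served by blue
router.switch_to("green", is_healthy)
print(router.handle("login"))         # served by green after the cutover
```

Because the old environment is left untouched, rolling back is the same operation in the other direction.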
Rolling Deployment:
This is a phased approach where updates are gradually applied to different parts of the system while the remaining components continue to operate.
This ensures that a portion of the system is always available, mitigating the risk of downtime.
The deployment progresses incrementally, allowing the system to adapt to changes without causing disruptions.
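The phased approach can be sketched in a few lines of Python. This is only an illustration under my own assumptions: each instance is a plain dictionary with hypothetical "version" and "in_service" fields, and the health check is whatever function you pass in. Orchestrators such as Kubernetes implement the same idea with far more safeguards.

```python
from typing import Callable, List


def rolling_update(
    instances: List[dict],
    new_version: str,
    batch_size: int,
    health_check: Callable[[dict], bool],
) -> bool:
    """Upgrade instances batch by batch and halt at the first unhealthy batch.

    Keep batch_size smaller than the fleet so some instances always serve traffic.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        previous = [inst["version"] for inst in batch]
        for inst in batch:
            inst["in_service"] = False   # drain the instance before touching it
            inst["version"] = new_version
        if all(health_check(inst) for inst in batch):
            for inst in batch:
                inst["in_service"] = True  # healthy, so return it to the pool
        else:
            # Restore the old version on this batch and stop the rollout.
            for inst, old in zip(batch, previous):
                inst["version"] = old
                inst["in_service"] = True
            return False
    return True


fleet = [{"id": i, "version": "v1", "in_service": True} for i in range(4)]
succeeded = rolling_update(fleet, "v2", batch_size=2, health_check=lambda inst: True)
print(succeeded, [inst["version"] for inst in fleet])
```

A smaller batch size makes the rollout slower but keeps more capacity in service at every step.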
Canary Deployment:
This method is a risk-mitigation strategy that involves releasing a new version of the software to a small subset of users before making it available to the entire user base.
This allows organisations to identify and address issues before a full-scale release.
Gradually expanding the release ensures that any potential problems are detected early on, minimising the impact on users.
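One common way to pick that small subset is to hash each user ID into a stable bucket and send only the lowest buckets to the new version. The Python sketch below shows this under my own assumptions: the function names, the use of SHA-256 and the 100-bucket split are illustrative choices, not a prescribed method.

```python
import hashlib


def canary_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_version(user_id: str, canary_percent: int) -> str:
    """Return 'canary' for roughly canary_percent of users, otherwise 'stable'."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if canary_bucket(user_id) < canary_percent else "stable"


# Hypothetical user IDs, used only to show the split at a 10% canary.
users = [f"user-{n}" for n in range(1000)]
on_canary = sum(choose_version(u, 10) == "canary" for u in users)
print(f"{on_canary} of {len(users)} users see the canary")
```

Because the bucket depends only on the user ID, a user who sees the canary at 5% keeps seeing it as the percentage is raised, which keeps their experience consistent during the gradual expansion.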
Infrastructure Designs For Zero Downtime Deployment:
Multi AZ/DC Services:
Geo-replication involves replicating data across multiple datacenters located in different geographical regions.
This ensures that even if one datacenter experiences an outage or maintenance, the data remains accessible from another location.
Geo-replication enhances the resilience of the storage infrastructure, contributing to the overall high availability of the system.
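The behaviour described here can be illustrated with a toy Python model. It is a deliberately simplified sketch under my own assumptions: the class GeoReplicatedStore and the datacenter names "dc-a" and "dc-b" are hypothetical, writes are copied synchronously to every online datacenter, and bringing a datacenter back online with a resync is not modelled at all.

```python
from typing import Dict, Iterable, List, Set


class GeoReplicatedStore:
    """Toy key-value store that copies every write to each online datacenter."""

    def __init__(self, datacenters: Iterable[str]):
        self.replicas: Dict[str, Dict[str, str]] = {name: {} for name in datacenters}
        self.offline: Set[str] = set()

    def take_offline(self, datacenter: str) -> None:
        """Simulate an outage or maintenance window for one datacenter."""
        if datacenter not in self.replicas:
            raise KeyError(datacenter)
        self.offline.add(datacenter)

    def _online(self) -> List[str]:
        return [name for name in self.replicas if name not in self.offline]

    def write(self, key: str, value: str) -> None:
        targets = self._online()
        if not targets:
            raise RuntimeError("no datacenter is online")
        for name in targets:
            self.replicas[name][key] = value  # synchronous copy to each replica

    def read(self, key: str) -> str:
        # Serve the read from the first online datacenter that holds the key.
        for name in self._online():
            if key in self.replicas[name]:
                return self.replicas[name][key]
        raise KeyError(key)


store = GeoReplicatedStore(["dc-a", "dc-b"])
store.write("account:1", "active")
store.take_offline("dc-a")          # one datacenter goes down for maintenance
print(store.read("account:1"))      # the data is still served from dc-b
```

Real storage systems also have to handle replication lag, conflicting writes and catching up a recovered datacenter, which is where most of the engineering effort goes.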
High Availability for Persistent Storage:
While deploying applications with zero downtime is crucial, ensuring high availability for persistent storage is equally paramount.
Persistent storage, which includes databases and file systems, forms the backbone of many applications.
Any disruption to persistent storage can result in data loss, application errors, and downtime. To address this, employing network file storage with geo-replication across datacenters becomes a necessity.
Network File Storage:
Network file storage solutions provide centralised and scalable storage accessible over a network.
This allows applications to access data seamlessly, irrespective of the underlying infrastructure changes.
By decoupling storage from compute resources, organisations can update or scale their infrastructure without affecting data availability.
Single Secure Entrypoint:
Make sure the device where requests land first is secure. It must be a single trusted endpoint, exposed with IP segregation, DNS-based backend mapping, WAF features that implement OWASP policies, and DDoS protection.
You can use services like Cloudflare or Akamai. These also come with a CDN, which keeps your app endpoint flexible enough to point at whichever backend you choose and helps with static asset migrations. Configure certificate management for SSL certificate generation, renewal and offloading, and use mutual TLS to establish a zero-trust policy with any third-party application endpoint.
A single secure entry point ensures seamless switching between the endpoints as described in the deployment strategies mentioned above. Your customers will always use that single endpoint, which will redirect the traffic as required to the respective backend API, instance or cluster.
Centralised Command Center:
Last but not least, a one-stop centralised command center for infrastructure administration is necessary to prevent vendor lock-in and enable a cloud-agnostic topology. It also enables centralised logging, monitoring and identity management for better observability.
Conclusion:
Zero Downtime Deployment is a must for operating AlwaysOn services in digital transformation.
Employing strategies such as blue-green, rolling, and canary deployments allows for seamless updates and changes without disrupting the end-user experience.
Furthermore, ensuring high availability for persistent storage through network file storage with geo-replication across data centers is crucial in safeguarding data integrity and system resilience.
By adopting these practices, businesses can navigate the ever-evolving technological landscape while maintaining a commitment to customer satisfaction and operational excellence.
If you like this article, I am sure you will find 10-Factor Infrastructure even more useful. It compiles all these tried and tested methodologies, design patterns & best practices into a complete framework for building secure, scalable and resilient modern infrastructure.
Don’t let your best-selling product suffer due to an unstable, vulnerable & mutable infrastructure.
Thanks & Regards
Kamalika Majumder