Ten best practices for containerization on the cloud

Read about ten best practices to help accelerate application deployment on the cloud using containers

By 2022, more than 75% of global organizations will be running containerized applications. – Gartner Inc.

Containerization represents a breakthrough for DevOps teams as it lets them focus on application architecture and deployment rather than worrying about the underlying infrastructure plumbing. As lightweight software units that package application code together with its dependencies, containers make it easy to create cloud-native applications running on physical or virtual infrastructure. Based on our recent engagements with Fortune 1000 companies, we have put together ten best practices that can help you accelerate application deployment on the cloud using containers.

01

Use a hybrid strategy for application modernization

A popular approach to modernizing monolithic applications is to “lift and shift,” wherein the entire application is bundled as a container and deployed. While this is easy and fast to execute, development teams find it difficult to push frequent, small changes to applications deployed as a whole. Another approach is to completely rearchitect an existing application, but this often proves complex and time-consuming. To save time and effort, we recommend a hybrid approach that combines the best of both worlds: analyze your applications based on usage patterns and identify modules that can be decoupled from the application and containerized. You can use conversion tools like AWS App2Container to discover your on-premises applications and automatically create Docker images for them. You can also leverage services like AWS Fargate, Azure Container Instances, or Google Cloud Run to rapidly take your containers to production on the cloud.
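
As a hedged sketch of what this discovery-and-containerize flow can look like with App2Container (the application ID below is a placeholder produced by the inventory step):

    # Initialize App2Container and inventory the applications running on this server
    sudo app2container init
    sudo app2container inventory
    # Analyze one discovered application; the ID comes from the inventory output
    sudo app2container analyze --application-id java-app-id
    # Generate a Docker image and deployment artifacts for the analyzed application
    sudo app2container containerize --application-id java-app-id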

We used this hybrid approach to help a US-based healthcare technology company containerize their Windows-based legacy applications and move to a cloud-agnostic architecture leveraging Azure Kubernetes Service. This helped them navigate deployment complexities and reduce the time and effort spent on customer support while eliminating Windows dependencies.

02

Follow the Twelve-Factor App methodology for cloud-native development

Developers and DevOps engineers can leverage the Twelve-Factor App methodology to build containerized applications that are resilient, portable, and cloud-agnostic. This methodology spells out best practices across twelve factors, including codebase, dependencies, configurations, processes, concurrency, and logs. Implementing these guidelines helps enterprises achieve the level of innovation, speed, and agility they need to succeed in the marketplace. You can apply the Twelve-Factor methodology to applications written in any programming language, regardless of which combination of backing services (database, queue, memory cache, etc.) you use.

We successfully used this methodology to develop a self-service web portal for a Fortune Global 500 insurance brokerage firm, leveraging containers for deployment. The portal enabled business users to seamlessly ingest data into their AWS data lake and reduced ingestion time from hours to minutes.
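
To make one of the twelve factors concrete, factor III (store config in the environment) means injecting configuration at run time instead of baking it into the image. A minimal sketch, with a hypothetical image name and variables:

    # Pass environment-specific configuration at run time rather than
    # hard-coding it into the image
    docker run \
      -e DATABASE_URL="postgres://db.internal:5432/app" \
      -e LOG_LEVEL="info" \
      my-registry/my-app:1.4.2

The same image can then be promoted unchanged from development to staging to production, with only the environment variables differing.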

03

Include only necessary dependencies in the build

To keep containers as lightweight as possible, it is important to include only the necessary dependencies. Picture this – you are building a classic Apache/MySQL/PHP stack and are tempted to run all the components in a single container. However, the best practice is to use separate containers for Apache, MySQL, and PHP (if you are running PHP-FPM). We suggest following the single-responsibility principle (SRP) to write clean, well-structured code focused on a single functionality, as this helps limit dependencies on other application components. We also recommend building smaller container images – the smaller an image, the faster it can be uploaded, downloaded, and run. In addition, smaller images can help you lower cloud storage costs and quickly scale up or down in response to application user traffic.
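
One common way to keep dependencies out of the final image is a multi-stage Dockerfile, where the build toolchain never reaches the shipped image. A minimal sketch, assuming a hypothetical Go service:

    # Build stage: contains the compiler and build tools
    FROM golang:1.21 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /app ./cmd/server

    # Final stage: ships only the compiled binary, no build toolchain
    FROM gcr.io/distroless/static
    COPY --from=build /app /app
    ENTRYPOINT ["/app"]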

The Fortune Global 500 insurance brokerage firm also wanted to enable their end-users to upload data files to AWS and interact with Amazon SQS. By reducing the number of dependencies in the build, we were able to reduce the image size by almost 200 MB, enabling a seamless user experience.

04

Optimize for build cache

Creating effective, clean images is a vital step in containerization. When building an image, Docker steps through the instructions in your Dockerfile and executes them in the specified order. It also checks its build cache for layers it can reuse from previous builds. This avoids the potentially costly step of recreating an image and improves build time significantly. To eliminate intermediate image layers, try reducing the number of instructions in your Dockerfile. For instance, combine all package installations into a single instruction instead of using a separate instruction for each.
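
A hedged Dockerfile sketch of both ideas, assuming a hypothetical Python service: the dependency manifest is copied before the source so the install layer stays cached, and the installs are collapsed into a single instruction:

    FROM python:3.11-slim
    WORKDIR /app
    # Copy only the dependency manifest first; this layer stays cached
    # until requirements.txt actually changes
    COPY requirements.txt .
    # One RUN instruction for all installs keeps the layer count down
    RUN pip install --no-cache-dir -r requirements.txt
    # Source code changes most often, so copy it last
    COPY . .
    CMD ["python", "main.py"]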

We recently used this practice while building a Docker container for a machine learning algorithm. The image required multiple Python dependencies, and the initial Dockerfile had a separate “yum install” command for each installation. By including all the installations in a single command, we could reduce intermediate image layers and reuse images easily across models.

05

Integrate image scanning as part of your CI/CD pipeline

In today’s ever-changing threat landscape, using trusted image sources is not enough. To ensure airtight security, you must integrate container image scanning with your CI/CD tooling and processes. All images should be scanned against the organization’s security policy each time the CI/CD pipeline runs to minimize the risk of attack vectors being installed on the organization’s network. As image build policies typically reside in a security engine, any policy failure should trigger a failed build directly within your CI/CD system and surface the necessary remediation steps. To enable this, we recommend building image scanning into your CI pipelines. You can also leverage tools like Anchore and Qualys to scan for image vulnerabilities each time an image is built with new code.
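
As one way to wire this in, Anchore’s open-source Grype scanner can fail a pipeline step when vulnerabilities cross a severity threshold; the image name is a placeholder, and ${GIT_COMMIT} is assumed to be supplied by the CI system:

    # Build the image, then scan it before it can be pushed
    docker build -t my-registry/my-app:${GIT_COMMIT} .
    # Fail the build if any vulnerability of severity "high" or above is found
    grype my-registry/my-app:${GIT_COMMIT} --fail-on high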

For the insurance brokerage firm mentioned earlier, we developed a single-click Jenkins pipeline to automatically deploy containerized services to development, staging and production environments. This helped the client shorten their release cycle from 6 weeks to 1 week.

06

Monitor telemetry data for your entire stack

In the containerization universe, monitoring should not be limited to infrastructure. You should closely monitor all aspects of your applications, such as logs, load time, and the number of HTTP requests. In terms of errors, ensure that your monitoring strategy covers application exceptions, database errors and warnings, and web logs that indicate unusual requests. It is equally important to monitor cloud-specific telemetry and check for outages and internet latency. While cloud providers offer monitoring services like AWS CloudWatch, Azure Monitor, and Google Stackdriver, you should augment these with advanced tools to monitor network ingress and egress traffic, security breaches, and platform availability. You can also integrate monitoring and log analytics capabilities with cluster creation using tools like Prometheus, Grafana, Elasticsearch, Fluentd, Kibana, and Jaeger. These, coupled with a unified monitoring dashboard, can help you realize 360-degree visibility and observability across your containerized applications and environments on the cloud.
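
As one small piece of such a stack, a minimal Prometheus scrape configuration that discovers Kubernetes pods annotated for scraping might look like this sketch:

    scrape_configs:
      - job_name: "kubernetes-pods"
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Keep only pods annotated with prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: "true"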

For a digital customer journey experience company, we set up a monitoring dashboard using Datadog and Splunk for platform- and application-level monitoring. We also integrated their Docker environment and containers with ELK to support application debugging. These tools helped the customer achieve complete visibility of their infrastructure, Docker platform, and applications.

07

Tag your images

Docker images are generally identified by two components – a name and a tag. For instance, in the image “google/cloud-sdk:193.0.0”, “google/cloud-sdk” is the name and “193.0.0” is the tag. If you do not provide a tag in your Docker commands, Docker applies the “latest” tag by default. At any given time, a name and tag pair is unique, but the same tag can be reassigned to a different image if needed. When you build a container image, be sure to tag it accurately, as this supports versioning and easy rollback during deployment. We recommend following a consistent, well-documented tagging policy that image users can easily understand.
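
One common convention, sketched here with placeholder names, is to tag every build with an immutable identifier such as the Git commit SHA and add a human-readable version tag on top, rather than relying on “latest”:

    # Tag the build with the commit SHA for traceability
    docker build -t my-registry/my-app:3f9c2d1 .
    # Add a semantic version tag pointing at the same image
    docker tag my-registry/my-app:3f9c2d1 my-registry/my-app:1.4.2
    docker push my-registry/my-app:3f9c2d1
    docker push my-registry/my-app:1.4.2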

The digital customer journey experience company leveraged these tagging practices to efficiently manage images in Amazon ECR, purge old images, and save cloud storage costs.

08

Decouple containers from infrastructure

Containers enable “write once, run anywhere” portability and performance isolation on shared infrastructure. Applications are decoupled from the operating system and underlying IT infrastructure, giving workloads portability from one host to another, anywhere, at any time. However, we sometimes come across stateful containers tightly coupled with a storage layer on the infrastructure, which becomes a major bottleneck for scaling. You can address this by using network-based storage or cloud object storage on Amazon Elastic Kubernetes Service (EKS) or Azure Kubernetes Service (AKS), as these can be accessed from any node in the cluster.
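
A minimal sketch of this decoupling on Kubernetes, assuming an NFS server reachable from the cluster (the server address and paths are placeholders): a PersistentVolume plus a matching claim that any node can mount.

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: shared-data
    spec:
      capacity:
        storage: 10Gi
      accessModes: ["ReadWriteMany"]
      nfs:
        server: nfs.internal.example.com
        path: /exports/shared-data
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: shared-data-claim
    spec:
      accessModes: ["ReadWriteMany"]
      resources:
        requests:
          storage: 10Gi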

While recently containerizing a data intelligence and analytics application on Kubernetes, we used NFS storage to decouple stateful components like RabbitMQ and Elasticsearch from the host infrastructure. We could then run these components on any node in the cluster, making the deployment scalable and easy to manage.

09

Declare resource requirements

All container deployments should declare resource requirements like storage, compute, memory, and network to ensure that resources are not consumed without bound. It is equally important that your applications stay within these declared requirements, as workloads that do are less likely to be terminated or migrated when resource starvation occurs. Declaring resources also helps DevOps teams set monitoring alerts and make informed scaling decisions. This is especially important in the cloud, where resources can scale automatically, driving up costs. For effective and optimized scheduling, ensure that your containers clearly declare resource requests and limits.
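
In Kubernetes terms, this means setting requests and limits on every container; the numbers below are purely illustrative:

    apiVersion: v1
    kind: Pod
    metadata:
      name: analytics-worker
    spec:
      containers:
        - name: worker
          image: my-registry/analytics-worker:1.0.0
          resources:
            # The scheduler uses requests to decide where the pod fits
            requests:
              cpu: "250m"
              memory: "256Mi"
            # The kubelet enforces limits at run time
            limits:
              cpu: "500m"
              memory: "512Mi"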

For the data analytics application mentioned above, we used Kubernetes deployment artifacts at the pod and namespace level to restrict memory and CPU core usage. Declaring requirements like CPU usage helped us effectively procure resources from the cluster nodes and ensure hassle-free scheduling.

10

Automate the cluster creation process

It is usually easy for small teams to build containers for simple applications and deploy them on the cloud or on-premises. But when multiple teams work on complex applications, management can become an issue. To isolate resources across applications, we recommend using several smaller clusters rather than one large cluster, which keeps application and infrastructure management simple and hassle-free. Automating the cluster creation process makes it seamless to spin up these smaller clusters. We recommend leveraging single-click deployment scripts that set up Amazon EKS or AKS clusters with best practices for availability, monitoring, and security. These scripts can also be integrated with automation tools like Jenkins, enabling your DevOps teams to quickly create clusters for application onboarding.
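
As a sketch of what such a script might wrap, eksctl can stand up an EKS cluster in a single command; the cluster name, region, and node sizing below are placeholders:

    # One-command EKS cluster creation, suitable for wrapping in a
    # Jenkins job or single-click deployment script
    eksctl create cluster \
      --name app-onboarding-dev \
      --region us-east-1 \
      --nodes 3 \
      --node-type m5.large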

Leveraging automated Kubernetes scripts, we helped a US-based Fortune 100 credit card company save ~50 hours of setup time per cluster.

Containerization is rapidly gaining traction as it helps enterprises shorten their application development and release cycle, while reducing hardware expenses. Impetus Technologies offers ready-to-use enablers and innovative automation levers to accelerate, simplify, and de-risk your containerization initiatives in a cloud-first world.
