DIGIT Urban
v2.0

​All content on this page by eGov Foundation is licensed under a Creative Commons Attribution 4.0 International License.


Why Kubernetes for DIGIT


Last updated 4 years ago


Overview

This page explains why Kubernetes is required. It takes a deep dive into the key benefits of using Kubernetes to run a large containerized platform like DIGIT in production environments.

The Big Why

The Kubernetes project started in 2014, building on more than a decade of experience of running production workloads at Google. Kubernetes has since become the de facto standard for deploying containerized applications at scale in private, public and hybrid cloud environments. The largest public cloud platforms, AWS, Google Cloud, Azure, IBM Cloud and Oracle Cloud, now provide managed services for Kubernetes. A few years back, RedHat, Mesosphere, Pivotal, VMware and Nutanix redesigned their platforms around Kubernetes and collaborated with the Kubernetes community to build next-generation container platforms that incorporate key Kubernetes features such as container grouping, overlay networking, layer 4 routing and secrets. Today many organizations and technology providers are adopting Kubernetes at a rapid pace.

Kubernetes Architecture

One of the fundamental design decisions behind Kubernetes is its ability to deploy existing applications that run on VMs without any changes to the application code. At a high level, any application that runs on VMs can be deployed on Kubernetes by simply containerizing its components. This is achieved through its core features: container grouping, container orchestration, overlay networking, container-to-container routing with a layer 4 virtual IP based routing system, service discovery, support for running daemons, support for deploying stateful application components and, most importantly, the ability to extend the container orchestrator to support complex orchestration requirements.

At a very high level, Kubernetes provides a set of dynamically scalable hosts for running workloads in containers, and uses a set of management hosts called masters to provide an API for managing the entire container infrastructure.

That's just a glimpse of what Kubernetes provides out of the box. The next few sections go through its core features and explain how they help applications get deployed quickly.

Application Deployment Model

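A containerized application can be deployed on Kubernetes using a deployment definition by executing a simple CLI command as follows (note that in kubectl v1.18 and later, kubectl run creates a single pod, and kubectl create deployment is the command that creates a Deployment):

kubectl run <application-name> --image=<container-image> --port=<port-no>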
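The same deployment can also be described declaratively. The following is a minimal sketch, assuming a hypothetical application called demo-app, a hypothetical image example.org/demo-app:1.0.0, and a /health endpoint on port 8080; none of these names come from DIGIT itself.

```shell
# Write a hypothetical Deployment manifest (all names and values are placeholders).
cat > demo-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 3                     # number of pod replicas to keep running
  selector:
    matchLabels:
      app: demo-app               # must match the pod template labels below
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: example.org/demo-app:1.0.0   # hypothetical image
          ports:
            - containerPort: 8080
          readinessProbe:         # pod receives traffic only after this passes
            httpGet:
              path: /health
              port: 8080
          livenessProbe:          # container is restarted if this keeps failing
            httpGet:
              path: /health
              port: 8080
EOF

# With access to a cluster, the manifest could then be applied with:
#   kubectl apply -f demo-deployment.yaml
```

Keeping the definition in a file lets it be version controlled and reviewed, which the one-line CLI form does not.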

Service Discovery & Load Balancing

One of the key features of Kubernetes is its service discovery and internal routing model, provided using SkyDNS and a layer 4 virtual IP based routing system. These features provide internal routing for application requests using services. A set of pods created via a ReplicaSet can be load balanced using a service within the cluster network. Services are connected to pods using selector labels. Each service is assigned a unique IP address and a hostname derived from its name, and routes requests among the pods in a round-robin manner. Services can also provide IP-hash based routing for applications that require session affinity. A service can define a collection of ports, and the properties defined for the service apply to all of its ports in the same way. Therefore, if session affinity is needed only for one port while all the other ports need round-robin routing, multiple services may need to be used.
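As an illustration of the selector and session affinity settings described above, here is a minimal sketch of a Service manifest. The service name demo-app, the label app: demo-app and the port numbers are assumptions made for the example.

```shell
# Write a hypothetical Service manifest (names and ports are placeholders).
cat > demo-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: demo-app                  # hostname inside the cluster is derived from this name
spec:
  type: ClusterIP                 # internal virtual IP only
  selector:
    app: demo-app                 # routes to pods carrying this label
  sessionAffinity: ClientIP       # applies to every port of this service
  ports:
    - name: http
      port: 80                    # port exposed on the service IP
      targetPort: 8080            # port the container listens on
EOF

# With access to a cluster:
#   kubectl apply -f demo-service.yaml
```

Because sessionAffinity is set once per service, a port that should not use affinity would need its own Service with the same selector.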

How Services Internally Work

Kubernetes services are implemented using a component called kube-proxy. A kube-proxy instance runs on each node and provides three proxy modes: userspace, iptables and IPVS. The current default is iptables.

In the first proxy mode, userspace, kube-proxy itself acts as a proxy server and delegates requests accepted by an iptables rule to the backend pods. In this mode, kube-proxy operates in userspace and adds an additional hop to the message flow.

In the second proxy mode, iptables, kube-proxy creates a collection of iptables rules that forward incoming client requests directly to the ports of the backend pods at the network layer, without adding an additional hop in the middle. This mode is much faster than the first because it operates in kernel space and does not place an additional proxy server in the request path.

Internal/External Routing Separation

Kubernetes services can be exposed to external networks in two main ways. The first is using node ports, which expose dynamic ports on the nodes that forward traffic to the service ports. The second is using a load balancer configured via an ingress controller, which delegates requests to the services by connecting to the same overlay network. An ingress controller is a background process, typically running in a container, that listens to the Kubernetes API and dynamically configures and reloads a given load balancer according to a given set of ingresses. An ingress defines routing rules based on hostnames and context paths, using services as the backends.

Once an application is deployed on Kubernetes using the kubectl run command, it can be exposed to the external network via a load balancer as follows:

kubectl expose deployment <application-name> --type=LoadBalancer --name=<service-name>

The above command creates a service of type LoadBalancer and maps it to the pods using the same selector label that was set when the pods were created. As a result, depending on how the Kubernetes cluster has been configured, a load balancer is created on the underlying infrastructure to route requests to the given pods, either via the service or directly.
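The hostname and context path routing rules of an ingress can be sketched as follows. This is an illustrative example only: the host demo.example.org, the path /api and the backend service demo-app are hypothetical, and an ingress controller must already be running in the cluster for the rules to take effect.

```shell
# Write a hypothetical Ingress manifest (host, path and service are placeholders).
cat > demo-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-app
spec:
  rules:
    - host: demo.example.org      # route by hostname
      http:
        paths:
          - path: /api            # route by context path
            pathType: Prefix
            backend:
              service:
                name: demo-app    # existing Service that fronts the pods
                port:
                  number: 80
EOF

# With access to a cluster:
#   kubectl apply -f demo-ingress.yaml
```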

Usage of Persistent Volumes

Applications that need to persist data on the filesystem can use volumes to mount storage devices into ephemeral containers, similar to how volumes are used with VMs. Kubernetes loosely couples physical storage devices from containers by introducing an intermediate resource called a persistent volume claim (PVC). A PVC defines the disk size and access mode (ReadWriteOnce, ReadOnlyMany or ReadWriteMany) and dynamically links a storage device to a volume defined against a pod. The binding can be done either statically, using pre-created persistent volumes (PVs), or dynamically, using a persistent storage provider. In both approaches, a volume is linked to a PV one to one and, depending on the configuration, the data is preserved even if the pods are terminated. Depending on the access mode, multiple pods may be able to connect to the same disk and read from or write to it.
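A minimal sketch of this claim-and-mount flow is shown below. The claim name demo-data, the 5Gi size, the image and the mount path are all assumptions for illustration; dynamic binding additionally assumes the cluster has a default storage class.

```shell
# Write a hypothetical PVC and a pod that mounts it (all names are placeholders).
cat > demo-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce               # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi                # requested disk size
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: demo-app
      image: example.org/demo-app:1.0.0   # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /var/lib/demo        # where the disk appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-data              # links the pod volume to the claim
EOF

# With access to a cluster:
#   kubectl apply -f demo-pvc.yaml
```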

Deploying Daemons on Nodes

Kubernetes provides a resource called DaemonSets for running a copy of a pod in each Kubernetes node as a daemon. Some of the use cases of DaemonSets are as follows:

  • A cluster storage daemon such as glusterd or ceph to be deployed on each node for providing persistent storage.

  • A log collection daemon such as fluentd or logstash to be run on every node for collecting container and Kubernetes component logs.

  • An ingress controller pod to be run on a collection of nodes for providing external routing.
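The log collection use case above could look roughly like the following DaemonSet. This is a sketch under assumptions: the name log-collector and the image example.org/log-collector:1.0.0 are hypothetical stand-ins for a real collector such as fluentd, and the host log path may differ per node setup.

```shell
# Write a hypothetical DaemonSet manifest for a per-node log collector.
cat > demo-daemonset.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: log-collector
          image: example.org/log-collector:1.0.0   # hypothetical image
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log        # node directory holding container logs
EOF

# With access to a cluster:
#   kubectl apply -f demo-daemonset.yaml
```

Unlike a Deployment, a DaemonSet has no replicas field; the number of pods follows the number of matching nodes.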

Deploying Stateful Distributed Systems

One of the most difficult tasks in containerizing applications is designing the deployment architecture of stateful distributed components. Stateless components can be containerized easily because they usually have no predefined startup sequence, clustering requirements, point-to-point TCP connections, unique network identifiers, or graceful startup and termination requirements. Systems such as databases, big data analysis systems, distributed key/value stores and message brokers may have complex distributed architectures that require the above features. Kubernetes introduced the StatefulSets resource to support such requirements.

At a high level, StatefulSets are similar to ReplicaSets, except that they can control the startup sequence of pods and uniquely identify each pod to preserve its state, while providing the following characteristics:

  • Stable, unique network identifiers.

  • Stable, persistent storage.

  • Ordered, graceful deployment and scaling.

  • Ordered, graceful deletion and termination.

  • Ordered, automated rolling updates.
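The characteristics listed above can be sketched with a headless service plus a StatefulSet. Everything named here (demo-db, the image, port 5432, the 10Gi size) is a hypothetical placeholder, not a DIGIT component.

```shell
# Write a hypothetical headless Service and StatefulSet (all names are placeholders).
cat > demo-statefulset.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: demo-db
spec:
  clusterIP: None                 # headless: gives each pod its own DNS record
  selector:
    app: demo-db
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo-db
spec:
  serviceName: demo-db            # ties pod identities to the headless service
  replicas: 3                     # pods are created in order: demo-db-0, -1, -2
  selector:
    matchLabels:
      app: demo-db
  template:
    metadata:
      labels:
        app: demo-db
    spec:
      containers:
        - name: demo-db
          image: example.org/demo-db:1.0.0   # hypothetical image
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/demo-db
  volumeClaimTemplates:           # one PVC per pod, kept across rescheduling
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
EOF

# With access to a cluster:
#   kubectl apply -f demo-statefulset.yaml
```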

Running Background Jobs

Deploying Databases

Configurations Management

Containers generally use environment variables for parameterizing their runtime configurations. However, typical enterprise applications use a considerable number of configuration files for providing the static configuration required for a given deployment. Kubernetes manages such configuration files using a simple resource called ConfigMaps, without bundling them into the container images. ConfigMaps can be created from directories, files or literal values using the following CLI command:

kubectl create configmap <map-name> <data-source>
# map-name: name of the config map
# data-source: directory, file or literal value

Once a ConfigMap is created, it can be mounted to a pod using a volume mount. With this loosely coupled architecture, the configuration of an already running system can be updated seamlessly by updating the relevant ConfigMap and executing a rolling update, which is explained in the Rolling Out Updates section. It is important to note that ConfigMaps currently do not support nested folders; therefore, if configuration files are spread across a nested directory structure, a ConfigMap needs to be created for each directory level.
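To make the volume mount step concrete, here is a minimal sketch of a ConfigMap and a pod that mounts it. The ConfigMap name demo-config, the file application.properties, its contents and the mount path /etc/demo are assumptions chosen for the example.

```shell
# Write a hypothetical ConfigMap and a pod that mounts it as files.
cat > demo-configmap.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config
data:
  application.properties: |       # each key becomes a file in the mounted directory
    server.port=8080
    log.level=INFO
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: demo-app
      image: example.org/demo-app:1.0.0   # hypothetical image
      volumeMounts:
        - name: config
          mountPath: /etc/demo            # application.properties appears here
  volumes:
    - name: config
      configMap:
        name: demo-config
EOF

# With access to a cluster:
#   kubectl apply -f demo-configmap.yaml
```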

Credentials Management

Similar to ConfigMaps, Kubernetes provides another valuable resource called Secrets for managing sensitive information such as passwords, OAuth tokens and ssh keys. Otherwise, updating that information on an already running system might require rebuilding the container images.

A secret for managing basic auth credentials can be created as follows:

# write credentials to two files
$ echo -n 'admin' > ./username.txt
$ echo -n '1f2d1e2e67df' > ./password.txt
# create a secret
$ kubectl create secret generic app-credentials --from-file=./username.txt --from-file=./password.txt

Once a secret is created, it can be read by a pod either using environment variables or volume mounts. Similarly, any other type of sensitive information can be injected into pods using the same approach.
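The environment variable approach can be sketched as below, reusing the app-credentials secret created above. When a secret is created with --from-file, the file names (username.txt and password.txt) become the keys. The pod name, image and variable names APP_USERNAME and APP_PASSWORD are hypothetical.

```shell
# Write a hypothetical pod manifest that reads the app-credentials secret.
cat > demo-secret-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: demo-app
      image: example.org/demo-app:1.0.0   # hypothetical image
      env:
        - name: APP_USERNAME              # hypothetical variable name
          valueFrom:
            secretKeyRef:
              name: app-credentials
              key: username.txt           # key is the source file name
        - name: APP_PASSWORD              # hypothetical variable name
          valueFrom:
            secretKeyRef:
              name: app-credentials
              key: password.txt
EOF

# With access to a cluster:
#   kubectl apply -f demo-secret-pod.yaml
```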

Rolling Out Updates

Figure 9 illustrates how application updates can be rolled out for an already running application using the rolling update method, without any system downtime. This is another valuable feature of Kubernetes, which allows applications to roll out security updates and backwards compatible changes seamlessly and with little effort. If the changes are not backwards compatible, a manual blue/green deployment might need to be executed using a separate deployment definition.

This approach allows a rollout to be executed for updating a container image using a simple CLI command:

$ kubectl set image deployment/<application-name> <container-name>=<container-image-name>:<new-version>

Once a rollout is executed, the status of the rollout process can be checked as follows:

$ kubectl rollout status deployment/<application-name>
Waiting for rollout to finish: 2 out of 3 new replicas have been updated...
deployment "<application-name>" successfully rolled out

Using the same CLI command, kubectl set image deployment, an update can be rolled back by setting the previous image version. Alternatively, kubectl rollout undo reverts a deployment to its previous revision.
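How aggressively pods are replaced during a rollout is controlled by the deployment's update strategy. The sketch below is illustrative: the deployment name demo-app, the image and the maxSurge and maxUnavailable values are assumptions, not recommended settings.

```shell
# Write a hypothetical Deployment manifest with an explicit rolling update strategy.
cat > demo-rolling-update.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                 # at most one extra pod during the rollout
      maxUnavailable: 0           # never drop below the desired replica count
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: example.org/demo-app:1.1.0   # hypothetical new version
EOF

# With access to a cluster:
#   kubectl apply -f demo-rolling-update.yaml
#   kubectl rollout status deployment/demo-app
#   kubectl rollout undo deployment/demo-app     # revert to the previous revision
```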

Autoscaling

Figure 10: Kubernetes Pod Autoscaling Model

Kubernetes allows pods to be manually scaled either using ReplicaSets or Deployments. The following CLI command can be used for this purpose:

kubectl scale --replicas=<desired-instance-count> deployment/<application-name>

Package Management

Figure 11: Helm and Kubeapps Hub

The Kubernetes community initiated a separate project, called Helm, to implement a package manager for Kubernetes. Helm allows Kubernetes resources such as deployments, services, config maps and ingresses to be templated and packaged using a resource called a chart, and allows them to be configured at installation time using input parameters. More importantly, it allows existing charts to be reused as dependencies when implementing installation packages. Helm repositories can be hosted in public and private cloud environments for managing application charts. Helm provides a CLI for installing applications from a given Helm repository into a selected Kubernetes environment.
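A chart's templating and install-time parameters can be sketched with a tiny chart skeleton. This is a minimal illustration assuming Helm 3; the chart name demo-chart, the values and the image are hypothetical.

```shell
# Create a hypothetical minimal Helm chart (Helm 3 layout; all names are placeholders).
mkdir -p demo-chart/templates

cat > demo-chart/Chart.yaml <<'EOF'
apiVersion: v2
name: demo-chart
description: Hypothetical chart used for illustration
version: 0.1.0
appVersion: "1.0.0"
EOF

# Default input parameters, overridable at installation time.
cat > demo-chart/values.yaml <<'EOF'
replicaCount: 2
image:
  repository: example.org/demo-app
  tag: "1.0.0"
EOF

# A templated Deployment; values are substituted when the chart is installed.
cat > demo-chart/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: app
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
EOF

# With Helm 3 and cluster access, the chart could be installed with an override:
#   helm install demo ./demo-chart --set replicaCount=3
```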

Conclusion

Kubernetes was designed with over a decade of experience of running containerized applications at scale at Google. It has already been adopted by the largest public cloud vendors and technology providers, and is being embraced by most software vendors and enterprises as this article is written. It even led to the inception of the Cloud Native Computing Foundation (CNCF) in 2015, was the first project to graduate under CNCF, and started streamlining the container ecosystem together with other container-related projects such as CNI, containerd, Envoy, Fluentd, gRPC, Jaeger, Linkerd, Prometheus, rkt and Vitess. The key reasons for its popularity and endorsement at this level are likely its sound design, its collaborations with industry leaders, its open-source model, and its openness to ideas and contributions.

References

Figure 2 illustrates the high-level application deployment model on Kubernetes. It uses a resource called a ReplicaSet for orchestrating containers. A ReplicaSet can be considered a YAML or JSON based metadata file that defines the container images, ports, number of replicas, readiness health checks, liveness health checks, environment variables, volume mounts, security rules and other settings required for creating and managing the containers. Containers are always created on Kubernetes as groups called Pods, which are again a Kubernetes metadata definition, or resource. Each pod allows the file system, network interfaces, operating system users and so on to be shared among its containers using Linux namespaces, cgroups and other kernel features. ReplicaSets can be managed by another high-level resource called Deployments, which provides features for rolling out updates and handling their rollbacks.

The third proxy mode, IPVS, was added in Kubernetes v1.8. It is similar to the second proxy mode, but uses an IPVS based virtual server for routing requests instead of iptables rules. IPVS is a transport layer load balancing feature of the Linux kernel, built on Netfilter, that provides a collection of load balancing algorithms. The main reason for using IPVS over iptables is the performance overhead of syncing proxy rules with iptables. When thousands of services are created, updating iptables rules takes a considerable amount of time, compared to a few milliseconds with IPVS. Moreover, IPVS uses a hash table for looking up proxy rules, whereas iptables scans its rules sequentially. More information on the introduction of the IPVS proxy mode can be found in the "Scaling Kubernetes to Support 50,000 Services" presentation given by Huawei at KubeCon 2017.
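The proxy mode is selected in the kube-proxy configuration. The following is a minimal sketch of such a configuration file that selects IPVS with a round-robin scheduler; how the file reaches kube-proxy (for example through its --config flag or a cluster installer) depends on how the cluster was set up, and the file name is an assumption.

```shell
# Write a hypothetical kube-proxy configuration that selects the IPVS proxy mode.
cat > kube-proxy-config.yaml <<'EOF'
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"                      # other values include "iptables"
ipvs:
  scheduler: "rr"                 # round-robin load balancing algorithm
EOF

# On a node where IPVS is active, the virtual servers can usually be listed with:
#   ipvsadm -Ln
```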

Disks that support ReadWriteOnce can only be connected to a single pod and cannot be shared among multiple pods at the same time. Disks that support ReadOnlyMany can be shared among multiple pods at the same time in read-only mode. In contrast, as the name implies, disks with ReadWriteMany support can be connected to multiple pods for sharing data in read and write mode. Kubernetes provides a collection of volume plugins for supporting storage services available on public cloud platforms, such as AWS EBS, GCE Persistent Disk, Azure File and Azure Disk, as well as many other well-known storage systems such as NFS, Glusterfs, iSCSI and Cinder.

Another DaemonSet use case is a node monitoring daemon, such as Prometheus Node Exporter, run on every node for monitoring the container hosts.

For StatefulSets, stable refers to preserving the network identifiers and persistent storage across pod rescheduling. Unique network identifiers are provided by using headless services, as shown in Figure 8. Kubernetes has provided examples of StatefulSets for deploying Cassandra and Zookeeper in a distributed manner.

In addition to ReplicaSets and StatefulSets, Kubernetes provides two additional controllers for running workloads in the background, called Jobs and CronJobs. The difference between them is that a Job executes once and terminates, whereas a CronJob is executed periodically at a given time interval, similar to standard Linux cron jobs.
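A periodic background workload can be sketched with a CronJob as follows. The job name nightly-report, the image and the schedule are hypothetical; the batch/v1 API version shown here assumes a reasonably recent cluster.

```shell
# Write a hypothetical CronJob manifest (name, image and schedule are placeholders).
cat > demo-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"           # standard cron syntax: every day at 02:00
  jobTemplate:                    # each run creates a Job from this template
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: example.org/report-job:1.0.0   # hypothetical image
EOF

# With access to a cluster:
#   kubectl apply -f demo-cronjob.yaml
```

A one-off Job uses the same pod template under kind: Job, without the schedule and jobTemplate wrapper.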

Deploying databases on container platforms for production usage is more difficult than deploying applications, due to their requirements for clustering, point-to-point connections, replication, sharding, backup management and so on. As mentioned previously, StatefulSets were designed specifically to support such requirements, and there are a couple of options for running PostgreSQL and MongoDB clusters on Kubernetes today. Vitess, YouTube's database clustering system, which is now a CNCF project, would be a good option for running MySQL at scale on Kubernetes with sharding. That said, those options are still at very early stages. If an existing production-grade database system is available on the given infrastructure, such as RDS on AWS, Cloud SQL on GCP, or an on-premise database cluster, it might be better to choose one of those, considering the installation complexity and maintenance overhead.

Figure 9: Kubernetes Rolling Update Process

As shown in Figure 10, manual scaling can be extended by adding another resource called the Horizontal Pod Autoscaler (HPA) against a deployment, for dynamically scaling the pods based on their actual resource usage. The HPA monitors the resource usage of each pod via the resource metrics API and tells the deployment to change the replica count of the ReplicaSet accordingly. Kubernetes uses an upscale delay and a downscale delay to avoid thrashing, which could otherwise occur due to frequent resource usage fluctuations. At the time of writing, HPA only provides built-in support for scaling based on CPU usage. If needed, custom metrics can also be plugged in via the Custom Metrics API, depending on the nature of the application.
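A CPU based autoscaling rule can be sketched as below. The target deployment demo-app and the replica bounds and utilization threshold are assumptions for illustration; the autoscaling/v2 API version assumes a recent cluster, and older clusters expose earlier versions of this API.

```shell
# Write a hypothetical HorizontalPodAutoscaler manifest targeting a Deployment.
cat > demo-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app
spec:
  scaleTargetRef:                 # the deployment whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # scale out when average CPU exceeds 70%
EOF

# With access to a cluster:
#   kubectl apply -f demo-hpa.yaml
```

CPU utilization is computed against each container's CPU request, so the target deployment needs resource requests defined for this rule to work.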


A wide range of stable Helm charts for well-known software applications can be found in its Github repository and also in the central Helm server, Kubeapps Hub.

[1] What is Kubernetes: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/

[2] Borg, Omega and Kubernetes: https://ai.google/research/pubs/pub44843

[3] Kubernetes Components: https://kubernetes.io/docs/concepts/overview/components/

[4] Kubernetes Services: https://kubernetes.io/docs/concepts/services-networking/service/

[5] IPVS (IP Virtual Server): http://www.linuxvirtualserver.org/software/ipvs.html

[6] Introduction of IPVS Proxy Mode: https://github.com/kubernetes/kubernetes/issues/44063

[7] Kubernetes Persistent Volumes: https://kubernetes.io/docs/concepts/storage/persistent-volumes/

[8] Kubernetes Configuration Best Practices: https://kubernetes.io/docs/concepts/configuration/overview/

[9] Custom Resources & Custom Controllers: https://kubernetes.io/docs/concepts/api-extension/custom-resources/

[10] Understanding Vitess: https://vitess.io/overview/

[11] Skaffold, CI/CD for Kubernetes: https://github.com/GoogleContainerTools/skaffold

[12] Kaniko, Build Container Images in Kubernetes: https://github.com/GoogleContainerTools/kaniko

[13] Apache Spark 2.3 with Native Kubernetes Support: https://kubernetes.io/blog/2018/03/apache-spark-23-with-native-kubernetes/

[14] Deploying Apache Kafka using StatefulSets: https://github.com/kubernetes/contrib/tree/master/statefulsets/kafka

[15] Deploying Apache Zookeeper using StatefulSets: https://github.com/kubernetes/contrib/tree/master/statefulsets/zookeeper
Figure 1: Kubernetes Architecture
Figure 2: Kubernetes Application Deployment Model
Figure 3: Kubernetes Service Discovery & Load Balancing Model
Figure 4: Kubernetes Service Proxy Modes (Userspace, iptables, & ipvs)
Figure 5: Kubernetes Internal/External Routing Separation
Figure 6: Kubernetes Persistent Volume Binding Models
Figure 7: Deploying Daemons on Kubernetes Nodes
Figure 8: Stateful Component Deployment Model