Exercise 6: Scaling

In this exercise, you’ll implement and test various scaling mechanisms in your AKS cluster, including the Horizontal Pod Autoscaler (HPA) and the Cluster Autoscaler, and learn how to optimize resource usage and performance.

Task 1: Implementing Horizontal Pod Autoscaler (HPA)

The Horizontal Pod Autoscaler automatically scales the number of pods in a deployment or stateful set based on observed CPU utilization or other metrics.
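
Under the hood, the HPA calculates the desired replica count as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). For example, three replicas averaging 100% CPU against a 50% target scale out to ceil(3 × 100 / 50) = 6 replicas.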

  1. Deploy a PHP Apache server using the hpa-example image, which serves a CPU-intensive endpoint:

    kubectl create deployment php-apache --image=k8s.gcr.io/hpa-example
    kubectl set resources deployment php-apache --requests=cpu=200m --limits=cpu=500m
    kubectl expose deployment php-apache --port=80
  2. Create a Horizontal Pod Autoscaler:

    kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
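
    The kubectl autoscale command is a shorthand; if you prefer to keep the autoscaler in source control, a roughly equivalent declarative manifest (a sketch using the autoscaling/v2 API) looks like this and can be applied with kubectl apply -f:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: php-apache
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: php-apache
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50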
  3. Check the status of the HPA:

    kubectl get hpa
  4. You should see the php-apache HPA listed with its CPU target of 50%, the current utilization, and the configured minimum and maximum replica counts.

    This indicates that the HPA is monitoring the php-apache deployment and will scale it based on CPU usage.

  5. Generate load on the service:

    kubectl run load-generator --image=busybox --rm -it -- /bin/sh -c "while true; do wget -q -O- http://php-apache; done"
  6. Open another terminal and watch the HPA scale the pods:

    kubectl get hpa php-apache --watch
  7. You should see the CPU load increase and the number of replicas increase accordingly.

  8. To stop the load, go back to the terminal running the load generator and press Ctrl+C.

  9. Watch the HPA scale back down over the next few minutes:

    kubectl get hpa php-apache --watch
    kubectl get pods
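
    Scale-down is intentionally slower than scale-up: by default the HPA waits through a five-minute stabilization window before reducing replicas, which is why it takes several minutes for the deployment to return to a single pod. With the autoscaling/v2 API this can be tuned through spec.behavior.scaleDown.stabilizationWindowSeconds.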

Task 2: Implementing Cluster Autoscaler with a Dedicated Node Pool

Our current system node pool is set to manual scaling, so we’ll create a dedicated node pool with autoscaling enabled.

  1. Create a new node pool with autoscaling enabled:

    PowerShell:

    $NODEPOOL_NAME = "scalepool"
    
    az aks nodepool add `
      --resource-group $RESOURCE_GROUP `
      --cluster-name $AKS_NAME `
      --name $NODEPOOL_NAME `
      --node-count 1 `
      --enable-cluster-autoscaler `
      --min-count 1 `
      --max-count 3 `
      --node-vm-size $VM_SKU `
      --labels purpose=autoscale

    Bash:

    NODEPOOL_NAME="scalepool"
    
    az aks nodepool add \
      --resource-group $RESOURCE_GROUP \
      --cluster-name $AKS_NAME \
      --name $NODEPOOL_NAME \
      --node-count 1 \
      --enable-cluster-autoscaler \
      --min-count 1 \
      --max-count 3 \
      --node-vm-size $VM_SKU \
      --labels purpose=autoscale
  2. Wait for the node pool to be ready:

    PowerShell:

    az aks nodepool show `
      --resource-group $RESOURCE_GROUP `
      --cluster-name $AKS_NAME `
      --name $NODEPOOL_NAME `
      --query provisioningState -o tsv

    Bash:

    az aks nodepool show \
      --resource-group $RESOURCE_GROUP \
      --cluster-name $AKS_NAME \
      --name $NODEPOOL_NAME \
      --query provisioningState -o tsv

    Wait until the output shows “Succeeded” before proceeding.
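
    You can also confirm the autoscaler settings on the pool. The property names in the query below come from the node pool JSON returned by the CLI (a sketch; the single line works in both PowerShell and Bash):

    az aks nodepool show --resource-group $RESOURCE_GROUP --cluster-name $AKS_NAME --name $NODEPOOL_NAME --query "{autoscaling:enableAutoScaling, min:minCount, max:maxCount}" -o table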

  3. Validate that the new node pool shows up, with a single node currently deployed:

    kubectl get nodes
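
    If the node list is long, you can filter to just the new pool using the label applied when it was created:

    kubectl get nodes -l purpose=autoscale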
  4. Deploy a resource-intensive application that will target our new node pool. The pause container itself does no work; its CPU and memory requests are what reserve node capacity and drive the autoscaler:

    PowerShell:

    $resourceConsumerYaml = @"
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: resource-consumer
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: resource-consumer
      template:
        metadata:
          labels:
            app: resource-consumer
        spec:
          nodeSelector:
            purpose: autoscale
          containers:
          - name: resource-consumer
            image: k8sonazureworkshoppublic.azurecr.io/k8s.gcr.io/pause:3.1
            resources:
              requests:
                cpu: 500m
                memory: 512Mi
              limits:
                cpu: 1000m
                memory: 1Gi
    "@
    
    $resourceConsumerYaml | kubectl apply -f -

    Bash:

    cat << EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: resource-consumer
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: resource-consumer
      template:
        metadata:
          labels:
            app: resource-consumer
        spec:
          nodeSelector:
            purpose: autoscale
          containers:
          - name: resource-consumer
            image: k8sonazureworkshoppublic.azurecr.io/k8s.gcr.io/pause:3.1
            resources:
              requests:
                cpu: 500m
                memory: 512Mi
              limits:
                cpu: 1000m
                memory: 1Gi
    EOF
  5. Verify that the pods are running on our new node pool:

    kubectl get pods -l app=resource-consumer -o wide

    You should see the pods scheduled on nodes from the autoscale node pool.

  6. Scale the deployment to trigger node autoscaling:

    kubectl scale deployment resource-consumer --replicas=10
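
    Ten replicas each requesting 500m CPU add up to 5 vCPU (and 5 GiB of memory) in requests. Assuming the VM size chosen for $VM_SKU has only a couple of vCPUs available for workloads, a single node cannot fit all of the pods, so some stay Pending and the Cluster Autoscaler responds by adding nodes, up to the configured maximum of 3.
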
  7. Monitor the nodes to see the Cluster Autoscaler add new nodes:

    kubectl get nodes -l purpose=autoscale --watch

    It may take 3-5 minutes for new nodes to be added. You should eventually see additional nodes with the label purpose=autoscale.
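
    You can also follow the autoscaler’s decisions through Kubernetes events; pods that could not be scheduled and triggered a scale-up carry a TriggeredScaleUp event (sorting by timestamp keeps the most recent entries at the bottom):

    kubectl get events --sort-by=.lastTimestamp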

  8. You can also view the status of the node pool in the Azure portal under your AKS cluster’s “Node pools” section. It should show the target node count increasing and that it’s scaling up.

  9. Check the status of your pods and which nodes they’re running on:

    kubectl get pods -l app=resource-consumer -o wide
  10. Scale down the deployment to observe the Cluster Autoscaler removing nodes:

    kubectl scale deployment resource-consumer --replicas=1
  11. The nodes that are no longer needed should be removed automatically by the Cluster Autoscaler. This can take 10-15 minutes due to the default node deletion delay. You can monitor the process:

    kubectl get nodes -l purpose=autoscale
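
    If the 10-15 minute wait is too slow for a workshop setting, the scale-down timing can be tightened at the cluster level via the cluster autoscaler profile. A sketch (one line, so it works in either shell; adjust the values to taste):

    az aks update --resource-group $RESOURCE_GROUP --name $AKS_NAME --cluster-autoscaler-profile scale-down-delay-after-add=2m scale-down-unneeded-time=2m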

Task 3: Clean Up

  1. Delete the resource consumer deployment:

    kubectl delete deployment resource-consumer
  2. Delete the autoscale node pool:

    PowerShell:

    az aks nodepool delete `
      --resource-group $RESOURCE_GROUP `
      --cluster-name $AKS_NAME `
      --name $NODEPOOL_NAME

    Bash:

    az aks nodepool delete \
      --resource-group $RESOURCE_GROUP \
      --cluster-name $AKS_NAME \
      --name $NODEPOOL_NAME
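  3. Optionally, remove the Task 1 resources as well (the Service and HPA created by kubectl expose and kubectl autoscale share the php-apache name):

    kubectl delete hpa php-apache
    kubectl delete service php-apache
    kubectl delete deployment php-apache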