Collecting Kubernetes pod metrics (or: Heapster is dead!)


Monitoring the nodes and pods of your Kubernetes cluster isn’t an easy task. There are multiple tools and ways of doing so, and, as with everything in k8s, best practices and common methods change so rapidly that it’s almost impossible to keep up.

Take Heapster, for example: a super popular add-on for gathering metrics from your pods and the go-to tool up until a year ago, it is now DEPRECATED. There are many good reasons for discontinuing Heapster, but that doesn’t make it any easier for us DevOps engineers, who have to keep track of and constantly reconsider our tooling.

Fortunately, it seems like we’re about to get some stability with the introduction of the Resource Metrics API. Starting with Kubernetes 1.8, resource usage metrics, such as container CPU and memory usage, are available through the Metrics API. These metrics can be served by the Metrics Server or by third-party tools such as Prometheus. Users can fetch the data with commands such as `kubectl top`, but much more importantly, many services can use this data to make smart decisions, the Horizontal Pod Autoscaler (HPA) for example.

Collecting k8s metrics

In this blog, we’ll set up the Metrics Server, gather CPU metrics from our pods, and deploy an HPA that will scale based on these metrics. Under the hood, we’ll use Spotinst Ocean to automatically launch the best instance to fit the needs of each pending pod based on its resource requirements.

With Spotinst Ocean, you can create a Kubernetes cluster with multiple machine types and sizes out of the box. Ocean uses Tetris scaling and bin-packing algorithms that are driven by container resource requirements rather than node utilization, thus allowing the cluster to be highly utilized. On top of that, Ocean keeps a dynamic “headroom” for the cluster that ensures capacity is available for new workloads at any time. This way, you don’t have to wait for a new instance to launch when creating a new deployment.

Step-by-step guide

(This guide assumes you have a Kubernetes cluster running version 1.8+ and access to it via kubectl.)

Let’s get started!

First, install the Kubernetes Metrics Server:

Clone the Metrics Server git repository:

git clone https://github.com/kubernetes-incubator/metrics-server.git

Deploy the metrics server:

kubectl apply -f metrics-server/deploy/1.8+/

Let’s see some metrics:
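For reference, here is a quick sketch of the commands behind this step, assuming the Metrics Server pods have finished starting and kubectl points at the cluster (it can take a minute or two before data appears):

```shell
# Node-level CPU and memory usage, served by the Metrics Server:
kubectl top nodes

# Per-pod usage in the default namespace:
kubectl top pods

# The same data, fetched straight from the Resource Metrics API:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
```

These commands require a live cluster, so the exact output will vary with your workloads.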

Good!
Now let’s make some good use of those metrics.

First, let’s deploy a simple Apache container (based on https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/)

kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80

Make sure the pod is running successfully by running kubectl get pods:

Now let’s put HPA in place. In the example below, HPA will maintain an average of 50% CPU utilization across our pods, scaling between 1 and 10 pods:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
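The one-liner above is the imperative route; the same autoscaler can also be expressed declaratively, which is handier for version control. A sketch of the equivalent manifest (autoscaling/v1 schema; the names match the example above):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  # Points the autoscaler at the Deployment created by kubectl run:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  # Average CPU utilization target, as a percent of the pods' CPU request:
  targetCPUUtilizationPercentage: 50
```

Save it as, say, php-apache-hpa.yaml and apply it with kubectl apply -f php-apache-hpa.yaml.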

Next, let’s generate some load on the Apache deployment in order to see HPA in action. We’ll also see Ocean scaling up our infrastructure as our current nodes approach their limits:

We’ll use load-generator to…well, generate some load.

Open an additional terminal window and run:

kubectl run -i --tty load-generator --image=busybox /bin/sh

Once the shell prompt appears, run:

while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

After a couple of minutes, we’ll notice our HPA kicking in:


Here, CPU consumption has increased to 488%; HPA will now scale out to meet the 50% target.
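Roughly speaking, the HPA computes desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization) and clamps the result to the min/max bounds. A back-of-the-envelope sketch with the numbers above (488% observed on 1 replica, 50% target, max 10):

```shell
current_replicas=1
current_cpu=488   # observed average utilization, in percent
target_cpu=50     # HPA target, in percent
max_replicas=10

# Integer ceiling of (current_replicas * current_cpu / target_cpu):
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))

# Clamp to the configured maximum:
if [ "$desired" -gt "$max_replicas" ]; then desired=$max_replicas; fi

echo "$desired"   # prints 10
```

In practice the controller averages utilization over its sync period and limits how fast it scales up, which is why the cluster steps through intermediate pod counts rather than jumping straight to the computed target.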

We can see that HPA decided to launch 4 additional pods, which were successfully scheduled to the cluster.

As the load on the containers continued, 4 more pods were launched, but because no existing node had enough free resources to satisfy the pods’ requirements, they went into a Pending state.

No need to worry: Ocean will come to the rescue!

Maximum Availability with Ocean

Immediately after recognizing the need for additional resources, Spotinst evaluated the situation and launched the most appropriate instance for the task:

Here we can see how the decision to launch an m5.large instance was made:


In the “pre-scale” and “post-scale” sections, we can see the amount of resources in the cluster before and after the scaling operation. Below, under “pending deployment pods”, is the amount of resources needed by the pending pods, followed by the instance type that was chosen to launch and the amount of resources it will add to the cluster.

The actual m5.large instance that was launched, as shown on Spotinst Ocean’s Nodes page:

And voilà, all our pods are successfully scheduled!

In this blog, we installed the Kubernetes Metrics Server, collected metrics through the Metrics API, and put it all to good use with HPA.

Moreover, we demonstrated the potential of combining intelligent pod scaling driven by HPA with smart infrastructure decision-making powered by Spotinst Ocean.

How do you monitor your pods? Let us know on Twitter @Spotinst!