Kubernetes Monitoring with Prometheus


Kubernetes makes management of complex environments easy, but to ensure availability it’s crucial to have operational insight into the Kubernetes components and all applications running on the cluster. I believe monitoring is the backbone of a good production environment.

Applications running in containers and orchestrated by Kubernetes are highly automated and dynamic, so traditional server-based monitoring tools designed for static services are not sufficient for these environments.

This is where Prometheus comes in. Prometheus is an open-source systems monitoring and alerting toolkit. It is written in Go and hosted by the Cloud Native Computing Foundation, where it is now a graduated project. Prometheus has rapidly gained popularity for infrastructure and application monitoring, and it is well suited to monitoring microservices that run in containers.

To monitor services using Prometheus, each service needs to expose a Prometheus metrics endpoint, either directly or via a companion process called an exporter. This endpoint is an HTTP interface that exposes a list of metrics and their current values. Prometheus scrapes this data from running services at regular intervals and saves it to a time-series database, where it can be queried with the PromQL language. Because the data is stored as time series, you can look back at the exact interval in which a problem occurred to diagnose it, and you can also analyze long-term trends in your infrastructure. These are two of the most powerful features of Prometheus.
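As a rough illustration of what such an endpoint returns, here is a minimal sketch that uses only the Python standard library. The metric names, the `METRICS` dictionary and the port are made up for this example; a real service would normally use an official Prometheus client library instead of formatting the text by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metrics: name -> (help text, metric type, current value).
METRICS = {
    "app_requests_total": ("Total requests handled.", "counter", 42),
    "app_queue_depth": ("Jobs currently waiting in the queue.", "gauge", 3),
}


def render_metrics(metrics):
    """Render metrics in the Prometheus text exposition format."""
    lines = []
    for name, (help_text, metric_type, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current metric values on /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and opening `http://localhost:8000/metrics` in a browser should show one `# HELP` line, one `# TYPE` line and one sample line per metric, which is the shape of output Prometheus expects to scrape.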

Prometheus uses service discovery, which is nicely integrated with Kubernetes, to find all your services. Once it has found them, it gathers their metrics by polling each service's Prometheus metrics endpoint. The strengths of this pull approach are that there is no need to install an agent and that the same metrics can be pulled by multiple Prometheus instances.
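To make the pull model concrete, the sketch below polls a list of targets once and parses what they return. It is a deliberately simplified stand-in for what Prometheus does internally: the function names are hypothetical, the target list is passed in by hand rather than found through service discovery, and the parser only understands unlabelled samples.

```python
from urllib.request import urlopen


def parse_samples(text):
    """Parse unlabelled samples from exposition-format text into a dict.

    Simplification: comment lines are skipped, labelled samples
    (name{...} value) are ignored, and optional timestamps are dropped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if "{" in line or len(parts) < 2:
            continue
        samples[parts[0]] = float(parts[1])
    return samples


def scrape(target_url, timeout=5):
    """Fetch one metrics endpoint and return its parsed samples."""
    with urlopen(target_url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))


def scrape_all(targets):
    """Poll each target once; a target that fails is recorded as None."""
    results = {}
    for url in targets:
        try:
            results[url] = scrape(url)
        except (OSError, ValueError):
            results[url] = None
    return results
```

For example, `scrape_all(["http://localhost:8000/metrics"])` would return a dictionary keyed by target URL. Because the targets only answer HTTP requests, a second collector can poll the same URLs independently, which is the point about multiple Prometheus instances made above.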

Prometheus has a powerful query language (PromQL) that you can use to inspect the database, create alerts, and plot basic graphs. Those graphs can then be used to detect anomalies or trends for (possibly automated) resource provisioning. You can write PromQL queries directly in the Prometheus UI to extract metric information.
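PromQL queries can also be sent to the Prometheus HTTP API rather than typed into the UI. The following sketch runs an instant query against the `/api/v1/query` endpoint; the base URL `http://localhost:9090` and the metric `http_requests_total` used in the usage note are assumptions for illustration, not something your cluster necessarily exposes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_vector(response):
    """Turn an instant-vector API response into (labels, value) pairs."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def instant_query(base_url, promql, timeout=10):
    """Run a PromQL instant query and return (labels, value) pairs."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return extract_vector(json.load(resp))
```

With a reachable server, `instant_query("http://localhost:9090", "sum(rate(http_requests_total[5m])) by (job)")` would return the per-job request rate over the last five minutes, assuming that metric exists.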

The platform also includes client libraries and a range of exporters for specific functions and components; see https://prometheus.io/docs/instrumenting/exporters/ for a list.

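Prometheus also includes Alertmanager, which handles alerts generated from thresholds or triggers on the time-series data collected by Prometheus. The alert conditions themselves are written as alerting rules in Prometheus, again using the PromQL language, while Alertmanager takes care of grouping, silencing, and routing the resulting notifications.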

Most Prometheus deployments use Grafana to render the results in custom-built dashboards. This gives much richer visualizations than the built-in graphs and allows multiple machines to be graphed separately. By adding Grafana to Prometheus as a visualization layer, you can easily set up a complete monitoring stack for your cluster.

To deploy Prometheus on a Kubernetes cluster, follow the steps described at https://github.com/pawankkamboj/k8s-custom-metrics.git

The repository also includes a prometheus-adaptor, which you can skip.