Introduction to FinOps for Kubernetes: Challenges and Best Practices

Published in Replex, Jun 17, 2021

Before the cloud there was the data center. Private data centers represented a fixed-cost, capital-expense model, with enterprises adding or trimming hardware in quarterly or even yearly instalments. Enterprises also had fairly straightforward procurement, cost management, forecasting and cost allocation processes in place.

All was good with the world.

Then came the cloud with its on-demand and highly variable consumption model. The cloud necessitated an update in the financial reporting mechanisms, procurement and cost management processes used by enterprises for their legacy on-premises infrastructure.

Over time these processes coalesced into a comprehensive FinOps framework bundling together best practices, cultural paradigms and tooling.

The introduction of Kubernetes threw another spanner in the works. With its ephemeral, hyper-scalable and shared resources model, Kubernetes now requires a similar update in FinOps processes and practices. This update is essential to keep FinOps principles and practices aligned with modern cloud native Kubernetes environments.

In this article we will review some of the core challenges associated with implementing FinOps processes for cloud native Kubernetes environments and how to overcome them.

So let’s get started.

Kubernetes and Container Cost Allocation

Allocating costs to teams, projects and business units is essential to a good FinOps practice. Cost allocation done right unlocks more complex showback, chargeback and benchmarking mechanisms for enterprises. It also serves as the basis for cost management as well as budget reviews and allocation.

On the Kubernetes layer, however, costs first have to be allocated to native Kubernetes artefacts like containers before they can be allocated to teams or projects. This is problematic for a number of reasons.

Accurately Measuring Resource Consumption

Containers represent another layer running on top of cloud instances called nodes. At any given time multiple containers run on any given node. Accurately allocating the costs of the underlying cloud instance to the containers running on top requires figuring out the exact resource consumption of each individual container.

The amount of resources consumed also depends on how long the container runs. Since containers and pods often don’t run for long periods of time, tracking how long each container actually runs is essential for accurate cost allocation.
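
To make this concrete, here is a minimal Python sketch of one way to split a node's cost across the containers that ran on it, weighting each container by its CPU core-hours and memory GiB-hours. The `ContainerUsage` type, the 50/50 CPU/memory weighting and all the numbers are assumptions made for illustration, and the sketch spreads the whole node cost (idle capacity included) across the containers, which is only one of several possible policies.

```python
from dataclasses import dataclass


@dataclass
class ContainerUsage:
    name: str
    cpu_cores: float      # average CPU cores used while running
    memory_gib: float     # average memory (GiB) used while running
    runtime_hours: float  # hours the container ran in the billing window


def allocate_node_cost(node_cost, usages, cpu_weight=0.5):
    """Split node_cost across containers by core-hours and GiB-hours.

    cpu_weight is an assumed split between CPU and memory; the remainder
    (1 - cpu_weight) is applied to memory.
    """
    total_cpu = sum(u.cpu_cores * u.runtime_hours for u in usages)
    total_mem = sum(u.memory_gib * u.runtime_hours for u in usages)
    costs = {}
    for u in usages:
        cpu_share = u.cpu_cores * u.runtime_hours / total_cpu if total_cpu else 0.0
        mem_share = u.memory_gib * u.runtime_hours / total_mem if total_mem else 0.0
        costs[u.name] = node_cost * (cpu_weight * cpu_share + (1 - cpu_weight) * mem_share)
    return costs


# Hypothetical node costing 2.40 for a 24-hour window.
usages = [
    ContainerUsage("api", cpu_cores=1.0, memory_gib=2.0, runtime_hours=24.0),
    ContainerUsage("batch", cpu_cores=2.0, memory_gib=2.0, runtime_hours=6.0),
]
costs = allocate_node_cost(2.40, usages)
print({name: round(cost, 2) for name, cost in costs.items()})
```

In this example the short-lived `batch` container uses more CPU than `api` while it runs, but because it only runs for six hours it ends up with roughly 0.64 of the 2.40 node cost, against roughly 1.76 for `api`.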

Kubernetes Environments are Ephemeral and Dynamic

The ephemeral nature of containers and pods adds another layer of complexity to Kubernetes cost allocation. Production Kubernetes environments are highly dynamic, with containers and pods popping in and out of existence on a daily basis. These containers can be rescheduled on any node across the cluster, which in turn can span cloud provider availability zones or regions. Any of these rescheduling events, whether onto a different instance type, availability zone or region, can result in substantially different costs for the same container.
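
The effect of rescheduling can be sketched in a few lines of Python. The idea is to record every placement of a container as its own segment and price each segment at the rate of the node it ran on. The instance type names, zones, prices and node fractions below are all hypothetical and are not real provider rates.

```python
from dataclasses import dataclass

# Hypothetical hourly prices keyed by (instance_type, zone).
HOURLY_PRICE = {
    ("type-small", "zone-a"): 0.10,
    ("type-large", "zone-b"): 0.40,
}


@dataclass
class Placement:
    instance_type: str
    zone: str
    hours: float          # time the container spent on this node
    node_fraction: float  # share of the node attributed to the container


def container_cost(placements):
    """Sum the cost of each placement segment at that node's hourly price."""
    return sum(
        HOURLY_PRICE[(p.instance_type, p.zone)] * p.hours * p.node_fraction
        for p in placements
    )


# The same container: 10 hours on a small node, then rescheduled for
# 10 hours onto a larger, pricier node where it takes a smaller share.
history = [
    Placement("type-small", "zone-a", hours=10.0, node_fraction=0.25),
    Placement("type-large", "zone-b", hours=10.0, node_fraction=0.125),
]
total = container_cost(history)
print(round(total, 2))
```

Under these assumed prices the second ten hours cost twice as much as the first ten, even though the container itself did not change.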

Allocating Costs for Cloud Provider Managed Services

Most modern cloud native applications deployed on Kubernetes leverage at least some cloud provider managed services like RDS, S3 or Lambda. Accurately allocating costs to these applications would entail allocating out-of-cluster costs like those for S3 or RDS.
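
One possible way to approach this, assuming the managed services are tagged with the application that owns them, is to group billing line items by that tag. The sketch below uses plain Python dictionaries as a stand-in for a billing export; the `app` tag key, the `unallocated` bucket and the costs are assumptions made for illustration.

```python
from collections import defaultdict


def out_of_cluster_cost_by_app(line_items, tag_key="app"):
    """Group managed-service costs by the value of an owning-application tag."""
    totals = defaultdict(float)
    for item in line_items:
        # Untagged items are collected in a separate bucket for follow-up.
        owner = item.get("tags", {}).get(tag_key, "unallocated")
        totals[owner] += item["cost"]
    return dict(totals)


# Hypothetical billing line items for services running outside the cluster.
line_items = [
    {"service": "RDS", "cost": 120.0, "tags": {"app": "checkout"}},
    {"service": "S3", "cost": 30.0, "tags": {"app": "checkout"}},
    {"service": "Lambda", "cost": 12.5, "tags": {"app": "search"}},
    {"service": "S3", "cost": 8.0, "tags": {}},
]
by_app = out_of_cluster_cost_by_app(line_items)
print(by_app)
```

The per-application totals can then be added to the in-cluster costs of the same application, while the `unallocated` bucket shows how much spend still lacks an owner.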

Allocating Satellite Costs

Production Kubernetes environments also throw up multiple other costs. Depending on the use-case, organizations might encounter costs for operations or management overhead, networking or storage costs, and backups or software licenses.
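
These shared costs usually cannot be traced to a single team, so they need a distribution rule. As a rough sketch, the Python function below spreads a shared cost across teams in proportion to each team's direct cost. Proportional distribution is an assumption made here for illustration, and the team names and amounts are hypothetical; an even split or a fixed percentage would be equally valid policies.

```python
def spread_shared_costs(direct_costs, shared_cost):
    """Add a proportional slice of shared_cost to each team's direct cost."""
    if not direct_costs:
        return {}
    total = sum(direct_costs.values())
    if total == 0:
        # No direct spend to weight by, so fall back to an even split.
        even = shared_cost / len(direct_costs)
        return {team: even for team in direct_costs}
    return {
        team: cost + shared_cost * cost / total
        for team, cost in direct_costs.items()
    }


# Hypothetical direct costs per team and 200 of shared overhead
# (for example licenses, backups and cluster management).
direct = {"payments": 600.0, "search": 300.0, "data": 100.0}
fully_loaded = spread_shared_costs(direct, 200.0)
print(fully_loaded)
```

With these numbers the team responsible for 60% of direct spend also carries 60% of the shared overhead.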

Aligning Tagging and Kubernetes Labelling

Tagging resources on public cloud providers is a pretty straightforward exercise. FinOps teams in consultation with engineering and other stakeholders identify relevant cost centers and develop a comprehensive tagging strategy. This tagging strategy is then applied by DevOps and engineering to all newly provisioned cloud resources. Finally integrating this tagging setup with a reporting tool allows for easy mapping of costs to cost centers.

With the shared resources model of Kubernetes, where multiple teams use the same clusters to run their applications, more complex cost allocation use-cases are often difficult to implement using cloud provider tags alone.

Kubernetes labels allow DevOps to tackle some of these use-cases by assigning key-value pairs to native Kubernetes artefacts. FinOps teams working with cloud native Kubernetes environments need to ensure that the existing tagging regime aligns with the labels used by engineering and DevOps for native Kubernetes artefacts.
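
A simple alignment check might look like the Python sketch below, which compares the label keys on each workload against the keys required by the tagging strategy. The required keys `team` and `cost-center` and the workload names are hypothetical, and the labels are given as plain dictionaries; in a real cluster they would be read from the Kubernetes API.

```python
# Hypothetical keys taken from the organization's cloud tagging strategy.
REQUIRED_KEYS = {"team", "cost-center"}


def missing_cost_labels(workloads, required=REQUIRED_KEYS):
    """Return, per workload, the required keys that its labels lack."""
    report = {}
    for name, labels in workloads.items():
        missing = sorted(required - set(labels))
        if missing:
            report[name] = missing
    return report


# Hypothetical workloads and their Kubernetes labels.
workloads = {
    "checkout-api": {"team": "payments", "cost-center": "cc-101"},
    "search-indexer": {"team": "search"},
    "legacy-cron": {},
}
report = missing_cost_labels(workloads)
print(report)
```

Workloads that appear in the report cannot be mapped cleanly to a cost center, so they are the ones to fix before cloud tags and Kubernetes labels are reported on together.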

Interested in learning more about the FinOps framework?
Download our detailed guide to Cloud FinOps for FinOps teams, executives, DevOps, engineering, finance and procurement.

Originally published at https://www.replex.io.