
Lessons Learned from Selling Kubernetes


A picture of a man walking down a path on a moonlit night. It is foggy and there are many puzzle pieces floating in front of him, representing the challenges of business problems.
Image created by DALL-E on ChatGPT 4

Cloud-native architecture, containerization, microservices, and Kubernetes have become very popular over the past few years. They are as complex as they are powerful, and for a large, complex organization they can be a game changer. Kubernetes itself is only a partial solution: the foundation for something extraordinary. It can take 20-25 additional products to handle all aspects of the computing environment (e.g., ingress, service mesh, storage, networking, security, observability, continuous delivery, and policy management).

Consider the case of a major financial services company, one of my clients. They had 200 development teams of 5-10 members each, all frequently deploying new applications and application changes. Before adopting Kubernetes, they deployed massive monolithic applications, with changes going out only 2-3 times per year. With Kubernetes, they moved to a daily deployment model while maintaining control and being able to roll back changes swiftly if necessary. This shift let them innovate faster and respond to opportunities and needs more promptly.

Most platforms used Ansible and Terraform for playbooks, configuration management, and other purposes. Those configurations could become very long and complex over time and were prone to errors. In multi-cloud and hybrid environments, the complexity was amplified further. “Configuration drift,” where the runtime configuration differs from what was expected, led to problems such as increased costs from misconfigured resources, potential security issues from incorrectly applied or missing policies, and issues with identity management.
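To make configuration drift concrete, here is a minimal sketch in Python that compares an expected configuration with what is actually running. The function name `detect_drift`, the setting names, and the values are all hypothetical and chosen only for illustration; real tooling would read the expected state from a repository and the actual state from the live environment.

```python
from typing import Any

MISSING = "<missing>"  # placeholder for a setting present on only one side


def detect_drift(
    expected: dict[str, Any], actual: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (expected_value, actual_value)} for every mismatch."""
    drift = {}
    # Check every setting that appears on either side, in a stable order.
    for key in sorted(expected.keys() | actual.keys()):
        want = expected.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical example: an extra port was opened, a policy was never
# applied, and the node count was changed by hand.
expected = {"node_count": 3, "instance_type": "m5.large", "network_policy": "deny-all"}
actual = {"node_count": 5, "instance_type": "m5.large", "debug_port": 8080}

drift = detect_drift(expected, actual)
for key, (want, have) in drift.items():
    print(f"{key}: expected {want!r}, found {have!r}")
```

Each mismatch maps to one of the problems described above: the larger node count is a cost issue, and the missing network policy and the open debug port are security issues.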

The surprising thing was that when prospects identified those problems, they would look for a solution in new platforms built on the same tools. Sometimes things would temporarily improve (after much time and expense for a migration), but then fall back into disarray because the underlying process issues had never been addressed.

Our platform used a new technology called Cluster API (CAPI). It provided a central, declarative configuration repository, making it quick and easy to create new clusters. More importantly, it performed regular configuration checks and automatically reconciled incorrect or missing policies. The result was an immutable, self-healing Kubernetes infrastructure that simplified overall cluster management and standardized infrastructure implementation.
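The declarative, self-healing idea can be illustrated with a toy reconciliation pass in Python. This is not Cluster API code and does not reflect its actual API. The `Cluster` class, the `reconcile` function, and the version and policy names are hypothetical. For brevity the sketch updates the observed state in place, whereas an immutable approach would replace the affected machines instead.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    kubernetes_version: str
    policies: set[str] = field(default_factory=set)


def reconcile(desired: Cluster, observed: Cluster) -> list[str]:
    """Bring `observed` in line with `desired` and return the actions taken."""
    actions = []
    if observed.kubernetes_version != desired.kubernetes_version:
        actions.append(
            f"set version {observed.kubernetes_version} -> {desired.kubernetes_version}"
        )
        observed.kubernetes_version = desired.kubernetes_version
    # Policies declared but not running are applied.
    for policy in sorted(desired.policies - observed.policies):
        actions.append(f"apply missing policy {policy}")
    # Policies running but never declared are removed.
    for policy in sorted(observed.policies - desired.policies):
        actions.append(f"remove unexpected policy {policy}")
    observed.policies = set(desired.policies)
    return actions


# Hypothetical declared state versus what is actually running.
desired = Cluster("prod-1", "1.29.4", {"deny-all-ingress", "require-tls"})
observed = Cluster("prod-1", "1.28.9", {"require-tls", "allow-debug"})

first_pass = reconcile(desired, observed)   # corrects the drift
second_pass = reconcile(desired, observed)  # nothing left to do
print(first_pass)
print(second_pass)
```

Running the same pass on a schedule is what makes the approach self-healing: a second pass with no drift takes no action, so it is safe to repeat.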

All great stuff; who would not want that? The technology was new but proven. It was also different, which scared some people. These were some of the recurring themes:

  1. The Platform and DevOps teams had a backlog of work due to existing problems, so there was more fear of falling further behind than confidence in a better alternative.
  2. Teams focused on their existing investment in a platform, or on the sunk costs of a long effort to solve their problems. The payback period on a new platform was often only 3-4 months, but that was hard for them to believe given their own experience on an inferior platform.
  3. Teams would look at outsourcing the problem to a managed service provider. They could not explain specifically how the problems would be resolved, but did not seem concerned about that lack of clarity.
  4. There was a lack of consistency in the versions of Kubernetes used, in the add-ons and their versions, and in one-off changes that were never intended to become permanent. Reconciling those issues and migrating to new, clean clusters both involved time and effort. That became an excuse to maintain the status quo.
  5. Unplanned outages were common and usually expensive. Using the cost of those outages as justification for something new was typically a last resort, as people did not like acknowledging problems that put a spotlight on themselves.
  6. Architects were curious about new and different things, but often lacked the gravitas with business leadership to effect change. They were usually unwilling or unable to explain how real changes happened within their company, or to introduce you to the actual decision-makers and budget holders.

Focusing on outcomes and working with the executives most affected by them tended to be the best path forward. Those companies and teams were rewarded with a platform that simplified fleet management, improved observability, and helped them avoid the risky, expensive problems that had plagued them in the past. And working with satisfied customers who appreciated your efforts and became loyal partners made selling this platform that much more rewarding.