I'm one of the kops authors, and I will say that a lot of people run k8s clusters created by kops in production - I don't want to name others, but feel free to join sig-aws or kops channels in the kubernetes slack and ask there and I'm sure you'll get lots of reports. In general kops makes it very easy to get a production-suitable cluster; there shouldn't be any manual work required other than occasionally updating kops & kubernetes (which kops makes an easy process).
But: we don't currently support rotating certificates. There used to be a bug in kubernetes which made "live" certificate rotation impossible, but that bug has now been fixed so it's probably time to revisit it. We create 10-year CA certificates, so rotation isn't something you have to do, other than as good security practice.
There's no need to choose: kops uses kubeadm (not a lot of it, but more with each release), so choose kops and get kubeadm for free!
kubeadm is intended to be a building block that any installation tool can leverage, rather than each building the same low-level functionality. It isn't primarily meant for end-users, unless you want to build your own installation tool.
We want to accommodate everyone in kops, but there is a trade-off between making things easy vs. being entirely flexible, so there will always be people who can't use kops. You should absolutely use kubeadm if you're building your own installation tool - whether you're sharing it with the world or just within your company. luxas (the primary kubeadm author) does an amazing job.
Thanks, I wasn't aware that it was leveraging kubeadm. This is good to know. I have been really impressed by my limited exposure to Kops so far. Cheers!
How do you handle that kubernetes requires the eth0 ip in no_proxy? Do you set that automatically?
How do you handle that DNS in a corp net can get weird and for instance in Ubuntu 16.04 the NetworkManager setting for dnsmasq needs to be deactivated?
How do you report dying nodes due to kernel version and docker version not being similar?
Do you report why pods are pending?
Does kops wait until a successful health check before it reports a successful deployment (in contrast to helm which reports success when the docker image isn't even finished pulling)?
Do you run any metrics on the cluster to see if everything is working fine?
Edit: Sorry to disturb the kops marketing effort, but some people still hope for a real, enterprise ready solution for k8s instead of just another fluff added on a shaky foundation.
kops is an open source project that is part of the kubernetes project; we're all working to solve these things as best we can. Some of these issues are not best solved in kops; for example we don't try to force a particular monitoring system on you. That said, I'm also a kubernetes contributor so I'll try to quickly answer:
* no_proxy - kops is getting support for servers that use http_proxy, but I think your issue is a client issue with kubectl proxy and it looks like it is being investigated in #45956. I retagged (what I think are) the right folks.
* DNS, docker version/kernel version: if you let it, kops will configure the AMI / kernel, docker, DNS, sysctls, everything. So in that scenario everything should just work, because kops controls everything. Obviously things can still go wrong, but I'm much more able to support or diagnose problems with a kops configuration where most things are set correctly than in a general scenario.
* why pods are pending: `kubectl describe pod` shows you why. Your "preferred alerting system" could be more proactive though.
* metrics are probably best handled by a monitoring system, and you should install your preferred system after kops installs the cluster. We try to only install things in kops that are required to get to the kubectl "boot prompt". Lots of options here: prometheus, sysdig, datadog, weave scope, newrelic etc.
* does kops wait for readiness: actually not by default - and this does cause problems. For example, if you hit your AWS instance quota, your kops cluster will silently never come up. Similarly if your chosen instance type isn't available in your AZ. We have a fix for the latter and are working on the former. We have `kops validate` which will wait, but it's still too hard when something goes wrong - definitely room for improvement here.
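To make the readiness point concrete, here is a rough sketch of wrapping `kops validate cluster` in your own polling loop with a hard timeout, so a cluster that never comes up fails loudly instead of silently. This is not something kops ships: `wait_until` and `cluster_is_valid` are hypothetical helper names, the timeout and interval values are arbitrary, and it assumes `kops validate cluster` exits non-zero while the cluster is unhealthy and that `KOPS_STATE_STORE` is already set in the environment.

```python
import subprocess
import time


def wait_until(check, timeout_s=600.0, interval_s=15.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_s elapses.

    Returns True if check() succeeded, False if the deadline passed first.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def cluster_is_valid(cluster_name):
    """Run `kops validate cluster` once and report whether it succeeded.

    Assumption: a non-zero exit code means the cluster is not (yet) healthy.
    """
    result = subprocess.run(
        ["kops", "validate", "cluster", "--name", cluster_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

You would call it as something like `wait_until(lambda: cluster_is_valid("my.cluster.example.com"))` (the cluster name is a placeholder) and treat a `False` result as "go look at quotas and instance availability", since the loop itself can't tell you why validation keeps failing.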
In general though - where there are things you think we could do better, do open an issue on kops (or kubernetes if it's more of a kubernetes issue)!
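On the "why are pods pending" question, if you want something more proactive than running `kubectl describe pod` by hand, a small script can pull the same information out of the pod status. The sketch below is only an illustration: `pending_reasons` is a hypothetical helper, the `SAMPLE` data is made up, and it only looks at the `PodScheduled` condition and waiting container states, which covers the common cases but not everything `describe` shows (events in particular).

```python
def pending_reasons(pod_list):
    """Return (pod name, reason, message) tuples for every Pending pod.

    pod_list is the parsed JSON document from `kubectl get pods -o json`.
    """
    findings = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        name = pod.get("metadata", {}).get("name", "<unknown>")
        # Not scheduled yet: the scheduler records why on the PodScheduled condition.
        for cond in status.get("conditions", []):
            if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
                findings.append((name, cond.get("reason", ""), cond.get("message", "")))
        # Scheduled, but a container is still waiting (for example, pulling its image).
        for cs in status.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            if waiting:
                findings.append((name, waiting.get("reason", ""), waiting.get("message", "")))
    return findings


# Made-up sample data in the shape of `kubectl get pods -o json` output.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "web-1"},
            "status": {
                "phase": "Pending",
                "conditions": [
                    {
                        "type": "PodScheduled",
                        "status": "False",
                        "reason": "Unschedulable",
                        "message": "0/3 nodes are available: insufficient cpu.",
                    }
                ],
            },
        },
        {"metadata": {"name": "web-2"}, "status": {"phase": "Running"}},
    ]
}

for name, reason, message in pending_reasons(SAMPLE):
    print(f"{name}: {reason}: {message}")
```

Against a real cluster you would feed it `json.loads(...)` of the output of `kubectl get pods --all-namespaces -o json` and forward the findings to whatever alerting system you prefer.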
Nice, thanks. My feeling is that this is about 75% of what we want and thereby may really be the best solution there is, right now. I'll bring your responses into my next team meeting.
Thanks for feedback. I agree that a huge wall of text is not desired. I think a single sentence answer is fine.
For instance: "Yes, we can. We considered most of that and also have some enterprise customers with similar setups. Check out "googleterm A", "googleterm B", "googleterm C". If you don't find all of that join our slack chat to get more details."
And a more likely answer, also single line: "WTF are these questions? We thought docker+k8s already solves that." (I would've also expected solutions from there but don't hope for it anymore.)
PS (actually an edit to the previous post, but it's already too old): For instance Openshift, as I just found, considers the docker-version kernel-version problem via "xxx-excluder" meta packages: https://docs.openshift.com/container-platform/3.4/install_co...
But: we don't currently support rotating certificates. There used to be a bug in kubernetes which made "live" certificate rotation impossible, but that bug has now been fixed so it's probably time to revisit it. We create 10-year CA certificates, so rotation isn't something you have to do, other than as good security practice.
If you file an issue (https://github.com/kubernetes/kops/issues) for certificate rotation and any other gaps / questions we'll get to them!