OpenBSD vmm enabled

notaplumber · on Oct 13, 2016

And of course, being OpenBSD, it's privsep and sandboxed from the very beginning. Cool.

https://marc.info/?l=openbsd-cvs&m=144702840931345&w=2, https://marc.info/?l=openbsd-cvs&m=147578652321303&w=2

Man pages:

http://man.openbsd.org/vmm.4

http://man.openbsd.org/vmd.8, http://man.openbsd.org/vm.conf.5

http://man.openbsd.org/vmctl.8

rwmj · on Oct 13, 2016

QEMU/KVM on Fedora and RHEL uses seccomp, SELinux, namespaces and a separate user, and has done for about the past 5 years.

It's more interesting if they've managed to fully protect the kernel part of this (i.e. vmm on OpenBSD, KVM on Linux) from the rest of the kernel.

JoachimSchipper · on Oct 13, 2016

I'm not clear - do you want to protect the vmm driver in the host from attacks started by the rest of the host kernel? Or are you referring to attacks started by the guest on the http://man.openbsd.org/virtio.4 interface?

OpenBSD's VMs seem to be pledge()d to not call most host-side kernel interfaces...

rwmj · on Oct 13, 2016

I'm referring to a mistake in the kernel module corrupting some other in-kernel structures, especially if that can be triggered from guest code.

4ad · on Oct 13, 2016

There is no privilege separation between kernel modules. They all run in the same address space with the same CPU privilege.

SELinux, seccomp, and all that contain a compromised user space thread in the host (one responsible for running emulated guests, or one that provides virtual device support to guests), protecting other user space threads (other VMs) from it. If there is some problem in the KVM kernel module that stomps over memory, SELinux and seccomp can't do anything about it.

rwmj · on Oct 13, 2016

Right, exactly the point I made up-thread (https://news.ycombinator.com/item?id=12700045).

brynet · on Oct 13, 2016

OpenBSD has no kernel modules, everything compiled into the GENERIC kernel is expected to be using proper kernel interfaces, mucking around with kernel structures is a no-no.

Nothing can really prevent a driver bug from causing a crash; this isn't a microkernel. But exposure from userland can be reduced, e.g. via pledge restrictions on syscalls/ioctls/sysctls.

snvzz · on Oct 13, 2016

However, it is still a boring type-2 hypervisor, just like bhyve, vmware or kvm; no real progress whatsoever.

I'm keeping an eye on seL4's VMM support. It looks promising, with the microkernel actually guaranteeing isolation, whereas the VMM runs with minimum privileges as a user process.

bcook · on Oct 13, 2016

No progress? Perhaps that is true from a general virtualization perspective, but this is progress from an OpenBSD (virtualization) perspective, right?

andreiw · on Oct 13, 2016

VMware ESXi is a Type 1 hypervisor.

allanjude · on Oct 14, 2016

Is it really, though? It is running on top of a stripped-down version of Red Hat.

yellowapple · on Oct 15, 2016

No it's not. Once upon a time, ESXi (when it was still called ESX) shipped with a stripped-down Linux in a role similar to a Dom0 on Xen; nowadays, ESXi doesn't even use Linux in that capacity.

0xcde4c3db · on Oct 13, 2016

Just to get out in front of this, as someone who is a quasi-fan of OpenBSD: yes, Theo had a scathing rant years ago about virtualization being terrible [1]. But it should be noted that that was well before virtualization on x86 was mature, and in response to an assertion that the use of VMs would increase security.

[1] https://marc.info/?l=openbsd-misc&m=119318909016582

sverige · on Oct 13, 2016

I know people like to point out Theo's "rants" for some reason (that was nine years ago, for goodness' sake), but is it a rant, or is it just unvarnished truth delivered in a style that is more concerned with truth than people's precious feelings?

I've read a lot of Theo's emails in the lists, and frankly I find it refreshing that he says what he thinks.

And what I really find refreshing is that he gives a shit about whether code is correct, secure, and readable. There's a lot of lip service to those goals, but OpenBSD actually works really hard to deliver it with not a lot of money and precious little thanks.

Now, what was inaccurate about his assessment of VMs from nine years ago? Or is it just that he seems so darn mean?

riffraff · on Oct 13, 2016

> I find it refreshing that he says what he thinks.

you can rephrase that email in a nicer way, while still expressing what you think. For example, you can drop the first paragraph, and no information is lost. You can also remove "if not stupid" and the informational content stays the same.

I am an OpenBSD fan in many ways, but you can generally express what you think while still being kind.

geocar · on Oct 13, 2016

I don't always agree. Rhetoric has a lot of forms that are useful tools in argument, and while we will not admonish someone for asking "have you stopped beating your wife yet" (or in this case, "Virtualization seems to have a lot of security benefits.") we are too quick to ignore the very valid point in response because it wasn't friendly enough. Derailing the conversation that way prevents us from dealing with what I think is the bigger issue:

When someone says something stupid like "have you stopped beating your wife" (or "Virtualization seems to have a lot of security benefits"), we may need to tell them it is stupid, because stupidity has this way of spreading when it sounds nice and helpful (yes, beating your wife is bad; yes security benefits are good), but it's still a stupid statement.

Virtualization is extremely popular, but it isn't secure, and it's actually (and when you're thinking clearly, obviously) less secure than other, existing systems. Security has to be a complete holistic effort, and not an abstraction layer, which is something most people in our industry ignore. Calling out someone as stupid for saying stupid things seems to me to be the best defence, after all, you're not going to convince them that they're stupid, but you might convince someone else.

david-given · on Oct 13, 2016

This is the standard apologia for rudeness. The problem is that people don't work like that. If you say something in an aggressive manner, they will tend to assume a confrontational posture and won't work with you. It's just how people are wired.

Consider two possible responses I could have to your comment:

(a) You're full of shit.

(b) In my experience, that's not actually true.

Chances are you're going to respond better to (b) than to (a). So, if I actually want to engage you in conversation, or work with you in the future, I should say (b). Routine courtesy is part of the standard toolkit of effective communication skills. It may be cathartic to be rude to someone, but it doesn't lead to long-term progress.

> Calling out someone as stupid for saying stupid things seems to me to be the best defence...

...or you could simply explain why you think they're mistaken, without using personal insults, and so win them over to your side of the debate? As it is, you're not just driving them away, but you're also sending a message to everybody else reading the conversation that you're intolerant and difficult to work with, which is not going to help the project.

I actually remember that particular conversation, as that was the point when I gave up on OpenBSD, unsubscribed from -misc, and switched to Debian Linux (and never went back). It simply wasn't worth my time to wade through the insults and abuse to get things done any more.

Of course, in those days, Theo de Raadt had the reputation for being an angry jerk, and Linus Torvalds had a reputation for being moderate and easy to work with. How times have changed.

geocar · on Oct 13, 2016

> If you say something in an aggressive manner, they will tend to assume a confrontational posture and won't work with you.

In my experience, that's not actually true.

Hitler, famously, was able to say many things in a very aggressive manner, people did assume a confrontational posture, and did work with him.

When I see Theo write, I see someone who is passionate and who will not compromise. Yes, he does things I don't agree with, but notwithstanding the comparison to Hitler, he also does things that I do agree with.

The reason we respond the way we do isn't always because of the person we're responding to.

People respond to passion in different ways, some people shut down, some people walk away, and some people feel it's so important to find some other way to prove them wrong that they'll try to turn the conversation into one about rudeness instead.

4ad · on Oct 13, 2016

> You're full of shit.

If you said that, I would have respected you, even though we disagreed.

> In my experience, that's not actually true.

Now I know you're just PC police, and that I have nothing to gain by further engaging with you. I have no respect for you.

So no, you are wrong. People don't work like that. Or at least people worth talking to are not like that. People see through all the PC bullshit, and respond accordingly.

> It simply wasn't worth my time to wade through the insults and abuse to get things done any more.

Yeah, this kind of PC attitude is not worth my time. People who can't, just discuss PC politics, while people who can, just do, and don't care about any of this stuff.

jerf · on Oct 13, 2016

"Now I know you're just PC police, and that I have nothing to gain by further engaging with you."

That's not PC police. This is How To Win Friends And Influence People stuff, not PC stuff.

You are free to be rude, but you'll pay the consequences, quite needlessly.

I'm in a position where I with some frequency have to contradict people (being a code reviewer for a significant internal shared library has that result), but I try to make it clear in my words and tone that it's in the spirit of working with them and obtaining the best solution. I'm pretty sure it's not 100% successful, because some people to some extent can't process being contradicted in any way as anything less than hostility (and I put both "somes" in that sentence on purpose), no matter how polite you are about it, and on my side, I'm absolutely sure I'm not perfect about it, but I am sure I'm better off than I would be if I was always being as blunt as possible without even trying.

4ad · on Oct 13, 2016

To add to that, please note one thing though. Here we are in a public space, if you (some generic you, not you david-given in particular) say or do what I consider to be a stupid thing, I will just ignore you. I will certainly not insult you. Why would I? I have better things to do.

But if you come into my house and start spitting on the floor, don't get too surprised if I punch you in the face and throw you out. Perhaps I might even insult you. You're in my house, I won't tolerate this kind of behavior. You are forcing me to act just because you are in my house.

Theo and Linus don't come in public spaces (like HN) to start insulting people. They might insult people when they come to the OpenBSD/Linux space and start spitting on the floor. To further the analogy, unannounced people come and spit on the floor.

This is a very deep and fundamental difference. And just because everyone can see and participate in the discussion, make no mistake, it's still an OpenBSD or Linux space. Not some generic fantasy land. I might hold a party at my house where everyone can come, but if you come and don't behave, I will take measures and you can whine all you want about my non-PC behavior.

bonzini · on Oct 13, 2016

> Virtualization is extremely popular, but it isn't secure, and it's actually (and when you're thinking clearly, obviously) less secure than other, existing systems.

It makes me wonder why Google uses KVM to isolate multiple tenants running stuff on their machines.

Does virtualization solve all of the world's security problems? Of course not. But it does help as an additional layer of security.

geocar · on Oct 13, 2016

> But [virtualization] does help as an additional layer of security.

This is completely wrong: Instead of one big complex code surface, now you have two.

Two things that can be attacked; two things that can have bugs.

Two codebases written by people who aren't thinking about reducing bugs and aren't thinking about security.

That's insane.

The only thing that can reduce bugs (security or otherwise) is less code. DJB did a great presentation on this subject: https://cr.yp.to/qmail/qmailsec-20071101.pdf

> It makes me wonder why Google uses KVM to isolate multiple tenants running stuff on their machines.

"Google does it so it must be secure" is faulty reasoning.

There are valid non-security-related reasons: for example, it simplifies administration and management tasks, and it is cheaper. When you do something because someone else does it, you might find neither of you knows what you're doing.

bonzini · on Oct 13, 2016

Of course it's worse than just buying a separate machine for each tenant, but that's obvious isn't it? On the other hand virtualization _does_ provide better isolation than containers (or jails or zones).

> The only thing that can reduce bugs (security or otherwise) is less code.

Bugs that let you escape the host kernel from unprivileged guest userspace (which would be the case where "two things can be attacked") are exceedingly rare.

On the other hand, increasing security is not just about reducing bugs, but also about defending in depth through redundant layers, so that if something breaks (your guest kernel) there's something else to protect you (your hypervisor). That's not the case for containers, even though they run less code.

If security were just about less code, OpenBSD-specific stuff like pledge or SOCK_DNS wouldn't have any place.

> There are valid non-security-related reasons, for example it simplifies administration and management tasks, and it is cheaper. When you do something because someone else does things you might find neither of you know what you're doing.

That's not why Google uses KVM. They only use it for Google Compute Engine.

geocar · on Oct 14, 2016

> On the other hand virtualization _does_ provide better isolation than containers (or jails or zones).

Hardware virtualization maybe, because it usually involves less code than jails or zones.

> Bugs that let you escape the host kernel from unprivileged guest userspace (which would be the case where "two things can be attacked") are exceedingly rare.

* http://www.cvedetails.com/product/19922/Redhat-KVM.html?vend...

* https://xenbits.xen.org/xsa/

* https://www.cvedetails.com/vulnerability-list/vendor_id-93/p...

* http://www.cvedetails.com/product/27105/Linuxcontainers-LXC....

* https://www.blackhat.com/us-15/briefings.html#the-memory-sin...

Saying "exceedingly rare" without saying what you think is rare is worse than useless. It happens more than zero.

Give numbers.

Demonstrate exactly why you think lines of code is not proportional to bugs, or point to the specific arguments my named expert gives that you believe are wrong. Do not hand-wave this with a statement like "exceedingly rare".

> If security were just about less code, OpenBSD-specific stuff like pledge or SOCK_DNS wouldn't have any place.

pledge is much smaller than SELinux.

This is evidence that security is just about less code.

> On the other hand, increasing security is not just about reducing bugs, but also about defending in depth through redundant layers, so that if something breaks (your guest kernel) there's something else to protect you (your hypervisor). That's not the case for containers, even though they run less code.

Increasing security is about understanding what you're doing. More code means less understanding.

If your mail server is broken because of bugs in your mail server, or bugs in your hypervisor, then there are two codebases with an attack-surface not one.

Users don't care: If mail is delayed or destroyed, they get angry. They don't care if it was a bug in the hypervisor or a bug in the mail server, or a bug in a web server on a nearby container or a nearby virtual machine. That increased codebase meant an increased risk (demonstrated) which means more angry users.

Thinking about keeping mail server bugs in the mail server is a red herring; it just isn't important to users, so this isn't how you should evaluate "security."

bonzini · on Oct 14, 2016

> Hardware virtualization maybe

Yes, talking about hardware virt only.

> Give numbers.

Sure. I've been working on KVM for 7 years, and I only recall two really serious vulnerabilities:

1) one bug that let you escape into the guest kernel from unprivileged guest userspace (CVE-2010-0306 and CVE-2010-0419). Even that one required relatively special circumstances, so that in practice the only exploitable user program was the X server, but this is not a bug you want to have in your hypervisor.

2) one bug that let you escape the host kernel from the guest kernel (CVE-2014-0049). Also very hard to exploit, but doable.

The "memory sinkhole" problem does not apply to virtualization.

> > If security were just about less code, OpenBSD-specific stuff like pledge or SOCK_DNS wouldn't have any place.

> > pledge is much smaller than SELinux

It's still >0 lines of code. (BTW, comparing pledge and SELinux is apples and oranges).

> If your mail server is broken because of bugs in your mail server, or bugs in your hypervisor, then there are two codebases with an attack-surface not one.

If you place mail+web server on the same machine without a hypervisor, things can certainly be less secure than if you place multiple services on the same machine separated by the hypervisor. In the first case, breaking the web server results in a direct attack to the mail server. In the second case, after breaking the web server you still need to go against the smaller attack surface of the hypervisor.

> Thinking about keeping mail server bugs in the mail server is a red herring; it just isn't important to users, so this isn't how you should evaluate "security."

It's about keeping web server bugs in the web server, and not have them infect the mail server.

geocar · on Oct 15, 2016

> I only recall two really serious vulnerabilities.

Having a discriminating memory doesn't help you build programs that have no bugs. It only helps you feel better about the bugs in the programs you create. Meanwhile, real users believe DoS are serious vulnerabilities.

> If you place mail+web server on the same machine without a hypervisor, things can certainly be less secure than if you place multiple services on the same machine separated by the hypervisor.

No. It isn't certain. The hypervisor is just more code. Why should anyone believe more code is going to produce fewer bugs?

> The "memory sinkhole" problem does not apply to virtualization.

Why do you think so?

Users believed that to "place multiple services on the same machine separated by the hypervisor" would be more secure, and then it turned out it wasn't. If only they had used a simpler system they would have had actual security. This is evident.

> It's about keeping web server bugs in the web server, and not have them infect the mail server.

Wrong. A bug in the Linux PCI code affects web servers. A bug in ext4 affects a mail server. Don't you get it? Big programs have bugs.

_pmf_ · on Oct 13, 2016

> But it does help as an additional layer of security.

You don't add security; you multiply the insecurity (think of traditional component reliability calculation; a system that consists of one component with 50% reliability and another component with 50% reliability is not 100% reliable, but 25%).
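The arithmetic in this comment can be spelled out as a minimal sketch. It assumes the textbook series model, where the system works only if every component works and failures are independent; the function name `series_reliability` is made up for illustration.

```python
from math import prod


def series_reliability(component_reliabilities):
    """Reliability of a chain in which every component must work.

    Assumes independent failures (the textbook series model).
    """
    return prod(component_reliabilities)


# Two components at 50% each: the chain as a whole is less reliable than either.
print(series_reliability([0.5, 0.5]))
```

Under this model, adding any component with reliability below 100% can only lower the total.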

bonzini · on Oct 13, 2016

You can also think of virtualization in terms of redundancy. If you can break out of userspace and into the kernel, you now have to break out of the hypervisor (which has a smaller attack surface) in order to escape a very constrained environment. And by hypervisor I really mean the kernel module, because the device emulation is also running heavily constrained through pledge/seccomp/SELinux/whatever.
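The redundancy view can be sketched as well, under a different assumption from the component-reliability calculation quoted earlier in the thread: here the attacker has to defeat every layer in turn, and each breach is treated as an independent event. The function name and the probabilities are purely illustrative, not measured figures.

```python
from math import prod


def full_escape_probability(layer_breach_probabilities):
    """Chance that an attacker gets through every layer in turn.

    Assumes the layers are breached independently, which real systems
    only approximate.
    """
    return prod(layer_breach_probabilities)


# Hypothetical numbers: the guest kernel is breached half the time,
# the hypervisor one time in ten.
guest_only = full_escape_probability([0.5])
guest_plus_hypervisor = full_escape_probability([0.5, 0.1])
print(guest_only, guest_plus_hypervisor)
```

Whether an extra layer multiplies the chance of a full escape (as here) or multiplies the overall reliability (any single bug hurts the user) is exactly what the two sides of this thread disagree about.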

mtgx · on Oct 13, 2016

So Theo is just like Linus, except for the part where he cares about security.

_pmf_ · on Oct 13, 2016

> and in response to an assertion that the use of VMs would increase security

That was his key point. He wanted to destroy the myth that isolation via VM will automatically make it safe to share a (physical) machine with a malicious attacker that also has a VM on the machine.

yellowapple · on Oct 15, 2016

The key point about that rant is not that virtualization itself is terrible, but rather that the implementations of virtualization are terrible. OpenBSD introducing vmm is consistent with his earlier remarks for that reason, since OpenBSD has a track record of not being terrible.

Hopefully vmm will grow said track record.

ysleepy · on Oct 13, 2016

Surprised that it's not a port of bhyve.

I also found some slides from the authors of vmm/vmd:

http://bhyvecon.org/bhyvecon2016-Mike.pdf

http://bhyvecon.org/bhyvecon2016-Reyk.pdf

(found on https://wiki.freebsd.org/bhyve)

notaplumber · on Oct 13, 2016

bhyvecon was held at the same time as AsiaBSDcon, Mike Larkin & Reyk Flöter were invited to speak. I think OpenBSD vmm/vmd was announced around the same time, so it became an informal "BSD virtualization" event.

Mike's original announcement of virtualization support answered the 'why' regarding porting another hypervisor, but honestly for something as security critical as this, I'm glad they took their time to design something for OpenBSD's needs.

https://marc.info/?l=openbsd-tech&m=144104398132541&w=2

ams6110 · on Oct 13, 2016

OpenBSD and FreeBSD diverged long enough ago that a port might not be straightforward, or in alignment with the objectives of the project.

derefr · on Oct 13, 2016

Bigger things have been ported across wider gaps. Illumos has a "port" of Linux's KVM!

4ad · on Oct 13, 2016

Yeah, and the KVM in illumos comes from a now obsolete Linux, and the pf in FreeBSD and Solaris comes from a now obsolete OpenBSD.

If anything, this should teach us not to port unportable things from different operating systems, because the maintenance cost is untenable.

phessler · on Oct 13, 2016

Solaris' port of PF is not out of date. _And_ they are working with upstream to merge their changes so both parties have a) better code, and b) less maintenance burden.

allanjude · on Oct 14, 2016

iianm, Solaris has a port of IPF, not pf. Oracle is working on trying to port pf now.

4ad · on Oct 15, 2016

The pf has been ported and it shipped already in Solaris 11.3. It is based on OpenBSD 5.5 code.

ams6110 · on Oct 13, 2016

From the thread: Currently this is limited to Intel hosts. We would like to get AMD also supported, but that requires some more work.

Still very cool -- happy to have an option other than qemu.

sigjuice · on Oct 13, 2016

Will this run nested inside VMware Fusion? I don't have a spare computer to try this out.

phessler · on Oct 13, 2016

If VMware Fusion exposes the virtualization features to the guest, then yes. If not, no.

But then again, that is true for all VM-nested software.
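One way to check that condition from inside a guest is to look for the hardware virtualization CPU flags. The sketch below assumes a Linux guest, where `/proc/cpuinfo` lists `vmx` (Intel VT-x) or `svm` (AMD-V) on its `flags` line; the helper name `virtualization_flags` is hypothetical, and other operating systems expose this information differently.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Keep only the two flags that indicate VT-x or AMD-V.
            found.update(flag for flag in value.split() if flag in ("vmx", "svm"))
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = virtualization_flags(f.read())
    except OSError:
        # Not a Linux system, or /proc is unavailable.
        flags = set()
    print("virtualization flags visible to this system:", sorted(flags))
```

An empty result inside a guest suggests the outer hypervisor is not passing the feature through, in which case a nested hypervisor that depends on it would not run.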

justincormack · on Oct 13, 2016

Probably, if you enable nested virtualisation it runs most things, eg nested hyper-v, nested xhyve.

dijit · on Oct 13, 2016

typically not, once you use a virtualisation instruction on the CPU it cannot be passed to the guests.

Nested virtualisation typically uses emulation without hardware acceleration.

mrweasel · on Oct 13, 2016

I don't see why you couldn't pass the virtualisation instructions to a virtual machine. Modern hypervisors don't emulate the CPU instructions any more; they're passed through to the virtual machine.

While I'm certainly not sure, I do believe that at least some virtualisation solutions are able to pass the virtualisation instructions from the host CPU to the guest.

bonzini · on Oct 13, 2016

You only need to emulate the virtualization instructions. KVM does this and pretty much everyone else does too.

wila · on Oct 13, 2016

Hyper-V runs under Fusion, Qubes OS and Xen also work. Have not tested others, but this is one area where Fusion excels. Yes it will be slower, but for testing the concept that is usually fine.

sigjuice · on Oct 13, 2016

I needed Xen for some experimentation. I had trouble running alpine-xen-3.4.4 in Fusion. XenServer 7 seems to work in Fusion, but I don't like it. I finally managed to get alpine-xen running inside KVM (Ubuntu 16.04) inside Fusion.