Hacker News
Bisected: The Unfortunate Reason Linux 4.20 Is Running Slower (phoronix.com)
227 points by zrm on Nov 17, 2018 | hide | past | favorite | 104 comments


In case people don't like the headline style (leaving out the reference to the answer), the answer given at the end of the article is new Spectre mitigations.

We knew that speculative-execution mitigations would sometimes reduce performance, and evidently they do. :-(


[flagged]


That number seems a bit high.


This was known. https://lwn.net/Articles/765837/ reports 21% performance impact and the plan was to handle STIBP as an opt-in security feature. I am not sure why that was not furthered in the patches that have actually been committed to Linux, as I did not follow them closely.


Apparently the way to turn this and other mitigations off is the following mouthful:

    pti=off spectre_v2=off l1tf=off nospec_store_bypass_disable no_stf_barrier
Would it make sense to have a single flag to "run insecure but fast" that we can use on pure development machines, test servers and the like? My Intel development server only runs code I choose.
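For reference, here is a small Python sketch (my own helper, not an official tool) that checks whether those disable flags appear on a kernel command line:

```python
# Sketch: check which of the mitigation-disabling flags quoted above
# are present in a kernel command line string.
DISABLE_FLAGS = [
    "pti=off",
    "spectre_v2=off",
    "l1tf=off",
    "nospec_store_bypass_disable",
    "no_stf_barrier",
]

def missing_flags(cmdline):
    """Return the disable flags NOT present in the given command line string."""
    present = set(cmdline.split())
    return [flag for flag in DISABLE_FLAGS if flag not in present]

# Made-up example command line; on a live system you would pass
# open("/proc/cmdline").read() instead.
print(missing_flags("BOOT_IMAGE=/vmlinuz root=/dev/sda1 pti=off spectre_v2=off"))
# ['l1tf=off', 'nospec_store_bypass_disable', 'no_stf_barrier']
```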


Hey boss, I don't understand why the code works in devel but fails in production?


You're implying (with unnecessary snark) that:

1. Development machines should be configured identically to production machines. (Do you install a GUI and development tools on your production servers?) Occasional differences in behavior between development and production are par for the course, and are why staging environments are commonly used.

2. The mitigations affect the execution result of code likely to be developed/executed at a typical software shop. AFAIU the attacks are timing-based, and won't affect valid code that's not specifically looking to exploit them.


Try convincing a corporate security department in a large org to sign off on disabling these anywhere.


Not everyone has ‘production’. A mathematician running experiments only locally, for example.


Surely that objection applies to being able to turn off mitigation at all, not to whether the invocation that does so is short or (as currently) long?


:p yes, there will be some of that around.


What's an example of typical application code that will behave differently with the mitigations on or off?


> What's an example of typical application code that will behave differently with the mitigations on or off?

Pretty much any application where there is limited execution time budget. If it takes too long it's considered broken, unusable, undesirable etc. Any realtime-ish stuff.

Ad bidding, video processing, audio processing / messaging, industrial control systems, robotics, general signal processing, logging systems that will now be overwhelmed and drop messages, databases that will experience timeouts and retries, scientific computations that will now take weeks more to run and potentially screw up other projects.
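To make the time-budget point concrete, here is a toy sketch (the 60 fps budget and the stage timings are illustrative numbers of mine, not from any benchmark):

```python
# A 60 fps video pipeline gets ~16.7 ms per frame. A stage that fit the
# budget before the mitigations can miss the deadline after a ~25% slowdown.
FRAME_BUDGET_MS = 1000 / 60  # ~16.67 ms per frame at 60 fps

def fits_budget(stage_ms, slowdown=1.0):
    """True if the (possibly slowed-down) stage still meets the frame deadline."""
    return stage_ms * slowdown <= FRAME_BUDGET_MS

print(fits_budget(14.0))        # True: 14.0 ms fits the budget
print(fits_budget(14.0, 1.25))  # False: 17.5 ms drops the frame
```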

But hey, it's a great time for Intel. Hey, psst, over here, I've got a faster later-gen CPU for you, for the easy price of $999.99, to bring your performance back to where it was last week.


Certainly the speed of its execution. That could be difference enough to be concerning when shifting from dev to prod.


I don't think you can turn off all mitigations, since the retpoline (https://stackoverflow.com/questions/48089426/what-is-a-retpo...) stuff is generated by the compiler. You'd probably need to recompile the kernel. How much retpoline is responsible for the performance hit, I don't know.

Also, the kernel flags don't block Intel microcode "improvements". You have to do this with:

  sudo apt-mark hold intel-microcode
and look for what's currently installed.

Mitigation isn't a fix, it's a bandaid.


It makes sense (to me) and it has been discussed. I don't know what the conclusion was, though.


This alphabetti spaghetti of boot flags is getting ridiculous, there seems to be no central list anywhere, just random snippets across the web. Why is there not a Linus rant when you really need one? :-(


To be fair there is a list, although as you say the flags themselves are pretty random. Of course they can't be easily changed because that would break everyone's boot environment.

https://github.com/torvalds/linux/blob/master/Documentation/...


    lsmod
    modinfo ${somemodule}
    man lsmod
    man modinfo

This is elementary-level Linux; you should know this.


Intel was supposed to have a 20-30% advantage in single-thread performance over AMD. Now I guess single- or multiple-thread, AMD is better on Linux. I don't know why it makes me giddy thinking of the coming consumer wars between the two.


This is how it should be. And this is going to be a net positive for AMD.

If performance takes a hit by about 20% it is going to be noticeable. Next up I think people will start looking to upgrade to a faster CPU just to get back to last week's performance metrics. They might now consider AMD.

If AMD finds this affects them less, they should sponsor security researchers writing proof-of-concept exploits to convince people that turning off the mitigation is an absolute no-no, forcing people to take a performance hit and go shopping for CPUs.


Is there a comparison table of CPU vulnerabilities? I would like to know which CPU has the least vulnerabilities or the best performance after patching.

From what I have read so far it seems that AMD CPUs have had the fewest vulnerabilities/slowdowns? But I can't be sure since I haven't seen a complete comparison (including these new vulnerabilities).


I don't know of any comparison table. There are two families of vulnerabilities at play when talking about Spectre and co:

- The general Spectre family, which affects most (if not all) CPUs built with speculative execution.

- Meltdown and L1TF, which only affect Intel CPUs due to them delaying security checks until after speculation has taken place.

AMDs, ARMs, etc. that use speculative execution are going to be vulnerable to at least some variant of Spectre (there are 4 variants known right now). ARM published a table[0] explaining which of their CPUs are vulnerable to which variants. I'm not aware of any such table for Intel or AMD.

Microsoft published some interesting tables[1] explaining which mitigations protect against which Spectre variant, and under which threat models they operate.

[0]: https://developer.arm.com/support/arm-security-updates/specu...

[1]: https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-... (scroll down for the tables)


Latest test ( 17 November 2018 ) :

"The Spectre/Meltdown Performance Impact On Linux 4.20, Decimating Benchmarks With New STIBP Overhead"

https://www.phoronix.com/scan.php?page=article&item=linux-42...


According to this [1], the coffee lake refresh has hardware mitigation to meltdown variant 3 and variant 5 (which are only a few of all the things).

Would that hardware mitigation reduce this performance loss?

[1]: https://www.anandtech.com/show/13400/intel-9th-gen-core-i9-9...


Is Intel already selling CPUs which don't need these patches?


Not in x86 architecture.


Then in which architecture?


They still sell Itanian, I don’t think Spectre/Meltdown affect that product line.


*Itanium, and they would likely suffer from the same attacks (maybe not, due to being a completely different architecture, but being from the same company, I'm not sure there isn't cross-pollination between the two). But I don't think anyone has looked at them, due to their small market presence.


I would expect the opposite. Meltdown and Spectre both make use of peeking at implicit state that chips hang on to in order to optimize microcode; the entire goal of Itanium was to offload this optimization from the chip to the compiler.

A little Googling found an analysis from someone more knowledgeable than myself that backs this up: https://secure64.com/not-vulnerable-intel-itanium-secure64-s...

tl;dr Meltdown takes advantage of out-of-order execution, which Itanium simply doesn't have. Spectre makes use of speculative execution, which Itanium only has in a version too limited to support the attack.

I'm not an expert in this field so I can't attest to the credibility of this analysis. The Spectre part in particular sounds a little hand-wavy but it might just be over my head. But it aligns with my intuition, which is that the simpler and more explicit architecture doesn't have as many places where data can accidentally end up.


This is somewhat unfortunate for older-gen hardware. The difference between chugging-along but usable vs 20% reduced performance is crippling.


I was thinking that it might be an overall large boon to Intel due to all the new computers everyone will buy...


I am a little fearful at how hard it will impact my X220i with its Sandy Bridge i3. Hopefully it won't be too bad.


You can always disable the mitigations or even just stay on old kernels.


1.3-1.4x slowdown is a lot more than I expected (I know it's for synthetic benchmarks but still...)

Can someone explain (or link to an article) how a tweak to HT branch prediction heuristic can have such a huge impact on performance?


The impact is big enough that one would suspect the microcode simply disables indirect branch prediction, so you pay a 16-20 cycle penalty per branch. Indirect branches just aren't frequent enough to explain such a regression via say a simple reduction in prediction resources.
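That back-of-envelope can be written out explicitly (my illustrative numbers: a 1% dynamic share of indirect branches and an 18-cycle penalty each, not measurements):

```python
# Model: every indirect branch pays a fixed extra penalty; everything else
# runs at the baseline CPI. Slowdown = new CPI / baseline CPI.
def slowdown(branch_fraction, penalty_cycles, baseline_cpi=1.0):
    new_cpi = baseline_cpi + branch_fraction * penalty_cycles
    return new_cpi / baseline_cpi

# Even 1% indirect branches at 18 extra cycles each gives only ~1.18x,
# which is why merely-degraded prediction struggles to explain a 1.3-1.4x hit.
print(round(slowdown(0.01, 18), 2))  # 1.18
```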

I can test it once I get the new firmware.


> Indirect branches just aren't frequent enough

aren't vtable calls / function pointer calls indirect branches ?


Yes, they are (at least when the compiler cannot devirtualize them) - but they make up a fairly small fraction of the total instructions in a typical program - and probably very small in something like cinebench, which also showed a big regression.


> but they make up a fairly small fraction of the total instructions in a typical

but if their cost increases by a large factor... besides, in any large compiled program, the core would certainly be based around some kind of programmable pipeline, and these would generally be implemented like this unless they wrote their own JIT compiler.


Yes - but for their cost to increase by such a large factor, the only obvious thing I can think of is that their prediction is disabled.

I didn't follow your comment about a "programmable pipeline". I don't think many or any of the Phoronix benchmarks are based on a pipeline with indirect branches at their core.


> I don't think many or any of the Phoronix benchmarks are based on a pipeline with indirect branches at their core.

I think a bunch are. e.g. for instance FFMPEG / libavfilter which is basically a node graph set up at runtime. Don't know for cinebench since it's closed source, but Blender present in the benchmarks is also based around a nodal rendering architecture. Stuff like PHP / CGI also heavily depend on function pointers for their behaviours - PHP with its plugin architecture, and CGI where all web requests go through FPs : https://github.com/php/php-src/blob/master/main/fastcgi.c#L8....


Right, I think I understand what you are saying about "pipelined" implementations. Sure, I can believe that at a high level there are some indirect branches to implement some kind of processing pipeline: but you'd have thousands or millions of instructions doing the heavy lifting for each chunk of data that passes though the pipeline, for every branch that you need to take to get to the next stage.

So I still doubt that that indirect branches are "dense" in those benchmarks: it just doesn't make sense since the core work they are doing are highly tuned encode/decode/whatever kernels, even if there is a control layer over top of that using indirect branches.
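The density argument can be illustrated with a toy node-graph pipeline (illustrative Python of mine, not taken from any of the benchmarks):

```python
# One indirect call per pipeline stage, but many "work" operations per chunk,
# so the indirect branches are a tiny fraction of the dynamic instruction count.
def scale(chunk):   return [x * 2 for x in chunk]
def offset(chunk):  return [x + 1 for x in chunk]

PIPELINE = [scale, offset]  # set up at runtime, like a node graph

def run(chunk):
    for stage in PIPELINE:    # each iteration: one indirect call...
        chunk = stage(chunk)  # ...amortized over len(chunk) operations
    return chunk

print(run([1, 2, 3]))  # [3, 5, 7]
```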


It is unlikely that the branch prediction heuristics are the problem (those are in Intel's microcode).

The problem is in the mitigations necessary to make it impossible/more difficult to exploit these side channel attacks. And that is costly, because memory needs to be moved around constantly. So that adds a ton of extra overhead whenever a context switch is made.


The lede is literally the last line in the article:

> But why is it slower? More work on f&#!#(# Spectre!



No mention of the fact that AMD seem unaffected?


It's mentioned in two different places in the article.


Indeed, though the headline seems kind of odd to me given that it's specifically an Intel problem as far as I can tell, it'd be nice to have _some_ mention in there (even if people following this can guess anyway).


I don't like how these Spectre mitigations are being rolled out with no cost/benefit analysis. On a client machine, I'd rather just disable Javascript than pay a 30-50% performance penalty for mitigating these vulnerabilities.


A system running a browser probably has other problems anyway; but consider every supercomputer ever, or the vast majority of embedded systems.

Running other peoples code (on other peoples machines) is a cloud thing, but they have an iron grip on the kernel and other software.


I would rather wait for reports of dangerous JavaScript exploits in the wild, myself. Considering that no real-world exploits of Spectre and related attacks have actually been reported, the odds that I'll fall victim to such an exploit without hearing about it in plenty of time to mitigate it are in the millions to one.

Instead, everyone has to pay the performance tax proactively, regardless of their individual threat model. That is not how this sort of thing should be handled.


Would you like a nuclear power plant to be run with this kind of approach to safety? Given the number of possible targets and how unreliable people can be at assessing what risk they're exposed to, having mitigation active by default is obviously the only sane answer.


Somehow I doubt nuclear power plants run (unmodified) Linux kernels released in the last 5 years


Exactly... they're running on XP so it's all good


When I went to Sellafield they had Win3.1 on the fuel rod chopping machine in the reprocessing plant (admittedly in '97!)


Psst, every machine on the Internet must now be hardened to conform to nuclear and aerospace industry standards. So sayeth the Priesthood of Infosec. Haven't you heard?


That's probably not enough.


Sorry, but that's my call, not yours. My computer is not a nuclear power plant.


Are there any of those changes you can't disable for your computer? I thought all(?) of them were behind flags, since different hardware needs different sets.


The Linux team also needs to make a call on the kernel they maintain. You can always choose to stick with an older kernel or even patch out the fixes if you like.


Yeah, that scales really well. The answer to every overreaction and questionable policy is always, "Just rebuild your kernel."


I have been meaning to ask about this. How vulnerable are desktops to these exploits?


As has already been said JavaScript, Flash and such are the biggest risk. Browser vendors have tried to mitigate the attacks, but last I heard, only Mozilla's mitigation was actually effective at preventing attacks.

I'd also personally not feel comfortable using a Windows PC, given that running unchecked 3rd-party code is sort of required to do anything with it.


There are exploits written in Javascript that run in a browser.


Is there any way these could be mitigated at the browser level in this case?
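To a degree, yes. One mitigation browsers actually shipped was coarsening their high-resolution timers (and disabling SharedArrayBuffer), since the attacks need to time cache hits against misses. A sketch of the timer-coarsening idea:

```python
# Round timestamps down to a coarse granularity so that the small timing
# differences between cache hits and misses become indistinguishable.
def coarsen(timestamp_us, granularity_us=100):
    return timestamp_us - (timestamp_us % granularity_us)

print(coarsen(123_456))  # 123400
print(coarsen(123_499))  # 123400: a ~43us hit/miss delta is now invisible
```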


Do you expect Grandpa to do that analysis? The mitigations are probably behind KCONFIG flags anyway.


No need to modify build config, this and other spectre mitigations can be disabled in cmdline (spectre_v2=off)


If you could turn off speculative execution completely I bet that would eliminate all the security concerns, but performance would be abysmal. The Linux kernel really can't be a "one size fits all" type of thing --- an environment where no mutually untrusting code is run has very different security requirements from shared cloud hosts, for example. Personally I think it's all a bit overblown, probably due to owners of the latter environments.


Why is so much code being added to a kernel?

> [...] in terms of lines of code changed, with more than 354 thousand lines of new code added at the end of October when this merge window opened.


Drivers


Someone once had the great idea that drivers should be in the same codebase as the kernel. Crazy, I know.


It doesn’t seem crazy to me that something that runs in kernel mode and uses kernel APIs should be in the kernel codebase. What are the downsides?


Why should drivers necessarily run in the kernel? With appropriate abstraction for interrupts, DMA, etc. much of the work could be done in userland.


That’s true, but it’s AFAICT a separate question.

The question I was responding to was: “accepting that, rightly or wrongly, Linux drivers run in kernel mode, does it make sense for them to be in the source tree?”


The main downside is that a bad/buggy driver can take down the entire system.


That's a problem with the architecture but a reason for having the drivers in the kernel. It means such a case will be handled as a regression and fixed as opposed to the driver just never being available again.


Linux would have beaten Windows if it wasn't for that.


But at what cost? I don't want to run Linux with non-free drivers. That's not the point.


That's cool. On the other hand, I would love to have a stable API for drivers so manufacturers can release their drivers. That has been taken from me, and now Linux will never be a real choice for me without good GPU drivers, as an example.


Big win for AMD!!!


It's times like this when I'm glad I went with a Ryzen chip instead of whatever Intel has on offer. Not only do I get a ton of cores, but now my chip performs better than the competition!

The only real problem I've had has been with soft lockups, which I was able to solve by disabling a power saving feature on the CPU. If AMD doesn't mess things up, they can really catch up to Intel in the next couple years while Intel is working on fixing their issues.


Think I will wait at least 1-2 more years for all this to shake out before I buy a new computer.


You'll get a much better deal anyway if you do that. I never buy the current generation of hardware. Much better value per dollar to buy one or two generations old. Unless you absolutely need bleeding edge performance.


All the security changes, plus 7nm chips should be widely available and will hopefully bring a price decrease.


Does the article not actually mention the reason? (I mean, I know it's STIBP, but it never says that?)


It's on page two. Why a short article like this has pages, though...


I was going to say because it's Phoronix, and they've traditionally had very heavy advertising, but I decided to load it up and turn off ublock to check because I haven't visited Phoronix in quite a while, and it seemed to have little to no big advertising. Nice!

Now I assume it's because even if the textual content is quite small, all the images mean there's quite a bit of vertical length to the article. I had to scroll down quite a bit to get to the next page button. The second page is quite small though, so maybe a better page break algorithm is in order. e.g. break at size X unless total length is X*1.25, and if only two pages, try to make them somewhat even in size (erring on the side of a larger first page).


I saw the social buttons and assumed end of article, very bad design.


me too!



I assume this is not a problem if I disable hyperthreading?


Probably you'll lose more performance.


As always, this depends on your workload.


As of today still most of the software I use seems to get stuck in one core of my CPU when calculating stuff so I'd argue that by enabling HT I'm losing performance.


That's not really clear from the article, but I suspect that performance drop is from multithreaded software. And sure, if you are observing performance degradation with hyper-threading, there's zero reason to keep it enabled. The OpenBSD guys suggest turning it off, after all.


The article claims a 30% perf slowdown with PHP. The majority of PHP workloads are single threaded. That causes me to suspect this impacts single core workloads too.


Yes, on some processors basically it's as if all userspace was recompiled to use retpolines. Enabling STIBP promises that one thread cannot influence the indirect branch predictor's operation when the sibling runs; on those processors the indirect branch predictor is completely disabled, which is an inefficient but valid way to implement STIBP semantics.
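As an aside, the kernel reports the active spectre_v2 mitigation via sysfs; here is a small sketch that checks the status string for STIBP (the sample string is illustrative, since the exact wording varies across kernel versions):

```python
def stibp_active(status_line):
    """True if STIBP appears in a spectre_v2 mitigation status line."""
    return "STIBP" in status_line

# Sample status string of the kind 4.20-era kernels print:
sample = "Mitigation: Full generic retpoline, IBPB, IBRS_FW, STIBP"
print(stibp_active(sample))  # True
# Live check: stibp_active(
#     open("/sys/devices/system/cpu/vulnerabilities/spectre_v2").read())
```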


That's interesting. 30% is generally the back-of-the-envelope number for what hyperthreads get you in an OoOE core.


[flagged]


If only the kernel got more creative...


I'm just sitting here waiting for all this security fiasco to stop ruining performance for ridiculously unpractical attacks. Just a matter of time: the pressure for better performance, in a world where Moore's law is getting stuck, will eventually cause every sane person to throw these ridiculous patches out the window.

If the choice is between unsecure-but-fast and secure-but-slow, the default should be fast. Let applications that care about side channels deal with it their way; why the fuck should everything be so slow just so that the few apps which actually care about side channels will be secure?


And I'm sitting here waiting for the industry to start taking security seriously. These are not ridiculously unpractical attacks. These are attacks that have been hypothesized for years and are only now getting attention because people finally started to create (public) demonstrations. I would be shocked if there were not already non-public exploits based on these vulnerabilities.

The only reason these attacks even have the appearance of being unpractical is because there are so many other areas of our computing systems that are even easier to attack.

The only reason we are in this current Spectre et. al. mess is that CPU vendors choose to prioritize performance over security, and we now have to live with their bluff getting called for another hardware generation or two. I suspect that the performance impact will be far less when the fixes are done during the hardware design process, instead of figuring out how to bolt them on after the fact.

Ultimately, I think we are overdue for a new computer architecture (both in terms of security and performance; we are being encumbered by the need to keep new architectures backwards compatible with old ones).


This is just the mitigations at the kernel level, to protect the kernel from applications and applications from each other. Without this applications can't even attempt to secure themselves from these issues. I think to allow applications a chance to opt into being secure from these attacks without applying the fixes system-wide you'd need a way to disable speculative memory reads per page which would nerf performance even harder for the applications that turned that option on.


These attacks aren’t impractical at all. The very first announcement of spectre and meltdown came with trivial sample code.


I'm sitting here waiting for you to realize "unsecure" is not a word.



