Impact of Meltdown kernel updates on Hercules performance
First data (2018-01-14)
The Kernel page-table isolation (KPTI) patches recently introduced to mitigate the Meltdown security vulnerability increase the overhead of system calls and will thus impact system performance. I wondered whether that can be seen with Hercules, and indeed there are cases where the instruction timing increases by more than a factor of two!
I ran the benchmark, under
MVS 3.8J with
as included in
in a dual CPU configuration (
NUMCPU=2 MAXCPU=2) before and after
the updates fighting Spectre/Meltdown were installed.
The CS, CDS and TS tests in the lock missed configuration show a clear effect:
times are up by more than a factor of two. All other tests stay the same
within measurement precision. See the test reports
Tag   Comment              :  before   after
T292  LR;CS R,R,m (ne)     :  333.92  726.15
T297  LR;CDS R,R,m (ne)    :  334.79  742.46
T621  MVI;TS m (ones)      :  342.58  729.77
As said, all other instruction times are essentially unchanged.
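The "more than a factor of two" statement can be checked directly from the before/after columns. The following is a minimal sketch, not part of the benchmark itself: the struct, the array and the function names are made up for illustration, and only the numbers are taken from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* One table row: test tag plus instruction time before and after the update.
 * (Hypothetical helper type, introduced only for this illustration.) */
struct timing {
    const char *tag;
    double before;
    double after;
};

/* The three lock missed rows, copied from the table above. */
static const struct timing sys1_lock_missed[] = {
    { "T292", 333.92, 726.15 },
    { "T297", 334.79, 742.46 },
    { "T621", 342.58, 729.77 },
};

/* Slowdown factor of one row: time after the update divided by time before. */
double slowdown_factor(const struct timing *t)
{
    assert(t->before > 0.0);
    return t->after / t->before;
}

/* Smallest slowdown factor over n rows (n must be at least 1). */
double min_slowdown(const struct timing *rows, size_t n)
{
    assert(n > 0);
    double min = slowdown_factor(&rows[0]);
    for (size_t i = 1; i < n; i++) {
        double f = slowdown_factor(&rows[i]);
        if (f < min)
            min = f;
    }
    return min;
}
```

For the three rows the factors come out at roughly 2.1 to 2.2, so even the smallest one is above two.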
What happened is easy to explain. In the lock missed case Hercules calls
    if (sysblk.cpus > 1) sched_yield();
to get spin locks handled efficiently. That's why the lock missed case shows a substantially slower instruction time than the lock taken case (which takes only about 80-90 ns). So this test is essentially a system call benchmark and thus very sensitive to the KPTI patch.
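To look at the system call cost in isolation, without Hercules in the picture, one could time sched_yield() directly on the host. This is a minimal sketch under the assumption of a POSIX host with clock_gettime(); the function name yield_cost_ns is made up here, and this is not how the Hercules benchmark measures its numbers.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() and CLOCK_MONOTONIC */
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Average wall-clock cost of one sched_yield() call in nanoseconds,
 * measured over 'iterations' back-to-back calls.
 * (Hypothetical helper, for illustration only.) */
double yield_cost_ns(long iterations)
{
    struct timespec t0, t1;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iterations; i++)
        sched_yield();              /* the call used in the lock missed path */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double elapsed_ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                      + (double)(t1.tv_nsec - t0.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Calling yield_cost_ns() with a large iteration count before and after a kernel update should show a shift of the same kind as the lock missed tests, although the absolute values will differ from the Hercules instruction times.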
Really nice to see this with such clarity. The practical impact for normal code is likely negligible though, that's why I resisted the temptation to title the thread 'Hercules a factor 2 slower' :).
More data and analysis (2018-01-28)
The Meltdown vulnerability is caused by a combination of
- out-of-order execution
- speculative execution
- sub-optimal handling of L1 cache and TLB
- which leads to delayed exceptions
- which allow a side-channel attack
The key culprits are the delayed exceptions. This is a feature of the concrete implementation of a processor architecture, not of the processor architecture itself. That is why, for example, Intel CPUs have this unfortunate feature, while AMD claims its CPUs do not.
The host CPU is vulnerable, and of course not an emulated CPU. The side-channel attack requires good time resolution, so it's imho unlikely that System/390 code executed by Hercules can be either source or target of an attack.
What one sees is only the performance impact of the mitigation. The Kernel page-table isolation (KPTI) patches rolled out by all OS vendors slow down system calls; the amount depends on CPU generation and OS version. Newer Intel CPUs, Haswell or later, support Process Context Identifiers (PCID), and newer kernels, like Linux 4.14.11 and later, can use this to reduce the performance impact of KPTI. In general, older CPUs with older OS versions will take a bigger performance hit than newer CPUs with newer kernel versions.
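Whether a given Linux host reports PCID support can be checked from the CPU feature flags. The sketch below assumes an x86 Linux system where /proc/cpuinfo lists the features on lines starting with "flags" and shows PCID as the token "pcid". Both function names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if 'flag' appears as a whole whitespace-separated token after
 * the ':' of a "flags : ..." line, else 0. Whole-token matching matters,
 * because "pcid" is also a substring of "invpcid". */
int flags_line_has(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = strchr(line, ':');

    if (p == NULL || n == 0)
        return 0;
    p++;                                    /* step past the ':' */
    while (*p != '\0') {
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;                            /* skip separators */
        const char *start = p;
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n')
            p++;                            /* scan one token */
        if ((size_t)(p - start) == n && strncmp(start, flag, n) == 0)
            return 1;
    }
    return 0;
}

/* Return 1 if a "flags" line in /proc/cpuinfo contains 'flag',
 * 0 if none does, -1 if the file cannot be opened. */
int cpu_has_flag(const char *flag)
{
    char buf[8192];                         /* flags lines can be long */
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (!found && fgets(buf, sizeof buf, f) != NULL) {
        if (strncmp(buf, "flags", 5) == 0)
            found = flags_line_has(buf, flag);
    }
    fclose(f);
    return found;
}
```

A call like cpu_has_flag("pcid") then tells whether the CPU advertises the feature. Whether the running kernel actually uses it for KPTI is a separate question that this sketch does not answer.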
The test case sys1 shown in the last posting was generated on
- Intel(R) Core(TM)2 Duo CPU E8400
- Ubuntu 16.04 LTS with a 4.4.0 Linux Kernel
A second test case, nbk2, was generated on
- Intel(R) Core(TM) i5 CPU M520
- Ubuntu 14.04 LTS with a 3.13.0 Linux Kernel
- VirtualBox 5.0.12 r104815
- Windows 7
The test reports are under
In this case one gets (instruction times in ns)
Tag   Comment              :   before    after
T292  LR;CS R,R,m (ne)     :  2291.28  3854.92
T297  LR;CDS R,R,m (ne)    :  2295.46  3831.74
T621  MVI;TS m (ones)      :  2320.39  3812.82
Comparing both systems with s370_perf_sum gives
Tag   Comment              :  sys1-a  sys1-b   nbk2-a   nbk2-b
T100  LR R,R               :    3.07    3.06     3.53     3.56
T101  LA R,n               :    3.91    3.90     4.07     4.09
T102  L R,m                :   12.81   12.80    11.86    11.90
T110  ST R,m               :   12.79   12.79    12.32    12.23
...
T292  LR;CS R,R,m (ne)     :  333.92  726.15  2291.28  3854.92
T297  LR;CDS R,R,m (ne)    :  334.79  742.46  2295.46  3831.74
T621  MVI;TS m (ones)      :  342.58  729.77  2320.39  3812.82
- simple instructions, like ST, have very similar speed on both systems.
- lock misses are apparently more costly in a Linux under VirtualBox under Windows environment. Not too astonishing, most likely all three layers get involved in processing the system call.
- the relative KPTI patch impact is smaller on the nbk2 system, which is slow anyway, so it is hard to judge what's behind this.
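One way to look at the last point is to compare the relative and the absolute increase side by side. The sketch below uses the T292 row of the comparison table (times in ns). The struct and function names are made up for illustration, and the arithmetic says nothing about the cause.

```c
#include <assert.h>

/* Lock missed instruction time before and after the KPTI update, in ns.
 * (Hypothetical helper type, introduced only for this illustration.) */
struct lock_miss {
    double before_ns;
    double after_ns;
};

/* T292 (LR;CS, lock missed) values copied from the comparison table. */
static const struct lock_miss t292_sys1 = {  333.92,  726.15 };
static const struct lock_miss t292_nbk2 = { 2291.28, 3854.92 };

/* Relative impact: time after divided by time before. */
double relative_increase(struct lock_miss m)
{
    assert(m.before_ns > 0.0);
    return m.after_ns / m.before_ns;
}

/* Absolute impact: time added per instruction, in ns. */
double absolute_increase_ns(struct lock_miss m)
{
    return m.after_ns - m.before_ns;
}
```

For T292 this gives a factor of about 2.17 on sys1 versus about 1.68 on nbk2, but an added time of about 392 ns on sys1 versus about 1564 ns on nbk2. So the relative impact is smaller on nbk2 while the absolute cost added per lock miss is roughly four times larger.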
Both systems likely fall into the 'old CPU' plus 'old kernel' category and thus show the worst-case impact of the KPTI kernel patches.