Impact of Meltdown kernel updates on Hercules performance

First data (2018-01-14)

The kernel page-table isolation (KPTI) patches recently introduced to mitigate the Meltdown security vulnerability increase the overhead of system calls and thus affect system performance. I wondered whether this can be seen with Hercules, and indeed there are cases where the instruction time increases by more than a factor of two!

I used the s370_perf instruction time benchmark, now available as GitHub project wfjm/s370_perf.

I ran the benchmark under MVS 3.8J with Hercules as included in tk4-, in a dual-CPU configuration (NUMCPU=2 MAXCPU=2), before and after the Spectre/Meltdown updates were installed. The CS, CDS, and TS tests in the lock missed configuration show a clear effect: times are up by more than a factor of two. All other tests stay the same within measurement precision. See the test reports

and inspect tests T292, T297, and T621. In summary (instruction times in ns):
Tag   Comment                :    before     after
T292  LR;CS R,R,m (ne)       :    333.92    726.15
T297  LR;CDS R,R,m (ne)      :    334.79    742.46
T621  MVI;TS m (ones)        :    342.58    729.77

As said, all other instruction times are essentially unchanged. What happened is easy to explain. The CS, CDS, and TS emulation code contains

  if (sysblk.cpus > 1)  sched_yield();
so that spinlocks are handled efficiently in the lock missed case. That is why the lock missed case shows a substantially longer instruction time than the lock taken case (which takes only about 80-90 ns). This test is therefore essentially a system call benchmark and thus very sensitive to the KPTI patch.
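The claim that the lock missed tests are effectively a system call benchmark can be checked independently of Hercules. The following is a minimal sketch, not part of Hercules or s370_perf: `now_ns` and `yield_cost_ns` are hypothetical helper names, and the code assumes a POSIX system providing `clock_gettime` and `sched_yield`. It times a tight loop of `sched_yield()` calls and returns the average cost per call.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <time.h>

/* Current reading of the monotonic clock in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average wall-clock cost of one sched_yield() call, in ns.
 * Hypothetical helper for illustration; returns 0 for a
 * non-positive iteration count. */
double yield_cost_ns(long iterations)
{
    if (iterations <= 0)
        return 0.0;

    double start = now_ns();
    for (long i = 0; i < iterations; i++)
        sched_yield();          /* one system call per iteration */
    return (now_ns() - start) / (double)iterations;
}
```

Calling `yield_cost_ns(1000000)` from a small `main` and printing the result before and after the kernel update should show the change in raw system call cost on a given host. The absolute value will differ from the Hercules numbers above, since the emulated instruction does more than the bare yield.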

Really nice to see this with such clarity. The practical impact for normal code is likely negligible though, which is why I resisted the temptation to title the thread 'Hercules a factor 2 slower' :).

More data and analysis (2018-01-28)

The Meltdown vulnerability is caused by a combination of

The key culprit is the delayed exceptions. This is a feature of a concrete implementation of the processor architecture, not of the processor architecture itself. That is why, for example, Intel has this unfortunate feature, while AMD claims it does not.

It is the host CPU that is vulnerable, of course, not the emulated CPU. The side-channel attack requires good time resolution, so it's imho unlikely that System/390 code executed by Hercules can be either source or target of an attack.

What one sees is only the performance impact of the mitigation. The kernel page-table isolation (KPTI) patches rolled out by all OS vendors slow down system calls; the amount depends on CPU generation and OS version. Newer Intel CPUs, Haswell or later, support Process Context Identifiers (PCID), and newer kernels, like Linux 4.14.11 and later, can use this to reduce the performance impact of KPTI. In general, older CPUs with older OS versions will take a bigger performance hit than newer CPUs with newer kernel versions.
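Whether a given Linux host falls into the 'PCID available' category can be read from the `flags` line of `/proc/cpuinfo`, which lists `pcid` when the CPU supports it; mainline kernels with KPTI active also list `pti` there, though vendor backports may label it differently. The sketch below assumes that layout; `has_cpu_flag` and `cpuinfo_has_flag` are hypothetical helper names introduced only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` occurs as a whole word in `line`, else 0.
 * A plain substring search is not enough: "pcid" must not
 * match inside "invpcid". */
int has_cpu_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    if (n == 0)
        return 0;

    const char *p = line;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p++;                    /* continue after this partial match */
    }
    return 0;
}

/* Check the first "flags" line of a cpuinfo-style file.
 * Returns 1 or 0, or -1 if the file cannot be opened or has
 * no "flags" line. */
int cpuinfo_has_flag(const char *path, const char *flag)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char line[8192];
    int result = -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "flags", 5) == 0) {
            result = has_cpu_flag(line, flag);
            break;
        }
    }
    fclose(f);
    return result;
}
```

On a Linux host, `cpuinfo_has_flag("/proc/cpuinfo", "pcid")` and `cpuinfo_has_flag("/proc/cpuinfo", "pti")` would then indicate whether PCID is available and whether KPTI is reported as active.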

The test case sys1 shown in the last posting was generated on

I've done another test case nbk2 on

The test reports are under

In this case one gets (instruction times in ns)

Tag   Comment                :    before     after
T292  LR;CS R,R,m (ne)       :   2291.28   3854.92
T297  LR;CDS R,R,m (ne)      :   2295.46   3831.74
T621  MVI;TS m (ones)        :   2320.39   3812.82

Comparing both systems with s370_perf_sum gives

Tag   Comment                :    sys1-a    sys1-b    nbk2-a    nbk2-b
T100  LR R,R                 :      3.07      3.06      3.53      3.56
T101  LA R,n                 :      3.91      3.90      4.07      4.09
T102  L R,m                  :     12.81     12.80     11.86     11.90
T110  ST R,m                 :     12.79     12.79     12.32     12.23
...
T292  LR;CS R,R,m (ne)       :    333.92    726.15   2291.28   3854.92
T297  LR;CDS R,R,m (ne)      :    334.79    742.46   2295.46   3831.74
T621  MVI;TS m (ones)        :    342.58    729.77   2320.39   3812.82
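The before/after columns can be reduced to two simple figures per test: the slowdown factor (after divided by before) and the added time (after minus before). The sketch below does that arithmetic on the values from the table; the `struct timing` type and the function names are hypothetical and not part of s370_perf or s370_perf_sum.

```c
#include <stddef.h>

/* One row of the before/after table; times in ns. */
struct timing {
    const char *tag;
    double before_ns;
    double after_ns;
};

/* Values copied from the table above. */
static const struct timing sys1[] = {
    { "T292", 333.92, 726.15 },
    { "T297", 334.79, 742.46 },
    { "T621", 342.58, 729.77 },
};
static const struct timing nbk2[] = {
    { "T292", 2291.28, 3854.92 },
    { "T297", 2295.46, 3831.74 },
    { "T621", 2320.39, 3812.82 },
};

/* Ratio of the instruction time after the update to before. */
double slowdown_factor(const struct timing *t)
{
    return t->after_ns / t->before_ns;
}

/* Absolute time added per instruction by the update, in ns. */
double added_ns(const struct timing *t)
{
    return t->after_ns - t->before_ns;
}

/* Arithmetic mean of the slowdown factors of n rows. */
double mean_slowdown(const struct timing *rows, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += slowdown_factor(&rows[i]);
    return n ? sum / (double)n : 0.0;
}
```

With these numbers the slowdown factor is about 2.1 to 2.2 on sys1 and about 1.64 to 1.68 on nbk2, while the added time per instruction is roughly 0.4 µs on sys1 and roughly 1.5 µs on nbk2.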

Observations are

Both systems likely fall into the 'old CPU' plus 'old kernel' category and thus show the worst-case impact of the KPTI kernel patches.

For the original posting to the Yahoo! Group Hercules-390, see topic 82874. Dead link since 2020-12-15: Yahoo! Groups was discontinued by Verizon.