This time, the experiment was repeated on the same node (btc2 on the DKRZ test system) with the exact same kernel; in one case KPTI was enabled, and in the other it was disabled using the debug interface /sys/kernel/debug/x86/pti_enabled.
In addition, the overall benchmark suite was run not just 50 times but more than 500 times in each configuration.
Again, 10 processes have been used.
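For readers who want to reproduce the switch between the two configurations, here is a minimal sketch of reading and toggling the debug interface mentioned above. The helper names `read_kpti` and `set_kpti` are hypothetical and not part of the original setup; the sketch assumes root privileges, a mounted debugfs, and a kernel that exposes this file.

```python
from pathlib import Path

# Debug interface named in the text; needs root and a mounted debugfs.
PTI_SWITCH = Path("/sys/kernel/debug/x86/pti_enabled")


def read_kpti(path: Path = PTI_SWITCH) -> bool:
    """Return True if the switch file currently contains '1'."""
    return path.read_text().strip() == "1"


def set_kpti(enabled: bool, path: Path = PTI_SWITCH) -> None:
    """Write '1' (enable) or '0' (disable) to the switch file."""
    path.write_text("1" if enabled else "0")
```

Calling `set_kpti(False)` before one series of runs and `set_kpti(True)` before the other would keep node and kernel identical, so that only the KPTI state differs.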
An overview is given in the following table:
|Experiment||Relative speed with KPTI|
This means that a few experiments now run about 3% slower with KPTI (find, mdtest_hard_read), while easy write is 1% faster.
The following graphs provide boxplots of the individual repeated measurements with KPTI enabled and disabled:
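As an illustration of how such a relative speed can be derived from repeated runs, the sketch below divides the median result with KPTI by the median without it. Using the median is an assumption (this post does not state which aggregate was used), the function name `relative_speed` is hypothetical, and the sample numbers are made up; the metric is assumed to be "higher is better", such as throughput or operations per second.

```python
from statistics import median


def relative_speed(with_kpti, without_kpti):
    """Ratio of the median result with KPTI to the median without KPTI.

    A value below 1.0 means the benchmark ran slower with KPTI enabled.
    """
    return median(with_kpti) / median(without_kpti)


# Made-up example values, not measurements from this experiment.
with_kpti = [97.0, 96.5, 97.5]
without_kpti = [100.0, 99.0, 101.0]

ratio = relative_speed(with_kpti, without_kpti)
print(f"relative speed with KPTI: {ratio:.2f}")
```

With more than 500 repetitions per configuration, the median is robust against the occasional outlier run.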
|Fig1: IOR measurements|
|Fig2: Metadata measurements|
While there are some outliers in both configurations, the overall picture looks comparable.
The impact of KPTI on I/O benchmarks is negligible on our system, particularly because Lustre is significantly slower than tmpfs. The IO-500 results are consistent with the finer-grained latency analysis in my previous post about the latency of individual operations.
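To make the notion of an outlier in a boxplot concrete, here is a small sketch that computes the quartiles and flags points beyond 1.5 times the interquartile range, the common Tukey convention. Whether the plots in this post use exactly this rule is an assumption, the function name `boxplot_summary` is hypothetical, and the sample values are invented.

```python
from statistics import quantiles


def boxplot_summary(samples):
    """Return quartiles and Tukey outliers (beyond 1.5 * IQR) for samples."""
    q1, q2, q3 = quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in samples if x < low or x > high]
    return {"q1": q1, "median": q2, "q3": q3, "outliers": outliers}


# Invented values: five similar runs and one slow straggler.
summary = boxplot_summary([10, 11, 12, 13, 14, 50])
print(summary)
```

Comparing the medians and quartiles of both configurations, rather than single runs, is what supports the statement that the two distributions look alike despite individual outliers.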