Next Previous Contents

3. The Linux Benchmarking Toolkit (LBT)

I will propose a basic benchmarking toolkit for Linux. This is a preliminary version of a comprehensive Linux Benchmarking Toolkit, to be expanded and improved. Take it for what it's worth, i.e. as a proposal. If you don't think it is a valid test suite, feel free to email me your critics and I will be glad to make the changes and improve it if I can. Before getting into an argument, however, read this HOWTO and the mentionned references: informed criticism is welcomed, empty criticism is not.

3.1 Rationale

This is just common sense:

  1. It should not take a whole day to run. When it comes to comparative benchmarking (various runs), nobody wants to spend days trying to figure out the fastest setup for a given system. Ideally, the entire benchmark set should take about 15 minutes to complete on an average machine.
  2. All source code for the software used must be freely available on the Net, for obvious reasons.
  3. Benchmarks should provide simple figures reflecting the measured performance.
  4. There should be a mix of synthetic benchmarks and application benchmarks (with separate results, of course).
  5. Each synthetic benchmarks should exercise a particular subsystem to its maximum capacity.
  6. Results of synthetic benchmarks should not be averaged into a single figure of merit (that defeats the whole idea behind synthetic benchmarks, with considerable loss of information).
  7. Applications benchmarks should consist of commonly executed tasks on Linux systems.

3.2 Benchmark selection

I have selected five different benchmark suites, trying as much as possible to avoid overlap in the tests:

  1. Kernel 2.0.0 (default configuration) compilation using gcc.
  2. Whetstone version 10/03/97 (latest version by Roy Longbottom).
  3. xbench-0.2 (with fast execution parameters).
  4. UnixBench benchmarks version 4.01 (partial results).
  5. BYTE Magazine's BYTEmark benchmarks beta release 2 (partial results).

For tests 4 and 5, "(partial results)" means that not all results produced by these benchmarks are considered.

3.3 Test duration

  1. Kernel 2.0.0 compilation: 5 - 30 minutes, depending on the real performance of your system.
  2. Whetstone: 100 seconds.
  3. Xbench-0.2: < 1 hour.
  4. UnixBench benchmarks version 4.01: approx. 15 minutes.
  5. BYTE Magazine's BYTEmark benchmarks: approx. 10 minutes.

3.4 Comments

Kernel 2.0.0 compilation:

Whetstone:

Xbench-0.2:

UnixBench version 4.01:

BYTE Magazine's BYTEmark benchmarks:

3.5 Possible improvements

The ideal benchmark suite would run in a few minutes, with synthetic benchmarks testing every subsystem separately and applications benchmarks providing results for different applications. It would also automatically generate a complete report and eventually email the report to a central database on the Web.

We are not really interested in portability here, but it should at least run on all recent (> 2.0.0) versions and flavours (i386, Alpha, Sparc...) of Linux.

If anybody has any idea about benchmarking network performance in a simple, easy and reliable way, with a short (less than 30 minutes to setup and run) test, please contact me.

3.6 LBT Report Form

Besides the tests, the benchmarking procedure would not be complete without a form describing the setup, so here it is (following the guidelines from comp.benchmarks.faq):


LINUX BENCHMARKING TOOLKIT REPORT FORM


CPU 
== 
Vendor: 
Model: 
Core clock: 
Motherboard vendor: 
Mbd. model: 
Mbd. chipset: 
Bus type: 
Bus clock: 
Cache total: 
Cache type/speed: 
SMP (number of processors): 


RAM 
==== 
Total: 
Type: 
Speed: 


Disk 
==== 
Vendor: 
Model: 
Size: 
Interface: 
Driver/Settings: 


Video board 
=========== 
Vendor: 
Model: 
Bus:
Video RAM type: 
Video RAM total: 
X server vendor: 
X server version: 
X server chipset choice: 
Resolution/vert. refresh rate: 
Color depth: 


Kernel 
===== 
Version: 
Swap size:


gcc 
=== 
Version: 
Options: 
libc version: 


Test notes 
==========


RESULTS 
======== 
Linux kernel 2.0.0 Compilation Time: (minutes and seconds) 
Whetstones: results are in MWIPS. 
Xbench: results are in xstones. 
Unixbench Benchmarks 4.01 system INDEX:  
BYTEmark integer INDEX:
BYTEmark memory INDEX:


Comments* 
========= 
* This field is included for possible interpretations of the results, and as 
such, it is optional. It could be the most significant part of your report, 
though, specially if you are doing comparative benchmarking. 

3.7 Network performance tests

Testing network performance is a challenging task since it involves at least two machines, a server and a client machine, hence twice the time to setup and many more variables to control, etc... On an ethernet network, I guess your best bet would be the ttcp package. (to be expanded)

3.8 SMP tests

SMP tests are another challenge, and any benchmark specifically designed for SMP testing will have a hard time proving itself valid in real-life settings, since algorithms that can take advantage of SMP are hard to come by. It seems later versions of the Linux kernel (> 2.1.30 or around that) will do "fine-grained" multiprocessing, but I have no more information than that for the moment.

According to David Niemi, " ... shell8 [part of the Unixbench 4.01 benchmaks]does a good job at comparing similar hardware/OS in SMP and UP modes."


Next Previous Contents