Microarchitecture

Because the w11a microarchitecture is very similar to that of the original 11/70 processor, the KB11-C CPU, the instruction timing in clock cycles is also very similar. A register-register operation takes two clock cycles; a more involved case such as "add @r1,a(r2)" takes 12 cycles. Notable exceptions are the MUL (5 instead of 22 cycles) and DIV (23 instead of 46 cycles) instructions.

Clock Rate

On Spartan/Artix class FPGAs the w11a systems run at a clock frequency of at least 50 MHz. Specifically:

  FPGA          Board       System        Clock    Comment
  xc7a35t-1     Cmod A7     w11a_c7       80 MHz   uses MMCM (Vivado 2017.1)
  xc7a35t-1     Arty        w11a_br_arty  80 MHz   uses MMCM (Vivado 2016.4)
  xc7a35t-1     Basys3      w11a_b3       80 MHz   uses MMCM (Vivado 2016.4)
  xc7a100t-1    Nexys4 DDR  w11a_br_n4d   80 MHz   uses MMCM (Vivado 2016.4)
  xc7a100t-1    Nexys4      w11a_n4       80 MHz   uses MMCM (Vivado 2016.4)
  xc6slx16-2    Nexys3      w11a_n3       64 MHz   uses DCM  (ISE 14.7)
  xc3s1200e-4   Nexys2      w11a_n2       52 MHz   uses DCM
  xc3s1000-4    S3board     w11a_s3       50 MHz   no DCM
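
To relate the cycle counts quoted under Microarchitecture to wall-clock time, the following minimal sketch converts a cycle count into nanoseconds for a few of the clock rates in the table. The helper name `instruction_time_ns` and the selection of systems are illustrative assumptions, not part of the w11 sources.

```python
# Clock rates in MHz, copied from the table above (subset for illustration).
CLOCK_MHZ = {
    "w11a_n4": 80.0,
    "w11a_n3": 64.0,
    "w11a_n2": 52.0,
    "w11a_s3": 50.0,
}


def instruction_time_ns(cycles, clock_mhz):
    """Return the execution time in ns of an instruction needing `cycles` clocks."""
    return cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    for system, mhz in CLOCK_MHZ.items():
        reg_reg = instruction_time_ns(2, mhz)    # register-register operation
        complex_ = instruction_time_ns(12, mhz)  # e.g. "add @r1,a(r2)"
        print(f"{system:8s} {mhz:5.1f} MHz  2 cycles: {reg_reg:6.1f} ns  12 cycles: {complex_:6.1f} ns")
```

This covers only the raw cycle time; cache misses and memory latency are not modelled.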

Expected Performance

Compared to KB11-C CPU: The KB11-C CPU had a 150 ns micro cycle time, equivalent to a rate of about 6.7 MHz. Both the w11a and the 11/70 have a cache, which greatly reduces the impact of memory latencies. So one expects that the w11a at 50 MHz is about a factor 50/6.7 or 7.5 faster than the original PDP-11/70.

Compared to J11 CPU: This later ASIC implementation of the 11/70 ran at a clock rate of up to 20 MHz. It needed 4 clocks per microcycle, resulting in a 200 ns micro cycle time. However, the J11 had a significantly improved microarchitecture, yielding a cpi (cycles-per-instruction) value that is better by up to a factor of two. So one expects that the w11a is at least a factor (50/20)*(4/1)*(1/2) or 5 faster than the fastest J11 based system, the PDP-11/93.
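
The two estimates above can be reproduced with a few lines of arithmetic. This is a sketch using only the figures quoted in the text; the function and parameter names are chosen here for illustration.

```python
# Back-of-the-envelope speed estimates, using only figures quoted in the text.
W11A_CLOCK_MHZ = 50.0  # lowest w11a system clock, one cycle per micro step


def kb11c_speedup(w11a_mhz=W11A_CLOCK_MHZ, kb11c_cycle_ns=150.0):
    """Ratio of the w11a clock rate to the KB11-C micro cycle rate."""
    kb11c_mhz = 1000.0 / kb11c_cycle_ns  # 150 ns corresponds to about 6.7 MHz
    return w11a_mhz / kb11c_mhz


def j11_speedup(w11a_mhz=W11A_CLOCK_MHZ, j11_clock_mhz=20.0,
                j11_clocks_per_microcycle=4, j11_cpi_advantage=2.0):
    """(50/20) * (4/1) * (1/2): clock ratio, clocks per microcycle, cpi gain."""
    return (w11a_mhz / j11_clock_mhz) * j11_clocks_per_microcycle / j11_cpi_advantage


if __name__ == "__main__":
    print(round(kb11c_speedup(), 1))  # 7.5
    print(round(j11_speedup(), 1))    # 5.0
```

Both numbers are expectations derived from cycle times alone and say nothing about memory system effects.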

Benchmarks

The Dhrystone 2 and Tower of Hanoi benchmark codes taken from the 'BYTE UNIX Benchmark' (archived on 2014-07-06) were used to compare the w11a with real PDP-11s and other processors. The w11a values were determined for both boards; the comparison values were obtained from Michael Schneider's benchmark collection:

  Type             OS        CPU/cache  (MHz)   Dhry2   Hanoi   Dhry Hanoi  Dhry
                                                [lps]   [lps]   /MHz  /MHz  /Han

  w11a_s3 V0.5     BSD 2.11  w11a   8k   (50)   11510   160.8    230   3.2  71.6
  w11a_n2 V0.5     BSD 2.11  w11a   8k   (50)   11519   160.4    230   3.2  71.8
  w11a_n2 V0.51    BSD 2.11  w11a   8k   (58)   13218   186.1    228   3.2  71.0
  w11a_n4 V0.73    BSD 2.11  w11a  64k   (80)   18095   250.7    226   3.1  72.2

  pdp-11/53+       BSD 2.11  KDJ11-SD   (4.5)*    828    12.2    184   2.7  67.8
  Mac SE/30        A/UX      68030       (16)    3042    81.8    190   5.1  37.2
  SUN 3/60         NetBSD    68020       (20)    6934   121.3    346   6.1  57.3
  DECstation 2100  NetBSD    R2000       (12)   13206   155.5   1100  13.0  85.2
  NeXT N1100       NetBSD    68040       (25)   26882   386.1   1075  15.4  69.6
  HP 9000/433t     NetBSD    68040       (40)   55763   960.3   1394  24.0  58.1
  NCR system 3230  NetBSD    i486DX/2    (66)   63464   993.1    961  15.0  63.9
  NCR system 3230  NetBSD    i486DX/4   (100)   75010  1022.3    750  10.2  73.4

  Power Mac G4     Gentoo    PPC7455   (1400)    3713k   46.6k  2652  33.3  79.6
  Lenovo TS S10    Gentoo    i686 E8400(3000)   16464k  262.6k  5488  87.5  62.7

Note that the J11 system is listed with the effective microcycle rate of 4.5 MHz rather than the chip clock rate of 18 MHz. This is consistent with Bob Supnik's notes on the J11, where the J11 is classified as '4.5 MHz', and it gives more meaningful values for the Dhry/MHz or 'Dhrystone per MHz' column. For a fair comparison it is also important to note that the PDP-11/53+ systems didn't have a cache and were therefore about a factor 2.3 slower than a PDP-11/93 with cache (see comparison archived on 2016-06-18), which explains the large factor between the w11a_s3 and the 11/53 benchmark results.
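
As a cross-check, the three derived columns of the table (Dhry/MHz, Hanoi/MHz, Dhry/Han) and the cache-adjusted comparison with the 11/53 can be recomputed from the raw values. The sketch below uses a few rows copied from the table; the names `derived` and `cache_adjusted_ratio` are illustrative, and dividing by 2.3 is only a rough way to apply the cache factor mentioned above.

```python
# (name, clock in MHz, Dhry2 in lps, Hanoi in lps), copied from the table.
ROWS = [
    ("w11a_s3 V0.5", 50.0, 11510, 160.8),
    ("w11a_n4 V0.73", 80.0, 18095, 250.7),
    ("pdp-11/53+", 18.0 / 4, 828, 12.2),  # 18 MHz chip clock, 4 clocks per microcycle
]


def derived(mhz, dhry, hanoi):
    """Return (Dhry/MHz, Hanoi/MHz, Dhry/Hanoi) for one table row."""
    return dhry / mhz, hanoi / mhz, dhry / hanoi


def cache_adjusted_ratio(fast_dhry, slow_dhry, cache_factor=2.3):
    """Dhrystone ratio after crediting the slower system with a cache."""
    return (fast_dhry / slow_dhry) / cache_factor


if __name__ == "__main__":
    for name, mhz, dhry, hanoi in ROWS:
        d, h, r = derived(mhz, dhry, hanoi)
        print(f"{name:15s} {d:6.0f} {h:5.1f} {r:5.1f}")
    raw = 11510 / 828
    print(f"w11a_s3 vs 11/53: raw {raw:.1f}, cache adjusted {cache_adjusted_ratio(11510, 828):.1f}")
```

The cache-adjusted ratio can then be set against the factor of at least 5 expected relative to a J11 system with cache.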

The Dhrystone, Tower of Hanoi and 'syscall' benchmarks were also run on a simulated PDP-11 using simh version V3.8-1 and natively on a Linux system. In both cases a Kubuntu 10.4 system with an Intel Core2 Duo E8400 CPU was used, with cpufreq fixed to 3 GHz.

  System       Platform              (MHz)  Dhry2    Hanoi  syscall
                                            [lps]    [lps]    [lps]

  2.11BSD      w11a_s3 V0.5          (50)   11510    160.8     7080
  2.11BSD      w11a_n2 V0.5          (50)   11519    160.4     6888
  2.11BSD      w11a_n2 V0.51         (58)   13218    186.1     7616
  2.11BSD      w11a_n4 V0.73         (80)   18095    250.7    10837

  2.11BSD      simh on Intel E8400   (--)   17174    250.0    10713
  Ubuntu 10.4  Intel E8400         (3000)   10785k    74.1k    1020k

Some observations are: