Finding the memory bandwidth of each core of my processor


Solution 1

There is a memory bandwidth benchmark available for Linux. It is open source and works on x86 and ARM.

It will give you raw performance for your memory as well as system performance with memory, but it will not give you real-time bandwidth.

There is also a memtop tool. It is more about usage than bandwidth. You can use it to monitor your system while PETSc runs to see how much bandwidth is used.

There is also a program to read CPU performance counters, which can be used in combination with page faults.

And finally, you can always just try running PETSc. If performance doubles when using two cores, then you had bandwidth to spare. Repeat until the speed increases stop. Not the most elegant way, but quite possibly the best practical solution.
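The scaling test described above comes down to a small calculation. The sketch below assumes you have already timed the same PETSc run at several core counts. The timings in `example_timings` are made-up placeholders, and the 1.2x `min_gain` threshold is an arbitrary assumption, not something PETSc defines.

```python
def speedups(timings):
    """Return {cores: speed-up relative to the smallest core count}."""
    base_cores = min(timings)
    base_time = timings[base_cores]
    return {n: base_time / t for n, t in sorted(timings.items())}


def last_useful_core_count(timings, min_gain=1.2):
    """Largest core count whose run was still at least `min_gain` times
    faster than the run at the previous (smaller) core count."""
    counts = sorted(timings)
    useful = counts[0]
    for prev, cur in zip(counts, counts[1:]):
        if timings[prev] / timings[cur] < min_gain:
            break  # adding cores no longer pays off
        useful = cur
    return useful


# Hypothetical wall-clock times in seconds, keyed by core count.
example_timings = {1: 100.0, 2: 52.0, 4: 30.0, 8: 29.0}

if __name__ == "__main__":
    for n, s in speedups(example_timings).items():
        print(f"{n} cores: {s:.2f}x")
    print("speed-up stops after", last_useful_core_count(example_timings), "cores")
```

With these placeholder numbers the gain from 4 to 8 cores is negligible, which is the pattern you would expect once memory bandwidth is saturated.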

Solution 2

The normal way to talk about memory bandwidth is using the Stream benchmark, which is available in threaded versions. There is a close relationship between the Stream result and the theoretical bandwidth (number of channels * width * clock) for a given system - this is convenient to know, since the theoretical figure is easy to calculate.
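As a rough illustration of the channels * width * clock formula, the sketch below computes the theoretical figure for a PC3200 system. It assumes a standard 64-bit DIMM channel and uses the effective transfer rate (400 million transfers per second for PC3200, i.e. DDR-400) as the "clock". The function name is made up for this example.

```python
def theoretical_bandwidth_gbs(channels, transfers_per_sec, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s: channels * width * clock.

    `transfers_per_sec` is the effective data rate, so for DDR memory it is
    already twice the actual clock frequency.
    """
    bytes_per_transfer = bus_width_bits // 8
    return channels * bytes_per_transfer * transfers_per_sec / 1e9


# PC3200 (DDR-400): 400e6 transfers/s on one assumed 64-bit channel.
per_channel = theoretical_bandwidth_gbs(1, 400e6)

# Two sockets with two channels each, i.e. four channels in total.
two_socket_total = theoretical_bandwidth_gbs(2 * 2, 400e6)

if __name__ == "__main__":
    print(f"per channel: {per_channel:.1f} GB/s")
    print(f"2s x 2ch:    {two_socket_total:.1f} GB/s")
```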

sys  memory                      BW* (GB/s)  Stream (GB/s)  Stream/core (GB/s)
R    2s x 2ch x PC3200  (numa)   12.8        6.5            3.2
S    2s   2ch x PC5400  (uma)    10.8        6.1            0.76
O    2s x 4ch x PC10660 (numa)   85          51             2.1

* theoretical bandwidth (number of channels * width * clock)
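To make the relationship in the table concrete, the sketch below divides each measured Stream figure by the corresponding BW* figure, taking BW* to be the theoretical value. The numbers are copied from the table; the dictionary layout and function name are just for illustration.

```python
# (theoretical BW, measured Stream), both taken from the table above.
systems = {
    "R": (12.8, 6.5),
    "S": (10.8, 6.1),
    "O": (85.0, 51.0),
}


def stream_efficiency(theoretical, measured):
    """Fraction of the theoretical bandwidth that Stream actually achieved."""
    return measured / theoretical


if __name__ == "__main__":
    for name, (theoretical, measured) in systems.items():
        ratio = stream_efficiency(theoretical, measured)
        print(f"{name}: {ratio:.0%} of theoretical")
```

For these three systems the untuned Stream result comes out at roughly 50 to 60 percent of the theoretical figure.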

Modern machines, especially desktops, tend to provide more bandwidth than these older server systems. The numbers above are all from conventionally compiled, untuned Stream runs - enthusiast sites tend to report Windows-based, tuned pseudo-Stream numbers that get closer to the hardware's theoretical values. I would not use values from Memtest86, since it's a RAM pattern tester, not really a benchmark.
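If you only want a quick sanity check before building Stream, something like the following gives a very rough single-threaded copy figure. It is a sketch in the spirit of Stream's Copy kernel, not a replacement for it: it assumes that Python's bulk `bytearray` slice copy runs at close to memcpy speed, and the buffer size and repeat count are arbitrary choices.

```python
import time


def copy_bandwidth_gbs(size_bytes=64 * 1024 * 1024, repeats=5):
    """Rough single-threaded copy bandwidth in GB/s (best of `repeats`)."""
    # Fill the source with non-zero bytes so its pages are really allocated.
    src = bytearray(b"\x01") * size_bytes
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy done in C, no Python-level loop
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Each copy reads size_bytes and writes size_bytes.
    return 2 * size_bytes / best / 1e9


if __name__ == "__main__":
    print(f"copy bandwidth: {copy_bandwidth_gbs():.1f} GB/s (rough, one thread)")
```

The buffers should be much larger than the last-level cache, otherwise the result reflects cache speed rather than RAM.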

Also, in general, numeric codes can profitably use blocking to mitigate their dependence on pure memory bandwidth. The PETSc comment implies that they're not blocking, which is unfortunate, since memory bandwidth has not scaled with on-chip FLOPS.
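For readers unfamiliar with blocking, here is a minimal sketch using the textbook case of a dense matrix multiply. It only shows the loop structure: each small tile is reused many times while it is still in cache. It says nothing about how PETSc's sparse kernels are written, the block size of 32 is an arbitrary assumption, and in pure Python the interpreter overhead hides any speed benefit.

```python
def matmul_naive(a, b):
    """Plain triple loop: walks down a whole column of b for every c[i][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_blocked(a, b, block=32):
    """Same product, computed tile by tile so each tile is reused while hot."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                # Work on one (block x block) tile of a, b and c at a time.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, m)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += a_ik * row_b[j]
    return c


if __name__ == "__main__":
    a = [[i + j for j in range(5)] for i in range(4)]
    b = [[i * j for j in range(3)] for i in range(5)]
    print(matmul_blocked(a, b, block=2) == matmul_naive(a, b))
```

In a compiled language the same restructuring reduces the traffic to main memory, because a tile is loaded once and then used for many multiply-adds.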


Author: smilingbuddha

Updated on September 18, 2022

Comments

  • smilingbuddha
    smilingbuddha over 1 year

    Hi, I am learning PETSc (software for solving PDEs numerically in parallel) and I came across this passage in the FAQ:

    High per-CPU memory performance is required. Each CPU (core in multi-core systems) needs to have its own memory bandwidth of roughly 2 or more gigabytes/second. For example, standard dual processor "PC's" will not provide better performance when the second processor is used, that is, you will not see speed-up when using the second processor. This is because the speed of sparse matrix computations is almost totally determined by the speed of the memory, not the speed of the CPU.

    I am using Debian Linux and Ubuntu Linux systems on my computers. How do I find out the memory bandwidth in GB/s of my CPU? Are there any Linux commands for this?

    • sawdust
      sawdust over 11 years
      For x86 there's Memtest86+. It might even be in your Grub boot menu. It'll report processor cache and memory speeds in MB/s. Presumably that "N * 2 GB/s" minimum would be at the RAM interface and not at the cache.
    • MSalters
      MSalters over 11 years
      Does this really matter today? The memory bandwidth of a single HyperTransport 1.0 link is already 3.2 GB/s @ 800 MHz.
  • smilingbuddha
    smilingbuddha over 11 years
    Thanks for your reply. Could you clarify what you mean by the phrases about the Bandwidth program, which gives "raw performance for your memory as well as system performance with memory" vs. "real-time bandwidth"? Thanks a lot.