force re-negotiation of PCIe speed on Linux


You can check the current PCIe power-management policy in this file:

# cat /sys/module/pcie_aspm/parameters/policy
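
The file lists every available policy on one line and wraps the active one in square brackets, for example "default performance [powersave] powersupersave". As a minimal sketch, the helper below prints only the active entry. The function name active_aspm_policy is made up for illustration and is not part of any existing tool.

```shell
# Print the active ASPM policy, i.e. the entry the kernel shows in [brackets].
# $1: path to the policy file (optional; defaults to the sysfs path above).
# active_aspm_policy is a hypothetical helper name used only in this sketch.
active_aspm_policy() {
    file="${1:-/sys/module/pcie_aspm/parameters/policy}"
    # Keep only the text between the square brackets.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$file"
}
```

Calling active_aspm_policy with no argument reads the sysfs file directly. If it prints "powersave" or "default", the suggestion below may be worth trying.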

Gen3 devices rely heavily on ASPM (Active-State Power Management) for power management, so this could be the root cause of the issue on your bus: when throughput is low, the link speed is reduced, but it is not raised again when needed. You can keep the kernel from using the "powersave" or "default" policy by disabling ASPM with the following kernel boot parameter in GRUB:

pcie_aspm=off

Test this on just one kernel first by appending the option to the "kernel" line of your default boot entry in /boot/grub/grub.conf. Example GRUB config adapted from the Red Hat docs:

default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-2.el5PAE)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-2.el5PAE ro root=LABEL=/1 rhgb quiet pcie_aspm=off
        initrd /boot/initrd-2.6.18-2.el5PAE.img
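
After rebooting into the modified entry, it may be worth confirming that the kernel actually received the option. The sketch below checks the kernel command line in /proc/cmdline. The helper name has_kernel_param is an assumption made for this example, not an existing command.

```shell
# Succeed if the given parameter appears on the kernel command line.
# $1: parameter to look for, e.g. pcie_aspm=off
# $2: command-line file (optional; defaults to /proc/cmdline)
# has_kernel_param is a hypothetical helper name used only in this sketch.
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # Put each parameter on its own line, then match the whole line literally.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}
```

For example, running has_kernel_param pcie_aspm=off && echo "ASPM disabled at boot" prints the message only when the option is present.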
Author: Thomas

Updated on September 18, 2022

Comments

  • Thomas
    Thomas almost 2 years

    I'm working with PCIe Gen 3 cards, and from time to time they seem to fall back to PCIe Gen 1 or Gen 2 speeds (according to lspci, and also observed in the throughput).

    When rebooting or power cycling the machine, the speed goes back to the full PCIe Gen 3 speed in most cases.

    Is there a less intrusive way to force a renegotiation of the PCIe link speed (trying to bring it back to PCIe Gen 3) on e.g. RHEL 6?

    • BrettRobi
      BrettRobi almost 12 years
      In Windows I've seen it change automatically; it's a function of PCIe bus power saving. serverfault.com/questions/226319/what-does-pcie-aspm-do seems to have a nice explanation.
    • Thomas
      Thomas almost 12 years
      Thanks for this information! Unfortunately, in my case the speed reduction happens on only some of the hosts, and it does not go back to full speed when we start using the device again...
    • LawrenceC
      LawrenceC almost 12 years
      Could this be controlled by ACPI? Maybe update or install any ACPI-related packages for your distribution. Alternatively, maybe you can disable the feature in the BIOS.
  • Thomas
    Thomas about 11 years
    Thanks for the answer! In fact, in the end it turned out that the PCI card had a problem with data transmission on the PCI bus, and the manufacturer fixed it with a firmware upgrade.
  • Admin
    Admin about 11 years
    We had similar problems with an HBA at work (Brocade), and it was a firmware problem too :)
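
The fallback described in the question shows up in the verbose lspci output, where LnkCap reports the speed the device supports and LnkSta the speed the link is currently running at (2.5GT/s is Gen 1, 5GT/s is Gen 2, 8GT/s is Gen 3). As a rough sketch, the helpers below compare the two fields. The function names link_speed and link_is_degraded are made up for this example, and the parsing assumes the usual "Speed ...GT/s" wording of lspci -vv.

```shell
# Extract a link speed from `lspci -vv` output read on stdin.
# $1: field to read, LnkCap (supported) or LnkSta (currently negotiated).
# link_speed and link_is_degraded are hypothetical helper names.
link_speed() {
    sed -n "s/.*$1:.*Speed \([0-9.]*GT\/s\).*/\1/p" | head -n 1
}

# Succeed when both speeds were found and the negotiated one differs
# from the supported one.
link_is_degraded() {
    out=$(cat)
    cap=$(printf '%s\n' "$out" | link_speed LnkCap)
    sta=$(printf '%s\n' "$out" | link_speed LnkSta)
    [ -n "$cap" ] && [ -n "$sta" ] && [ "$cap" != "$sta" ]
}
```

A possible use, with 03:00.0 standing in for the card's real bus address, is lspci -vv -s 03:00.0 | link_is_degraded && echo "link is below its supported speed". Reading the link fields generally requires running lspci as root.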