Infiniband port status UP but can't open UMAD port ((null):0)


Solution 1

Does the corresponding umad device file exist? (This is typically /dev/infiniband/umad0.)

Also, on the system I have access to, the permissions of /dev/infiniband/umad0 are set by default such that normal users can't access it:

crw-rw---- 1 root root 231, 0 Feb  1 16:00 /dev/infiniband/umad0

so you could use sudo to run your command (or relax the permissions of /dev/infiniband/umad0).
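Both checks above (does the device file exist, and can the current user open it) can be sketched as a small shell function. This is only an illustration: check_umad is a hypothetical helper name, and the default path is the typical one mentioned above, which may differ on your system.

```shell
# Hypothetical helper (not part of infiniband-diags): report whether a
# umad device node exists and whether the current user can read and write it.
check_umad() {
    dev="${1:-/dev/infiniband/umad0}"   # assumed default path, adjust as needed
    if [ ! -e "$dev" ]; then
        echo "$dev: missing"
    elif [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "$dev: accessible"
    else
        echo "$dev: permission denied"
    fi
}

check_umad /dev/infiniband/umad0
```

If this reports "permission denied", running the original command under sudo (or relaxing the permissions on the device file) should be the next thing to try.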

Solution 2

It may be just a typo here on SO, but you are specifying the LID as 10x22. ibstatus reports the base LID as the hexadecimal number 0x22, so the leading 1 is extraneous. It should be just 0x22.
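A quick sanity check on the LID string before passing it to a tool could look like the sketch below. check_lid is a hypothetical helper, and it deliberately accepts only the 0x-prefixed hexadecimal form that ibstatus prints; whether the tools accept other notations is not covered here.

```shell
# Hypothetical helper: accept only LIDs written the way ibstatus prints them,
# i.e. "0x" followed by one or more hexadecimal digits.
check_lid() {
    case "$1" in
        0[xX]*)
            digits="${1#0[xX]}"          # strip the 0x / 0X prefix
            case "$digits" in
                ""|*[!0-9a-fA-F]*) echo "invalid" ;;   # empty or non-hex character
                *) echo "ok" ;;
            esac
            ;;
        *) echo "invalid" ;;             # no 0x prefix, e.g. 10x22
    esac
}

check_lid 0x22    # prints "ok"
check_lid 10x22   # prints "invalid"
```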


Author: Sidjana

Updated on September 18, 2022

Comments

  • Sidjana, almost 2 years ago

    My system has two InfiniBand devices, one of which has both of its ports up.

    $> ibstatus
      Infiniband device 'mlx4_0' port 1 status:
              default gid:     fe80:0000:0000:0000:0002:c903:000f:0a9f
              base lid:        0x22
              sm lid:          0x1
              state:           4: ACTIVE
              phys state:      5: LinkUp
              rate:            20 Gb/sec (4X DDR)
              link_layer:      IB
    
      Infiniband device 'mlx4_0' port 2 status:
              default gid:     fe80:0000:0000:0000:0002:c903:000f:0aa0
              base lid:        0x23
              sm lid:          0x1
              state:           4: ACTIVE
              phys state:      5: LinkUp
              rate:            20 Gb/sec (4X DDR)
              link_layer:      IB
    
      Infiniband device 'mlx4_1' port 1 status:
              default gid:     fe80:0000:0000:0000:0002:c903:000f:0a6b
              base lid:        0x0
              sm lid:          0x0
              state:           1: DOWN
              phys state:      2: Polling
              rate:            10 Gb/sec (4X)
              link_layer:      IB
    
      Infiniband device 'mlx4_1' port 2 status:
              default gid:     fe80:0000:0000:0000:0002:c903:000f:0a6c
              base lid:        0xd
              sm lid:          0x2
              state:           4: ACTIVE
              phys state:      5: LinkUp
              rate:            10 Gb/sec (4X)
              link_layer:      IB
    

    Now, when I try to enable the IB port by LID, I get:

     $> ibportstate  -L 10x22 enable
     ibwarn: [14836] mad_rpc_open_port: can't open UMAD port ((null):0)
     ibportstate: iberror: failed: Failed to open '(null)' port '0'
    

    I am not sure about the reason for this error message. Am I missing something?

    • Khayam Gondal, about 6 years ago
      Just put sudo before ibportstate.