LSI MegaRAID SAS 9261-8i: Disk isn't recognized after replacement

As it turns out, this behaviour was caused by a dying HDD, namely the replacement itself. I didn't suspect it because the second server recognized the new HDD without problems, but maybe that was the last breath of this brand-new hard drive.

I didn't expect a dead-on-arrival defect from data-center-grade HDDs (WD RE series, before you ask). I will keep that possibility in mind in the future, before I waste hours of my time.

morten.c
Author by

Updated on September 18, 2022

Comments

  • morten.c
    morten.c over 1 year

    I've got a Supermicro server with an LSI MegaRAID SAS 9261-8i RAID controller inside. Three disks were attached to the controller and configured as a RAID5 array. One of the disks failed recently (the RAID was displayed as degraded), and after checking the S.M.A.R.T. information it turned out that it had to be replaced.

    I marked the drive as missing using storcli and removed it to ship it to the vendor. Now the replacement disk has arrived. I plugged it into the RAID controller, but nothing happened. This is what storcli says:

    storcli /c0 show
    
    TOPOLOGY :
    ========
    
    ------------------------------------------------------------------------
    DG Arr Row EID:Slot DID Type  State BT     Size PDC  PI SED DS3  FSpace 
    ------------------------------------------------------------------------
     0 -   -   -        -   RAID5 Dgrd  N  5.456 TB dflt N  N   none Y      
     0 0   -   -        -   RAID5 Dgrd  N  5.456 TB dflt N  N   none Y      
     0 0   0   -        -   DRIVE Msng  -  2.728 TB -    -  -   -    -      
     0 0   1   252:5    14  DRIVE Onln  N  2.728 TB dflt N  N   none -      
     0 0   2   252:2    11  DRIVE Onln  N  2.728 TB dflt N  N   none -      
    ------------------------------------------------------------------------
    

    As you can see, both drives in Slots 2 and 5 are online, and another drive of the drive group (DG) is marked as missing. The third drive used to be in Slot 0, while the replacement is now in Slot 1. But the new drive isn't recognized by the controller, as you can also see in the physical device list (output from the same command as above):

    Physical Drives = 2
    
    PD LIST :
    =======
    
    -----------------------------------------------------------------------------
    EID:Slt DID State DG     Size Intf Med SED PI SeSz Model                  Sp 
    -----------------------------------------------------------------------------
    252:2    11 Onln   0 2.728 TB SATA HDD N   N  512B WDC WD3000FYYZ-01UL1B0 U  
    252:5    14 Onln   0 2.728 TB SATA HDD N   N  512B WDC WD3000FYYZ-01UL1B0 U  
    -----------------------------------------------------------------------------
    

    In contrast to that, see the following output:

    storcli /c0/pall show
    
    PhyInfo :
    =======
    
    ----------------------------------------------------------------------------
    PhyNo SAS_Addr           Phy_Identifier Link_Speed Device_Type  Description 
    ----------------------------------------------------------------------------
        0 0x0000000000000000              0 No limit   -            -           
        1 0x4433221101000000              0 No limit   End Device   -           
        2 0x0000000000000000              0 No limit   -            -           
        3 0x0000000000000000              0 No limit   -            -           
        4 0x4433221104000000              0 No limit   End Device   -           
        5 0x0000000000000000              0 No limit   -            -           
        6 0x4433221106000000              0 No limit   End Device   -           
        7 0x0000000000000000              0 No limit   -            -           
    ----------------------------------------------------------------------------
    

    I guess that PhyNo 1 is the replaced drive, but this is the only command where I can find a trace of it. All slot-specific commands for Slot 1 end with "Drive not found".
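
    The mismatch between the two tables can be made explicit with a small script. This is only a sketch under stated assumptions: it parses the table rows pasted above as plain text rather than calling storcli, the function names are made up for illustration, and it deliberately does not assume that a PhyNo corresponds to a slot number. It only counts phys that report an attached end device and compares that with the number of drives the controller has enumerated.

```python
import re

# Rows copied from the "storcli /c0/pall show" output above.
PHY_INFO = """\
    0 0x0000000000000000              0 No limit   -            -
    1 0x4433221101000000              0 No limit   End Device   -
    2 0x0000000000000000              0 No limit   -            -
    3 0x0000000000000000              0 No limit   -            -
    4 0x4433221104000000              0 No limit   End Device   -
    5 0x0000000000000000              0 No limit   -            -
    6 0x4433221106000000              0 No limit   End Device   -
    7 0x0000000000000000              0 No limit   -            -
"""

# Rows copied from the PD LIST section of "storcli /c0 show" above.
PD_LIST = """\
252:2    11 Onln   0 2.728 TB SATA HDD N   N  512B WDC WD3000FYYZ-01UL1B0 U
252:5    14 Onln   0 2.728 TB SATA HDD N   N  512B WDC WD3000FYYZ-01UL1B0 U
"""


def phys_with_device(text):
    """Return the PhyNo of every row that reports an attached end device."""
    phys = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+0x[0-9a-fA-F]{16}\s+\d+\s+.*End Device", line)
        if m:
            phys.append(int(m.group(1)))
    return phys


def enumerated_drives(text):
    """Return (enclosure, slot) pairs for every row of the PD LIST table."""
    drives = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):(\d+)\s+\d+\s+\S+", line)
        if m:
            drives.append((int(m.group(1)), int(m.group(2))))
    return drives


phys = phys_with_device(PHY_INFO)
drives = enumerated_drives(PD_LIST)
unaccounted = len(phys) - len(drives)

print(f"Phys reporting an end device: {phys}")
print(f"Drives enumerated by the controller: {drives}")
print(f"Devices seen on a phy but not enumerated: {unaccounted}")
```

    With the output above, three phys report an end device while only two drives are enumerated, so one device is visible at the link level but never shows up as a physical drive.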

    Any ideas about that? I tested the replacement drive in a second server with exactly the same setup (including the same RAID controller), where the controller detected the drive instantly and marked it as UGood, which means Unconfigured Good, so it couldn't be a drive error. I also did some reboots, shut the server down for a few minutes, and tried to use the LSI MegaRAID BIOS while booting to detect the new drive, without success. The drive doesn't show up in the LSI MegaRAID BIOS boot message.

    Any hints would be much appreciated.

    • Admin
      Admin about 10 years
      Assuming storcli is the same/similar as tw_cli... Have you done storcli /c0 rescan?
    • morten.c
      morten.c about 10 years
      @yoonix Thanks for taking the time, but "same/similar" is an almost funny wording for those syntax-breaking crappy storage CLI tools. I haven't found any rescan command. If you want to dig deeper: the LSI CLI cross reference is found here, while the official LSI Software User Guide can be found here (warning: loads of zipped crap). Never mind, I appreciate your quick shot.
  • SvennD
    SvennD almost 8 years
    Remarkable, I had exactly the same issue. I had been cursing at my 9271-8i all day... After popping in another new disk it started rebuilding without a fuss... (the first disk was not even marked as faulty, no red LED, just a slowly blinking blue LED)