How to make RAID controller rescan devices
Restart the out-of-sync controller (e.g. c1):
/opt/MegaRAID/storcli/storcli64 /c1 restart
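To check whether the restart actually changed anything, the restart can be followed by a fresh drive listing. The following is only a minimal sketch: restart_and_list is a hypothetical helper name, the storcli path and the "/cN /eM /sall show" form are taken from the commands in the question, and the controller may need some time after the restart before the listing succeeds.

```shell
#!/bin/sh
# Path taken from the commands in this thread; override STORCLI if storcli
# is installed elsewhere (assumption: same CLI syntax as in the question).
STORCLI="${STORCLI:-/opt/MegaRAID/storcli/storcli64}"

# restart_and_list CTRL ENCL   (hypothetical helper, not part of storcli)
# Restarts controller CTRL, then lists all drives in enclosure ENCL so the
# State column can be inspected again.
restart_and_list() {
    ctrl=$1
    encl=$2
    # Stop if the restart itself reports a failure.
    "$STORCLI" "/c$ctrl" restart || return 1
    "$STORCLI" "/c$ctrl" "/e$encl" /sall show
}

# Example for the out-of-sync controller in the question (not run here):
#   restart_and_list 1 71
```

Since the other controller sees the same disks, it is probably wise to run this only while the affected shelf is idle.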
Michael
Updated on September 18, 2022
Comments
-
Michael almost 2 years
I have the following setup:
A single server with two LSI MegaRAID SAS 9380-8e controllers, both connected to two 60-bay disk shelves, roughly following the design by Edmund White (see https://github.com/ewwhite/zfs-ha/wiki). The goal is to replicate that setup exactly, but it is currently mid-migration.
After wiring the first shelf, all 60 disks were seen by both controllers, and multipathing was set up and works smoothly. When I added the second disk shelf, its 60 disks still carried an old RAID configuration, which both controllers dutifully reported. Using the first controller, I removed the configuration from the disks and set them to JBOD. All 60 disks are now visible to the OS and could be registered with multipath, but they only report a single path (going through the first controller). The second controller still reports all 60 disks as foreign (UGood F), and there is seemingly no way to force it to rescan the devices or forget the current config for just this shelf:
# /opt/MegaRAID/storcli/storcli64 /c1 /e71 /sall show | head -n20
Controller = 1
Status = Success
Description = Show Drive Information Succeeded.

Drive Information :
=================

-----------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model           Sp
-----------------------------------------------------------------------
71:0     74 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:1    107 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:2     72 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:3     95 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:4     90 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:5     77 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:6     73 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:7     76 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
71:8     83 UGood F  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 D
This is the same shelf as seen by the other controller:
# /opt/MegaRAID/storcli/storcli64 /c0 /e165 /sall show | head -n20
Controller = 0
Status = Success
Description = Show Drive Information Succeeded.

Drive Information :
=================

-----------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model           Sp
-----------------------------------------------------------------------
165:0   127 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:1   121 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:2   118 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:3   116 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:4   146 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:5   122 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:6   115 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:7   142 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
165:8   145 JBOD  -  3.637 TB SAS  HDD N   N  512B HUS724040ALS640 U
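With 60 slots per shelf, comparing the two listings by eye gets tedious. As a rough sketch, the State column can be tallied per controller. count_states is a hypothetical helper, and it assumes that data rows start with an "EID:Slt" field such as 71:0 and carry the state in the third column, as in the listings shown here.

```shell
#!/bin/sh
# count_states  (hypothetical helper)
# Reads a storcli drive listing on stdin and prints "<State> <count>" per
# distinct drive state, sorted by state name.
# Assumption: data rows start with an "EID:Slt" field like "71:0" and the
# state is the third whitespace-separated field.
count_states() {
    awk '$1 ~ /^[0-9]+:[0-9]+$/ { n[$3]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Example for the two views of the same shelf (not run here):
#   /opt/MegaRAID/storcli/storcli64 /c0 /e165 /sall show | count_states
#   /opt/MegaRAID/storcli/storcli64 /c1 /e71 /sall show | count_states
```

Once both controllers agree, the two tallies should be identical. In the state described here, one would list only JBOD drives and the other only UGood drives.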
But trying to clear the (wrong) info from the second controller does not work:
# /opt/MegaRAID/storcli/storcli64 /c1 /fall show
Controller = 1
Status = Success
Description = Couldn't find any foreign Configuration

# /opt/MegaRAID/storcli/storcli64 /c1 /fall delete
Controller = 1
Status = Success
Description = Couldn't find any foreign Configuration

# /opt/MegaRAID/storcli/storcli64 /c1 /fall import
Controller = 1
Status = Success
Description = Couldn't find any foreign Configuration
Forcing the disks into JBOD on the second controller does not work either:
# /opt/MegaRAID/storcli/storcli64 /c1 /e71 /sall set jbod | head -n20
Controller = 1
Status = Failure
Description = Set Drive JBOD Failed.

Detailed Status :
===============

-------------------------------------------------
Drive      Status  ErrCd ErrMsg
-------------------------------------------------
/c1/e71/s0 Failure   255 Operation not allowed.
/c1/e71/s1 Failure   255 Operation not allowed.
/c1/e71/s2 Failure   255 Operation not allowed.
/c1/e71/s3 Failure   255 Operation not allowed.
/c1/e71/s4 Failure   255 Operation not allowed.
/c1/e71/s5 Failure   255 Operation not allowed.
/c1/e71/s6 Failure   255 Operation not allowed.
/c1/e71/s7 Failure   255 Operation not allowed.
/c1/e71/s8 Failure   255 Operation not allowed.
/c1/e71/s9 Failure   255 Operation not allowed.
Is there any way to tell the RAID controller that those disks no longer have a foreign config and should be seen as JBODs?
-
Lenniey almost 7 years
Could you try /cx rescan?
-
Lenniey almost 7 years
Sorry, I was on 3Ware... Did all your disks come from the same old machine / vendor? Some controllers install their own firmware, and the disks can only be used by another controller after you low-level format them or remove the config using the old controller. Also, I assume the controllers are all on the same firmware / BIOS etc.?
-
ewwhite almost 7 years
Does the controller have a JBOD mode? Why aren't you using a SAS HBA for ZFS? Are these a bunch of RAID0 arrays?
-
Michael almost 7 years
@ewwhite: Yes, there is a JBOD mode (see the sample output of controller c0 above). I am migrating from a different setup and had those 4 relatively new RAID controllers around, and it already works well with the first shelf. The problem when adding the second shelf was simply that the controller detected the existing config and pulled it in.
-
Michael almost 7 years
@Lenniey: Yes, all disks were connected to an LSI controller (same brand) before. Controller 0 also showed the foreign config at first. I changed the disks to JBOD on this controller and can access them from the OS. Now only controller 1 is not updating its config to reflect the change.
-
Lenniey almost 7 years
I'd try disconnecting controller 1 (PCI-wise) plus its storage, rebooting, then reconnecting and rebooting again. I've had so much strange trouble with RAID controllers from different vendors, HDD incompatibilities and whatever else you can think of that this is usually my "workflow". The next step would be to attach just a single disk, try to format (or initialize) it and see what happens.
-
jeffre about 5 years
I'd like to add that you may want to consult Broadcom about your configuration. Their support recently informed me that what you (and I) were doing is not supported: "You cannot have 2 MegaRAID controllers taking charge of the same set of drives on 1 enclosure even if there are 2 SAS expander chips on the backplane."