mdadm rebooted, array missing? can't assemble?

As far as I know, this is yet another bug unique to Ubuntu.

mdadm --stop /dev/md_d*; mdadm --assemble --scan
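
A minimal sketch of the two commands above as a script, assuming the stray arrays show up as /dev/md_d* nodes as the answer implies. The `run` helper and the `DRY_RUN` variable are hypothetical names added for illustration; by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: stop stray md_d* arrays, then let mdadm re-scan and assemble.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reassemble() {
    for dev in /dev/md_d*; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$dev" ] || continue
        run mdadm --stop "$dev"
    done
    run mdadm --assemble --scan
}

reassemble
```

Setting DRY_RUN=0 and running as root would execute the commands for real.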

Author: Jim

Updated on September 18, 2022

Comments

  • Jim
    Jim over 1 year

    Not sure what is going on with my array. I rebooted Ubuntu 12.04.1 and got an error on startup saying that the filesystem UUID listed in fstab for my mdadm array could not be mounted. After running a few mdadm commands I found on Google, I am thoroughly confused: it seems my array has just disappeared. I was running RAID 6.

    mdadm -A /dev/md0 
    mdadm: superblock on /dev/sdl doesn't match others - assembly aborted
    

    cat /proc/mdstat no longer shows any arrays or drives:

     cat /proc/mdstat 
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    unused devices: <none>
    

    I can't get any details about the array:

     mdadm --misc --detail /dev/md0
    mdadm: cannot open /dev/md0: No such file or directory
    

    mdadm --examine --scan shows two identical arrays, one of them with 29 spares?

    mdadm --examine --scan
    ARRAY /dev/md/0 metadata=1.2 UUID=cbb5f346:fedb78ad:d8f6cdb7:18c42e5a name=raidserver:0
       spares=29
    ARRAY /dev/md/0 metadata=1.2 UUID=cbb5f346:fedb78ad:d8f6cdb7:18c42e5a name=raidserver:0
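
One way to see which superblock disagrees is to examine every candidate device and compare the array UUID, data offset, and event count side by side. The sketch below only parses `mdadm --examine` text; the function name `summarize_examine` is made up for illustration, and checking both whole disks and partitions is an assumption prompted by the error naming /dev/sdl rather than /dev/sdl1.

```shell
#!/bin/sh
# Summarize `mdadm --examine` text read from stdin, one line per device:
#   <device> <array-uuid> <data-offset-sectors> <events>
# A "?" marks a field that the superblock dump did not contain.
summarize_examine() {
    awk '
        /^\/dev\//      { dev = $1; sub(/:$/, "", dev); uuid = "?"; off = "?" }
        /Array UUID :/  { uuid = $NF }
        /Data Offset :/ { off = $(NF - 1) }
        /^ *Events :/   { print dev, uuid, off, $NF }
    '
}

# Hypothetical usage (needs root); the device range follows the lsscsi listing:
# for d in /dev/sd[b-q] /dev/sd[b-q]1; do
#     mdadm --examine "$d" 2>/dev/null
# done | summarize_examine
```

A device whose UUID, offset, or event count differs from the rest is the one to look at first.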
    

    All the drives seem to be recognized by Linux:

    lsscsi

    [0:0:0:0]    disk    ATA      WDC WD1600AAJS-0 58.0  /dev/sda
    [1:0:0:0]    cd/dvd  Slimtype DVD A  DS8A8SH   KP55  /dev/sr0
    [4:0:0:0]    disk    ATA      WDC WD10EALX-009 1H15  /dev/sdb
    [4:0:1:0]    disk    ATA      SAMSUNG HD103UJ  1108  /dev/sdc
    [4:0:2:0]    disk    ATA      SAMSUNG HD103SJ  0001  /dev/sdd
    [4:0:3:0]    disk    ATA      SAMSUNG HD103UJ  1109  /dev/sde
    [4:0:4:0]    disk    ATA      WDC WD10EALX-009 1H15  /dev/sdf
    [4:0:5:0]    disk    ATA      WDC WD10EALX-009 1H15  /dev/sdg
    [4:0:6:0]    disk    ATA      WDC WD10EALX-009 1H15  /dev/sdh
    [4:0:7:0]    disk    ATA      WDC WD10EALX-009 1H15  /dev/sdi
    [7:0:0:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdj
    [7:0:1:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdk
    [7:0:3:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdm
    [7:0:4:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdn
    [7:0:5:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdo
    [7:0:6:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdp
    [7:0:7:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdq
    [7:0:8:0]    disk    ATA      Hitachi HDS72101 A3MA  /dev/sdl
    

    Interesting entries in syslog:

    A lot of these messages:

    udevd[5505]: inotify_add_watch(6, /dev/dm-23, 10) failed: No such file or directory
    

    and these:

    kernel: [  772.338609] device-mapper: table: 252:23: linear: dm-linear: Device lookup failed
    kernel: [  772.339496] device-mapper: ioctl: error adding target to table
    

    parted

    sudo parted /dev/sdl print
    Model: ATA Hitachi HDS72101 (scsi)
    Disk /dev/sdl: 1000GB
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    
    Number  Start   End     Size    Type     File system  Flags
     1      1049kB  1000GB  1000GB  primary               raid
    

    dmesg

    [  147.847979] device-mapper: table: 252:19: multipath: error getting device
    [  147.848261] device-mapper: ioctl: error adding target to table
    [  147.848656] device-mapper: table: 252:19: multipath: error getting device
    [  147.848909] device-mapper: ioctl: error adding target to table
    [  147.862100] device-mapper: table: 252:20: multipath: error getting device
    [  147.862391] device-mapper: ioctl: error adding target to table
    [  147.862823] device-mapper: table: 252:20: multipath: error getting device
    [  147.863094] device-mapper: ioctl: error adding target to table
    [  147.871082] device-mapper: table: 252:20: multipath: error getting device
    [  147.871381] device-mapper: ioctl: error adding target to table
    [  147.871850] device-mapper: table: 252:20: multipath: error getting device
    [  147.872177] device-mapper: ioctl: error adding target to table
    [  147.881409] device-mapper: table: 252:20: multipath: error getting device
    [  147.881677] device-mapper: ioctl: error adding target to table
    [  147.882058] device-mapper: table: 252:20: multipath: error getting device
    [  147.882315] device-mapper: ioctl: error adding target to table
    [  147.885279] device-mapper: table: 252:20: multipath: error getting device
    [  147.885511] device-mapper: ioctl: error adding target to table
    [  147.885855] device-mapper: table: 252:20: multipath: error getting device
    [  147.886081] device-mapper: ioctl: error adding target to table
    [  147.890688] device-mapper: table: 252:20: multipath: error getting device
    [  147.890941] device-mapper: ioctl: error adding target to table
    [  147.891306] device-mapper: table: 252:20: multipath: error getting device
    [  147.891537] device-mapper: ioctl: error adding target to table
    [  147.901351] device-mapper: table: 252:20: multipath: error getting device
    [  147.901632] device-mapper: ioctl: error adding target to table
    [  147.902012] device-mapper: table: 252:20: multipath: error getting device
    [  147.902246] device-mapper: ioctl: error adding target to table
    [  164.749216] device-mapper: table: 252:20: multipath: error getting device
    [  164.749228] device-mapper: ioctl: error adding target to table
    [  164.749785] device-mapper: table: 252:20: multipath: error getting device
    [  164.749794] device-mapper: ioctl: error adding target to table
    [  165.035078] device-mapper: table: 252:20: multipath: error getting device
    [  165.035091] device-mapper: ioctl: error adding target to table
    [  165.035595] device-mapper: table: 252:20: multipath: error getting device
    [  165.035608] device-mapper: ioctl: error adding target to table
    [  165.112537] device-mapper: table: 252:20: multipath: error getting device
    [  165.112553] device-mapper: ioctl: error adding target to table
    [  165.113102] device-mapper: table: 252:20: multipath: error getting device
    [  165.113117] device-mapper: ioctl: error adding target to table
    [  165.113276] device-mapper: table: 252:21: multipath: error getting device
    [  165.113287] device-mapper: ioctl: error adding target to table
    [  165.113996] device-mapper: table: 252:20: multipath: error getting device
    [  165.114006] device-mapper: ioctl: error adding target to table
    [  165.115092] device-mapper: table: 252:20: multipath: error getting device
    [  165.115104] device-mapper: ioctl: error adding target to table
    [  165.116152] device-mapper: table: 252:20: multipath: error getting device
    [  165.116164] device-mapper: ioctl: error adding target to table
    [  165.179138] device-mapper: table: 252:20: multipath: error getting device
    [  165.179152] device-mapper: ioctl: error adding target to table
    [  165.179574] device-mapper: table: 252:20: multipath: error getting device
    [  165.179583] device-mapper: ioctl: error adding target to table
    [  295.287956] iscsi_trgt: Removing all connections, sessions and targets
    [  461.917637] device-mapper: table: 252:21: multipath: error getting device
    [  461.918431] device-mapper: ioctl: error adding target to table
    [  461.919361] device-mapper: table: 252:21: multipath: error getting device
    [  461.920170] device-mapper: ioctl: error adding target to table
    [  462.020231] device-mapper: table: 252:21: multipath: error getting device
    [  462.021212] device-mapper: ioctl: error adding target to table
    [  462.022249] device-mapper: table: 252:21: multipath: error getting device
    [  462.022958] device-mapper: ioctl: error adding target to table
    [  462.063060] device-mapper: table: 252:21: multipath: error getting device
    [  462.063839] device-mapper: ioctl: error adding target to table
    [  462.232766] device-mapper: table: 252:22: multipath: error getting device
    [  462.233553] device-mapper: ioctl: error adding target to table
    [  462.235034] device-mapper: table: 252:23: multipath: error getting device
    [  462.235055] device-mapper: table: 252:22: multipath: error getting device
    [  462.235062] device-mapper: ioctl: error adding target to table
    [  462.236780] device-mapper: ioctl: error adding target to table
    [  462.238371] device-mapper: table: 252:22: multipath: error getting device
    [  462.239094] device-mapper: ioctl: error adding target to table
    [  517.869635] md: md0 stopped.
    [  517.869648] md: unbind<dm-17>
    [  517.928136] md: export_rdev(dm-17)
    [  517.928155] md: unbind<dm-16>
    [  517.952231] md: export_rdev(dm-16)
    [  517.952249] md: unbind<dm-13>
    [  517.952415] md: export_rdev(dm-13)
    [  517.952434] md: unbind<dm-11>
    [  517.960253] md: export_rdev(dm-11)
    [  517.960271] md: unbind<dm-7>
    [  517.968217] md: export_rdev(dm-7)
    [  517.968235] md: unbind<dm-10>
    [  517.980237] md: export_rdev(dm-10)
    [  517.980255] md: unbind<dm-5>
    [  517.980423] md: export_rdev(dm-5)
    [  517.980442] md: unbind<dm-4>
    [  517.992238] md: export_rdev(dm-4)
    [  517.992255] md: unbind<dm-3>
    [  518.008230] md: export_rdev(dm-3)
    [  518.008248] md: unbind<sdj>
    [  518.008416] md: export_rdev(sdj)
    [  518.008522] md: unbind<sdl>
    [  518.076118] md: export_rdev(sdl)
    [  518.076279] md: unbind<sdn>
    [  518.076382] md: export_rdev(sdn)
    [  518.076486] md: unbind<sdo>
    [  518.092235] md: export_rdev(sdo)
    [  518.092394] md: unbind<sdq>
    [  518.092513] md: export_rdev(sdq)
    [  518.092610] md: unbind<sdm>
    [  518.104242] md: export_rdev(sdm)
    [  518.104399] md: unbind<sdk>
    [  518.104519] md: export_rdev(sdk)
    [  561.888200] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
    [  561.888964] sr 1:0:0:0: CDB: Get event status notification: 4a 01 00 00 10 00 00 00 08 00
    [  561.888988] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
    [  561.888991]          res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
    [  561.891575] ata2.00: status: { DRDY }
    [  561.893111] ata2: hard resetting link
    [  562.384196] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    [  562.388535] ata2.00: configured for UDMA/100
    [  562.389721] ata2: EH complete
    [  708.064178] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
    [  708.064991] ata2.00: irq_stat 0x08000000, interface fatal error
    [  708.066304] ata2: SError: { Handshk }
    [  708.067952] sr 1:0:0:0: CDB: Get event status notification: 4a 01 00 00 10 00 00 00 08 00
    [  708.067975] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
    [  708.067978]          res 50/00:03:00:08:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
    [  708.071318] ata2.00: status: { DRDY }
    [  708.072954] ata2: hard resetting link
    [  709.012196] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    [  709.014010] ata2.00: configured for UDMA/100
    [  709.026171] ata2: EH complete
    [  772.195090] md: bind<dm-22>
    [  772.338609] device-mapper: table: 252:23: linear: dm-linear: Device lookup failed
    [  772.339496] device-mapper: ioctl: error adding target to table
    [  772.457258] device-mapper: table: 252:23: linear: dm-linear: Device lookup failed
    [  772.458197] device-mapper: ioctl: error adding target to table
    [  772.718699] md: bind<dm-23>
    [  772.728756] device-mapper: table: 252:24: linear: dm-linear: Device lookup failed
    [  772.729199] device-mapper: ioctl: error adding target to table
    [  772.765079] device-mapper: table: 252:25: linear: dm-linear: Device lookup failed
    [  772.766221] device-mapper: ioctl: error adding target to table
    [  772.836592] md: bind<dm-24>
    [  772.847514] device-mapper: table: 252:26: linear: dm-linear: Device lookup failed
    [  772.848413] device-mapper: ioctl: error adding target to table
    [  772.888508] device-mapper: table: 252:26: linear: dm-linear: Device lookup failed
    [  772.889366] device-mapper: ioctl: error adding target to table
    [  772.899526] md: bind<dm-25>
    [  772.911046] device-mapper: table: 252:26: linear: dm-linear: Device lookup failed
    [  772.911914] device-mapper: ioctl: error adding target to table
    [  772.951896] device-mapper: table: 252:26: linear: dm-linear: Device lookup failed
    [  772.952811] device-mapper: ioctl: error adding target to table
    [  780.850451] mpt2sas1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
    [  782.856161] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
    [  782.856193] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
    [  782.856558] sd 7:0:2:0: [sdl] Synchronizing SCSI cache
    [  782.856631] sd 7:0:2:0: [sdl]  Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
    [  782.857335] mpt2sas1: removing handle(0x000a), sas_addr(0x4433221101000000)
    [  800.881141] scsi 7:0:8:0: Direct-Access     ATA      Hitachi HDS72101 A3MA PQ: 0 ANSI: 5
    [  800.881159] scsi 7:0:8:0: SATA: handle(0x000a), sas_addr(0x4433221101000000), phy(1), device_name(0xcca350005dc45ddf)
    [  800.881168] scsi 7:0:8:0: SATA: enclosure_logical_id(0x500605b004d1ecc0), slot(1)
    [  800.881264] scsi 7:0:8:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
    [  800.881274] scsi 7:0:8:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
    [  800.881681] sd 7:0:8:0: Attached scsi generic sg12 type 0
    [  800.882471] sd 7:0:8:0: [sdl] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
    [  801.061796] sd 7:0:8:0: [sdl] Write Protect is off
    [  801.061804] sd 7:0:8:0: [sdl] Mode Sense: 7f 00 00 08
    [  801.063474] sd 7:0:8:0: [sdl] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [  801.253191]  sdl: sdl1
    [  801.439645] sd 7:0:8:0: [sdl] Attached SCSI disk
    [  801.507375] md: bind<sdl>
    [  821.824155] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
    [  821.824945] sr 1:0:0:0: CDB: Get event status notification: 4a 01 00 00 10 00 00 00 08 00
    [  821.824969] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
    [  821.824972]          res 40/00:03:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
    [  821.827851] ata2.00: status: { DRDY }
    [  821.829481] ata2: hard resetting link
    [  822.320129] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    [  822.324413] ata2.00: configured for UDMA/100
    [  822.325691] ata2: EH complete
    [ 1133.856140] ata2: limiting SATA link speed to 1.5 Gbps
    [ 1133.856149] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
    [ 1133.856892] sr 1:0:0:0: CDB: Get event status notification: 4a 01 00 00 10 00 00 00 08 00
    [ 1133.856915] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
    [ 1133.856918]          res 40/00:03:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
    

    mdadm.conf

    cat /etc/mdadm/mdadm.conf
    # mdadm.conf
    #
    # Please refer to mdadm.conf(5) for information about this file.
    #
    
    # by default (built-in), scan all partitions (/proc/partitions) and all
    # containers for MD superblocks. alternatively, specify devices to scan, using
    # wildcards if desired.
    #DEVICE partitions containers
    
    # auto-create devices with Debian standard permissions
    CREATE owner=root group=disk mode=0660 auto=yes
    
    # automatically tag new arrays as belonging to the local system
    HOMEHOST <system>
    
    # instruct the monitoring daemon where to send mail alerts
    MAILADD myemail
    
    # definitions of existing MD arrays
    
    # This file was auto-generated on Thu, 21 Jun 2012 01:11:03 -0400
    # by mkconf $Id$
    
    #definitions of existing MD arrays
    ARRAY /dev/md0 metadata=1.2 UUID=cbb5f346:fedb78ad:d8f6cdb7:18c42e5a name=raidserver:0
    
    root@raidserver# mdadm /dev/md0 --fail /dev/sdl
    mdadm: error opening /dev/md0: No such file or directory
    root@raidserver# mdadm --assemble --scan
    mdadm: superblock on /dev/sdl doesn't match others - assembly aborted
    

    Tried this from another serverfault.com post:

    mdadm --assemble /dev/md0 /dev/sd{b,c,d,e,f,g,h,i,j,k,m,n,o,p,q,l}1
    mdadm: /dev/sdb1 is busy - skipping
    mdadm: /dev/sdc1 is busy - skipping
    mdadm: /dev/sdd1 is busy - skipping
    mdadm: /dev/sde1 is busy - skipping
    mdadm: /dev/sdf1 is busy - skipping
    mdadm: /dev/sdg1 is busy - skipping
    mdadm: /dev/sdh1 is busy - skipping
    mdadm: /dev/sdi1 is busy - skipping
    mdadm: /dev/sdk1 is busy - skipping
    mdadm: /dev/sdn1 is busy - skipping
    mdadm: /dev/sdo1 is busy - skipping
    mdadm: /dev/sdp1 is busy - skipping
    mdadm: /dev/sdq1 is busy - skipping
    mdadm: /dev/md0 assembled from 3 drives - not enough to start the array.
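
The "busy" messages suggest that something else already holds the partitions. A hedged sketch for checking that through sysfs follows; `list_holders` and the `SYSFS_ROOT` override are hypothetical names, and the guess that device-mapper is the holder rests only on the dm-N entries in the dmesg output above.

```shell
#!/bin/sh
# Print which kernel devices (for example dm-3 or md0) sit on top of each
# named block device, using the holders/ directory sysfs keeps per device.
# SYSFS_ROOT exists only so the function can be tried against a fake tree.
list_holders() {
    root="${SYSFS_ROOT:-/sys}"
    for name in "$@"; do
        dir="$root/class/block/$name/holders"
        if [ ! -d "$dir" ]; then
            echo "$name: not found"
            continue
        fi
        # Unquoted expansion joins the directory entries with single spaces.
        holders=$(echo $(ls "$dir"))
        echo "$name: ${holders:-none}"
    done
}

# Hypothetical usage with the partition names from the post:
# list_holders sdb1 sdc1 sdd1 sdl1
```

If a partition lists a dm-N holder, that mapping would have to be released before mdadm can claim the partition.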
    

    Found this backup snapshot from yesterday:

    cat /etc/mdadm/mdadm_snapshot12202012 
    /dev/md0:
            Version : 1.2
      Creation Time : Thu Jun 21 01:23:41 2012
         Raid Level : raid6
         Array Size : 13674644480 (13041.16 GiB 14002.84 GB)
      Used Dev Size : 976760320 (931.51 GiB 1000.20 GB)
       Raid Devices : 16
      Total Devices : 16
        Persistence : Superblock is persistent
    
        Update Time : Thu Dec 20 10:02:05 2012
              State : clean 
     Active Devices : 16
    Working Devices : 16
     Failed Devices : 0
      Spare Devices : 0
    
             Layout : left-symmetric
         Chunk Size : 512K
    
               Name : raidserver:0  (local to host raidserver)
               UUID : cbb5f346:fedb78ad:d8f6cdb7:18c42e5a
             Events : 7193
    
        Number   Major   Minor   RaidDevice State
           0       8       49        0      active sync   /dev/sdd1
           1       8       65        1      active sync   /dev/sde1
           2       8       81        2      active sync   /dev/sdf1
           3       8       97        3      active sync   /dev/sdg1
           4       8       17        4      active sync   /dev/sdb1
           7       8       33        5      active sync   /dev/sdc1
           6       8      113        6      active sync   /dev/sdh1
           5       8      129        7      active sync   /dev/sdi1
          16      65        1        8      active sync   /dev/sdq1
          18       8      209        9      active sync   /dev/sdn1
          17       8      161       10      active sync   /dev/sdk1
          20       8      225       11      active sync   /dev/sdo1
          19       8      241       12      active sync   /dev/sdp1
          22       8      145       13      active sync   /dev/sdj1
          21       8      193       14      active sync   /dev/sdm1
          23       8      177       15      active sync   /dev/sdl1
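
The slot order in this snapshot is what matters if the array ever has to be re-created. As a sketch, the following pulls the member devices out of saved `mdadm --detail` text in RaidDevice order; `slot_order` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `mdadm --detail` text on stdin and print the member devices,
# one per line, sorted by the RaidDevice column (the fourth field).
slot_order() {
    awk '$NF ~ /^\/dev\// && $4 ~ /^[0-9]+$/ { print $4, $NF }' |
        sort -n |
        awk '{ print $2 }'
}

# Usage with the snapshot file named in the post:
# slot_order < /etc/mdadm/mdadm_snapshot12202012
```

For the snapshot above this lists /dev/sdd1 first and /dev/sdl1 last. Device names can change between boots, so the list is only valid while each name still points at the same physical disk.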
    

    Furthermore, I read this post, http://ubuntuforums.org/showthread.php?p=12416893#post12416893, and tried an mdadm --create based on the old snapshot. The RAID now comes up, but I can't mount it.

    This is what I did, based on the above post:

    mdadm --misc --zero-superblock /dev/sd{b,c,d,e,f,g,h,i,j,k,m,n,o,p,q,l}1
    

    and

    mdadm --create /dev/md1 --chunk=512K --level=6 --raid-devices=16 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdb1 /dev/sdc1 /dev/sdh1 /dev/sdi1 /dev/sdq1 /dev/sdn1 /dev/sdk1 /dev/sdo1 /dev/sdp1 /dev/sdj1 /dev/sdm1 /dev/sdl1
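
For comparison, a commonly recommended variant of such a re-create adds --assume-clean so that md does not start a resync over the existing data. The sketch below only prints a command and runs nothing; `print_recreate_cmd` is a hypothetical helper, and whether this variant would have helped here is not established by the post.

```shell
#!/bin/sh
# Print (never run) a re-create command that keeps the given slot order
# and adds --assume-clean. Level, chunk size, and metadata version are the
# values from the snapshot in the post; the device count comes from the args.
print_recreate_cmd() {
    md_dev="$1"
    shift
    echo "mdadm --create $md_dev --assume-clean --metadata=1.2" \
         "--level=6 --chunk=512K --raid-devices=$# $*"
}

# Device order copied from the command in the post:
print_recreate_cmd /dev/md1 \
    /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdb1 /dev/sdc1 \
    /dev/sdh1 /dev/sdi1 /dev/sdq1 /dev/sdn1 /dev/sdk1 /dev/sdo1 \
    /dev/sdp1 /dev/sdj1 /dev/sdm1 /dev/sdl1
```

Printing first leaves room to double-check the order against the snapshot before anything is written to the disks.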
    

    Thank you all for taking the time to look at this, as this is beyond me.

    -Jim

  • Jim
    Jim over 11 years
    Added. RAID 6 it is.
  • rhasti
    rhasti over 11 years
    This does not look good. You had no spare devices. Your RAID array is probably gone.
  • Jim
    Jim over 11 years
    This doesn't make any sense. Did dmsetup write over all the drives' RAID metadata without letting me know?
  • rhasti
    rhasti over 11 years
    Well, did you try a reboot yet? If the mdadm snapshot is correct, you have a RAID array of 16 physical disks, all active, with no spare drives. I can't help you further; I can only recommend adding some spare drives if you manage to repair the array.
  • Jim
    Jim over 11 years
    I'm not sure, but I think the problem is because dmraid took over after I ran an apt-get update/upgrade. Hence the /dev/mapper issues in my syslog.
  • Jim
    Jim over 11 years
    How do I claim spare drives with mdadm?
  • rhasti
    rhasti over 11 years
    Give a reboot a shot. The dmesg output indicates a defective disk, though.
  • rhasti
    rhasti over 11 years
    mdadm --grow ... check out raid.wiki.kernel.org/index.php/Linux_Raid
  • rhasti
    rhasti over 11 years
    Can you mount now? And what does cat /proc/mdstat show?
  • Jim
    Jim over 11 years
    After rebooting, no, I cannot. cat /proc/mdstat shows:

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md1 : active raid6 sdj1[13] sdi1[7] sdk1[10] sdd1[0] sdm1[14] sdn1[9] sdo1[11] sdp1[12] sdc1[5] sdq1[8] sde1[1] sdl1[15] sdf1[2] sdg1[3] sdh1[6] sdb1[4]
          13672823808 blocks super 1.2 level 6, 512k chunk, algorithm 2 [16/16] [UUUUUUUUUUUUUUUU]
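
Comparing the array size in yesterday's snapshot with the size of the re-created md1 gives a hint about why the filesystem may not mount. The arithmetic below uses only numbers from this page; reading the gap as a changed per-device data offset is an assumption, not something the post confirms.

```shell
#!/bin/sh
# Both sizes are in KiB (1 KiB blocks).
old_kib=13674644480      # "Array Size" in the Dec 20 snapshot of md0
new_kib=13672823808      # "blocks" reported by /proc/mdstat for md1
data_disks=$((16 - 2))   # RAID 6: two members' worth of space goes to parity

diff_kib=$((old_kib - new_kib))
per_dev_kib=$((diff_kib / data_disks))
per_dev_sectors=$((per_dev_kib * 2))   # 512-byte sectors

echo "array is smaller by ${diff_kib} KiB"
echo "per member: ${per_dev_kib} KiB (${per_dev_sectors} sectors)"
```

This prints a difference of 1820672 KiB, or 130048 KiB (260096 sectors) per member. If the data offset did change by that amount, the filesystem would begin at a different position on every disk than it did before the re-create, which would be consistent with an array that assembles but does not mount.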