Forcing zpool to use /dev/disk/by-id in Ubuntu Xenial
Solution 1
I know this thread is sort of stale, but there is an answer. You need to update your cache file after you import. This example shows the default location for the cache file.
$ sudo zpool export POOL
$ sudo zpool import -d /dev/disk/by-id POOL
$ sudo zpool import -c /etc/zfs/zpool.cache
$ sudo zpool status POOL

NAME                                   STATE   READ WRITE CKSUM
POOL                                   ONLINE     0     0     0
  raidz1-0                             ONLINE     0     0     0
    ata-Hitachi_HDS722020ALA330_[..]   ONLINE     0     0     0
    ata-Hitachi_HDS722020ALA330_[..]   ONLINE     0     0     0
    ata-Hitachi_HDS722020ALA330_[..]   ONLINE     0     0     0
    ata-Hitachi_HUA722020ALA330_[..]   ONLINE     0     0     0
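To help the by-id names survive reboots, it can also be worth pinning the pool's cachefile property explicitly and, on Ubuntu, rebuilding the initramfs, which can carry its own copy of the cache file (mainly relevant for root-on-ZFS setups). A minimal sketch, not part of the original answer, assuming the default paths and a pool named POOL:

# Sketch: record the pool in the default cache file location
# that the boot scripts read.
$ sudo zpool set cachefile=/etc/zfs/zpool.cache POOL
# Rebuild the initramfs so any embedded copy of the cache file
# also picks up the by-id device paths.
$ sudo update-initramfs -u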
Solution 2
Once in a while, zpool import -d /dev/disk/by-id doesn't work.
I've noticed this in more than one environment. I have an import script that, beyond some magic logic and showing physically attached ZFS devices, basically does this:
zpool import -d /dev/disk/by-id POOL
zpool export POOL
zpool import POOL
The second time around, even without the -d switch, it imports by device ID, even if it didn't the first time with the explicit command.
It's possible this was just due to a ZFS bug during a span of a few weeks or months (a year or two ago), and that it's no longer necessary. I suppose I should have filed a bug report, but it was trivial to work around.
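For reference, here is a minimal sketch of that part of the script (my reconstruction, not the author's actual code; the pool name is taken as the first argument):

#!/bin/sh
# Export/re-import dance described above.
set -eu
POOL="${1:?usage: $0 POOL}"

# First import by ID; on some setups this alone doesn't stick.
zpool import -d /dev/disk/by-id "$POOL"

# Export and import once more; the second import tends to keep
# the by-id names even without -d.
zpool export "$POOL"
zpool import "$POOL"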
Updated on September 18, 2022

Comments
- Ruben Schade almost 2 years ago:
I'm giving the bundled OpenZFS on Ubuntu 16.04 Xenial a try.

When creating pools, I always reference drives by their serials in /dev/disk/by-id/ (or /dev/gpt labels on FreeBSD) for resiliency. Drives aren't always in the same order in /dev when a machine reboots, and if you have other drives in the machine the pool may fail to mount correctly.

For example, running zpool status on a 14.04 box I get this:

NAME                                   STATE   READ WRITE CKSUM
tank                                   ONLINE     0     0     0
  raidz1-0                             ONLINE     0     0     0
    ata-Hitachi_HDS722020ALA330_[..]   ONLINE     0     0     0
    ata-Hitachi_HDS722020ALA330_[..]   ONLINE     0     0     0
    ata-Hitachi_HDS722020ALA330_[..]   ONLINE     0     0     0
    ata-Hitachi_HUA722020ALA330_[..]   ONLINE     0     0     0
But when I create a new pool on 16.04 with this (abbreviated):

zpool create tank raidz \
    /dev/disk/by-id/ata-Hitachi_HDS723030ALA640_[..] \
    /dev/disk/by-id/ata-Hitachi_HDS723030ALA640_[..] \
    /dev/disk/by-id/ata-Hitachi_HDS723030ALA640_[..] \
    /dev/disk/by-id/ata-Hitachi_HDS723030ALA640_[..]

I get this with zpool status:

NAME          STATE   READ WRITE CKSUM
tank          ONLINE     0     0     0
  raidz1-0    ONLINE     0     0     0
    sdf       ONLINE     0     0     0
    sde       ONLINE     0     0     0
    sdd       ONLINE     0     0     0
    sda       ONLINE     0     0     0
It looks like zpool followed the symlinks, rather than referencing them.
Is there a way to force zpool on 16.04 to respect my drive references when creating a pool? Or alternatively, are my misgivings about what it's doing here misplaced?
Update: Workaround

I found a thread for zfsonlinux on GitHub that suggested a workaround. Create your zpool with /dev/sdX devices first, then do this:

$ sudo zpool export tank
$ sudo zpool import -d /dev/disk/by-id -aN

I would still prefer to be able to do this with the initial zpool create, though, if possible.
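To sanity-check which /dev/sdX node each by-id name currently resolves to, an illustrative one-liner (not from the original post):

$ for id in /dev/disk/by-id/ata-*; do printf '%s -> %s\n' "$id" "$(readlink -f "$id")"; done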
- Admin about 8 years ago: It doesn't matter how you create them. If it reverts to /dev/sd? device names, the zpool export and zpool import -d will work anyway. BTW, unless you really need every byte of space, use two mirrored pairs rather than raidz. raidz's performance is better than raid-5 but still much worse than raid-10 or ZFS mirrored pairs. It's also easier to expand a pool made up of mirrored pairs: just add two disks at a time. With raidz, you have to replace each of the drives with larger ones, and only when you've replaced all of them will your pool have more space available.
- Admin about 8 years ago: I still have some raid-z pools, and regret having made them. When I can afford to buy replacement disks, I'll create new pools with mirrored pairs and use zfs send to copy my data to the new pools. Actually, raid-z is OK for my MythTV box, where performance isn't critical unless I'm running 6 or 8 transcode jobs at once. Changing to mirrored pairs would be very noticeable on the pool where my /home directory lives.
- Admin about 8 years ago: Oh, and add a pair of SSDs, partitioned to give a mirrored pair of smallish (4GB or so is plenty) log (i.e. ZIL, or ZFS Intent Log) devices, and two large (the remainder of the SSDs?), non-mirrored cache devices for L2ARC. [A sketch of the corresponding zpool add commands appears at the end of this page.]
- Admin about 8 years ago: @cas Keep in mind that log and cache devices have completely different usage patterns: the first is hit by a large amount of data and needs high endurance/TBW as well as low latency and power-loss-protection capacitors; mirroring is optional, for safety. The second needs high read IOPS, and mirroring is only useful for availability and for not losing the cache (if you don't use Solaris 11, which has persistent L2ARC). I would suggest splitting instead of mirroring, so you get the best for each use case.
- Admin about 8 years ago: The mirroring of the ZIL is so you can get away with using ordinary cheap SSDs rather than expensive ones with large capacitors to guard against power loss. IMO, mirroring of the ZIL is not optional, no matter what kind of SSDs you have: if your ZIL dies, you lose all the yet-to-be-written data in it and potentially corrupt your pool. As for L2ARC, I specifically said NOT to mirror it... mirroring the L2ARC cache is a waste of time, money, and good SSD space (and it would do nothing to prevent losing the cache; where did you get that idea from?)
- Admin about 8 years ago: A basic Q turned into a meta ZFS discussion, hah, but some interesting advice, thanks. I usually use mirrored pairs, but this is a dumb backup HP MicroServer Samba target where performance isn't an issue and money is tight. Works just fine.
- Admin about 8 years ago: :) BTW, my brain wasn't working right when I explained the reason for mirroring the ZIL. It's not to guard against power loss; that's complete nonsense and I should never have said it. It's to guard against failure of the ZIL drive, i.e. a raid-1 mirror for the ZIL. Two reasonably priced SSDs are, in general, better than one extremely expensive one (unless the more expensive SSD has a much faster interface, like PCI-e vs SATA). And a UPS is essential... cheap protection against power loss.
- Admin about 8 years ago: @cas Mirrored ZIL protects against SLOG device failure at the same time as an unexpected shutdown. Under normal operation, the ZIL is write-only, and writes to persistent storage come from RAM (the ARC). If the system shuts down unexpectedly, the intent log (ZIL, SLOG) is used to finish the writes that were interrupted. Only if the unexpected shutdown coincides with the failure of a SLOG device do you need a redundant SLOG to recover the interrupted writes. For most non-server (and many server) workloads, a SLOG is overkill, as the ZIL really only comes into play with synchronous writes.
- Admin almost 4 years ago: Does this work on ZFS root pools? I'm using Proxmox and they also use /dev/sda, etc.
- Wouter almost 4 years ago: In your case you got disks named ata-..., but in my case they're called wwn-.... I understand there's a nuance, but I find ata- more practical because the serial number is in the name. Can you tell me how to switch wwn into ata?
- Steve O almost 4 years ago: I suppose the status names depend on how you imported the disks. The names in /dev/disk/by-id are assigned by the OS (Ubuntu in my case). How those names are derived is another subject.
- Wouter almost 4 years ago: I read that you can move/delete the wwn* files and just run the command again. This time the ata* names will be discovered and used. I have not tried it yet.
- Steve O almost 4 years ago: Hmmm... sounds pretty sketchy to me. I wouldn't try that in a production environment; I'd make a disposable VM for that little experiment.
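A gentler variant of that experiment, sketched here as an untested suggestion rather than a verified recipe: instead of moving or deleting the wwn-* files, point zpool import at a scratch directory that contains only the ata-* symlinks, so those are the only names it can choose (tank stands in for your pool name):

$ mkdir /tmp/zfs-ata
$ ln -s /dev/disk/by-id/ata-* /tmp/zfs-ata/
$ sudo zpool export tank
$ sudo zpool import -d /tmp/zfs-ata tank

And for the earlier comment about adding log and cache devices, a hedged sketch of what those zpool add commands can look like, assuming two SSDs each already split into a small first partition (for the mirrored log) and a large second partition (for cache); the ata-SSD_A/ata-SSD_B by-id names are placeholders:

$ sudo zpool add tank log mirror /dev/disk/by-id/ata-SSD_A-part1 /dev/disk/by-id/ata-SSD_B-part1
$ sudo zpool add tank cache /dev/disk/by-id/ata-SSD_A-part2 /dev/disk/by-id/ata-SSD_B-part2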