How to unmount /var and /usr safely on systemd without rebooting

mount systemd rsync reboot kexec

6,339

WARNING: these operations are extremely risky and I give no warranty. Make sure you understand every single step and apply it at your own risk; if you do not, you will probably break your system, and I am not liable for any damage arising from its use.

I cloned all the server's disks to a local virtual machine on my workstation so I could try the commands without affecting the original server; if you are unsure, you can do the same. After extensive research and several experiments I finally found a solution.

Some systems run systemd in the initrd. After mounting the target root, it switches the systemd environment to it, much as you would with chroot: all services are shut down and then restarted in the new environment.

We can take advantage of this feature to change the root again. Switching to a cloned environment on another disk releases the locks the system holds on the old environment, so you can safely unmount those mount points - in fact, most of them will not even be mounted. However, the cloned environment must have the correct configuration to bring the network up again and give ssh access. If it does not, consider yourself doomed.

It is important to use a separate partition: if you switch root to a directory inside another partition, rather than to a device of its own, the system can still hold the parent partition's device. For this use case the new root must not keep any of the old devices busy.
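One rough way to check this requirement before switching is to compare device numbers: a directory that is the root of a separate file system reports a different device ID than the current root. This is a minimal sketch, assuming GNU `stat`; the function name `on_separate_device` is made up for illustration, and a bind mount of the same file system would not pass this check.

```shell
# Hypothetical helper: succeed only if the given path lives on a
# different device than the current root file system.
# Returns 2 if either path cannot be inspected.
on_separate_device() {
    target_dev=$(stat -c %d "$1") || return 2
    root_dev=$(stat -c %d /) || return 2
    [ "$target_dev" != "$root_dev" ]
}

# Example (the path is a placeholder for the clone's mount point):
#   on_separate_device /path/to/clone || echo "not a separate device" >&2
```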

If you have unallocated free space, or can shrink some partitions to make room for the cloned environment, that is the recommended option. If you have no free storage but enough memory, you can create a memory-backed mount point instead. A network file system is not recommended, since the connection can be lost during the procedure.

A memory-backed mount point can be created with tmpfs, but that memory stays locked by the old environment, so you cannot free it without a reboot or kexec. A better approach is zram: besides creating a memory block device, it compresses the data, so writing 5G of data uses much less than 5G of memory. You still need enough memory to hold the compressed data.

For instance, if 7G is enough, you can create it as follows:

sudo modprobe zram num_devices=4
echo 7G | sudo tee /sys/block/zram0/disksize
sudo mkfs.ext4 -m0 /dev/zram0
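Before settling on a size, it may help to compare the space the file systems to be cloned actually use with the memory currently available. The sketch below is only a rough guide: it ignores compression, the helper names are hypothetical, and the 20% safety margin is an arbitrary assumption.

```shell
# Hypothetical helper: print MemAvailable in KiB from a meminfo-style file
# (defaults to /proc/meminfo).
mem_available_kib() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: succeed if needed_kib plus a margin (in percent,
# an arbitrary safety assumption) fits into avail_kib.
fits_in_memory() {
    needed_kib=$1
    avail_kib=$2
    margin_pct=${3:-20}
    [ $(( needed_kib + needed_kib * margin_pct / 100 )) -le "$avail_kib" ]
}

# Example (GNU df): add up every file system that will be copied.
#   used=$(df -k --output=used / /usr /var | awk 'NR > 1 { s += $1 } END { print s }')
#   fits_in_memory "$used" "$(mem_available_kib)" || echo "not enough RAM" >&2
```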

After provisioning the partitions required for the cloned environment, here is how it works:

mkdir /tmp/sys
sudo mount /dev/zram0 /tmp/sys # or the target device in place of zram0
# mount other separate partitions if required
sudo tar -cpSf - \
    --acls --xattrs --selinux \
    --exclude '/dev/*' \
    --exclude '/run/*' \
    --exclude '/sys/*' \
    --exclude '/proc/*' \
    --exclude '/tmp/*' \
    --exclude '/var/tmp/*' \
    --exclude '/var/run/*' \
    / |
    sudo tar -xvf - \
        --acls --xattrs --selinux \
        -C /tmp/sys
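If the clone lives on zram, the kernel's statistics can show how much memory it really occupies once the copy finishes. This is a minimal sketch, assuming the `mm_stat` layout from the kernel's zram documentation, in which the first three fields are the original data size, the compressed data size and the total memory used, all in bytes; the function name is hypothetical.

```shell
# Hypothetical helper: summarize a zram mm_stat file.
# Fields used: $1 = orig_data_size, $2 = compr_data_size, $3 = mem_used_total.
zram_usage() {
    awk '{
        ratio = ($2 > 0) ? $1 / $2 : 0
        printf "stored=%d MiB used=%d MiB ratio=%.2f\n", $1 / 1048576, $3 / 1048576, ratio
    }' "${1:-/sys/block/zram0/mm_stat}"
}

# Example: zram_usage /sys/block/zram0/mm_stat
```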

It is recommended to back up the clone's fstab and then empty it before performing the switch, so that the new root does not try to mount the old partitions:

sudo cp -a /tmp/sys/etc/fstab /tmp/sys/etc/fstab-
sudo truncate -s0 /tmp/sys/etc/fstab
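A quick sanity check of the clone before switching might look like the sketch below. It only verifies what the preceding steps should have produced: the backup `fstab-` exists, the live `fstab` is empty, and the directories whose contents were excluded from the copy still exist as mount points. The function name is hypothetical, and passing this check does not guarantee that network and ssh will come back.

```shell
# Hypothetical pre-flight check for a cloned root (e.g. /tmp/sys).
check_clone() {
    root=$1
    [ -s "$root/etc/fstab-" ] || { echo "missing fstab backup" >&2; return 1; }
    [ -e "$root/etc/fstab" ] && [ ! -s "$root/etc/fstab" ] ||
        { echo "fstab is missing or not empty" >&2; return 1; }
    for dir in dev proc sys run tmp; do
        [ -d "$root/$dir" ] || { echo "missing /$dir" >&2; return 1; }
    done
}

# Example: check_clone /tmp/sys && echo "clone looks sane"
```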

If any partition you intend to change holds a swap area, disable swap first:

sudo swapoff -a
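To confirm that no swap remains active, `/proc/swaps` can be inspected; it holds one header line followed by one line per active swap area. A small sketch with a hypothetical function name:

```shell
# Hypothetical helper: print the active swap areas listed in a
# /proc/swaps-style file, skipping the header line.
active_swaps() {
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Example: [ -z "$(active_swaps)" ] || echo "swap still active" >&2
```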

Then, to perform the chroot:

sudo mkdir /sysroot
sudo mount --rbind /tmp/sys /sysroot
sudo touch /etc/initrd-release
sudo systemctl --no-block isolate initrd-switch-root
# it will stop all other services (isolate) and call systemctl switch-root /sysroot

Note that you do not need to bind proc and dev as you normally would for a chroot; systemd does it for you. You may lose your connection. If you wait a while and are still unable to connect, you have my condolences: you are done for.

If you were lucky and are able to connect, you can now make the partition table changes you want.
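Before touching the old partitions, it is worth confirming that the root file system really is the clone now. This is a minimal sketch that reads a `/proc/self/mountinfo`-style file and prints the source device of the last mount at `/`; the function name is hypothetical.

```shell
# Hypothetical helper: print the source of the last mount whose mount
# point is "/" in a mountinfo-style file. In that format the mount point
# is field 5 and the source is the second field after the "-" separator.
root_source() {
    awk '$5 == "/" {
        for (i = 7; i <= NF; i++)
            if ($i == "-") { src = $(i + 2); break }
    }
    END { if (src != "") print src }' "${1:-/proc/self/mountinfo}"
}

# Example: [ "$(root_source)" = /dev/zram0 ] || echo "still on the old root" >&2
```

Where util-linux is installed, `findmnt -n -o SOURCE /` should report the same information.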

It is important to adjust the device paths and UUIDs (see blkid) to match the new ones:

sudo mv /etc/fstab- /etc/fstab
sudo vi /etc/fstab
sudo vi /etc/default/grub
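A stale UUID in fstab is easy to miss. The sketch below lists every `UUID=` entry in an fstab-style file that has no matching entry under `/dev/disk/by-uuid`; the function name is hypothetical, and the by-uuid directory is assumed to be populated by udev.

```shell
# Hypothetical helper: print UUIDs referenced in an fstab-style file that
# have no matching entry in the by-uuid directory.
missing_uuids() {
    fstab=${1:-/etc/fstab}
    by_uuid=${2:-/dev/disk/by-uuid}
    # Comment lines never match; only a first field starting with UUID= does.
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        [ -e "$by_uuid/$uuid" ] || echo "$uuid"
    done
}

# Example: missing_uuids /etc/fstab   # no output means every UUID resolves
```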

After making your changes, you can use the same strategy to switch back to the real root device. Once that is done, do not forget to install the bootloader for the new environment, so the system can come up again after a shutdown:

sudo grub2-install /dev/sda # or the intended bootable disk
sudo grub2-mkconfig -o /etc/grub2.cfg

Once you have switched back to your physical devices, if you were using zram you can release the memory as follows:

echo 1 | sudo tee /sys/block/zram0/reset
sudo modprobe -r zram

Remember, do it at your own risk.

Tiago Pimenta

Updated on September 18, 2022

Comments

Tiago Pimenta over 1 year

I have a Linux server on a VM on which a reboot behaves like a power-off, due to a misconfiguration by a third-party provider. I do not have access to the VM configuration.

The person who installed the system made a mess of the storage and irresponsibly created a separate mount point for each directory (/var, /home, /usr, etc.), so some of them easily run out of space while others sit empty.

To fix that mess I am reorganizing the mount points. I was able to handle most of them by doing mount --bind / /mnt followed by rsync, and then restarting the processes that use them after umount.

The problem is /var and /usr, which are used by the systemd init process itself. Would systemd-remount-fs do the trick? How could I perform that? Would a simple fstab edit followed by rsync be enough? Will it restart all the services?

I know which mount points really need separate partitions in my case, and /var and /usr are not among them.

The constraint is that I cannot use umount -l, as I will have to destroy the partition after remounting, and I would like to avoid kexec because I do not know whether it will have the same buggy effect on this misconfigured VM and leave it unable to come up again.

I am planning to have a compressed btrfs partition for /var/log and another btrfs or xfs partition for /var/lib/docker, and to put all the others together with as little space as possible, since they will be almost static. In the future I may turn them into squashfs together with the root and mount an overlayfs on top, to make misconfigurations easy to detect. I would like to do all of this without rebooting, though I do not know whether I will be able to.
- JdeBP almost 6 years
  
  You seem to be building an entire problem upon a misapprehension that /run is backed by a disc volume.
- Tiago Pimenta almost 6 years
  
  @JdeBP indeed, my mistake, /run is tmpfs, but /var is not, and it locks me.
- doneal24 almost 6 years
  
  Why do you say irresponsible when multiple best-practices documents (e.g., CIS Benchmark) mandate separating /var, /home, /usr, etc., into different file systems? Maybe better planning on disk sizes is necessary (lvm to the rescue) but the concept is sound.
- Tiago Pimenta almost 6 years
  
  @DougO'Neal The amount of space used varies a lot between them; I would have to be constantly rebalancing the allocation depending on the server's needs. I would rather not worry about some partitions by putting them together, and give a large amount to those that really need it. For instance, /var/log in my case can easily grow by more than 10 GB during the day, and logrotate compression at the end of the day frees almost all of it again.
- Tiago Pimenta almost 6 years
  
  I also have /var/lib/docker, which varies a lot during build and deploy, even with dangling images cleaned up after the process. As you can see, having a lot of partitions is a waste of space in my case.
- doneal24 almost 6 years
  
  @TiagoPimenta Unless you are very tight on disk space you should not be having these problems. And I'd rather not have my systems grind to a halt caused by a full disk because some user filled up the root partition by downloading docker images - I'll put /var/lib/docker in a separate partition always.
- Tiago Pimenta almost 6 years
  
  @DougO'Neal Unfortunately that is the case. The amount of storage available for the project does not depend on me; it is limited, and asking for more is a pain in the neck. Put that way, every gigabyte is too precious to waste on static partitions, and you know ext4 reserves 5% on each. I would save some space if I put all the static partitions together, don't you think? By the way, we already have /var/lib/docker separated, but it is starving because of this huge number of scattered partitions.
- Tiago Pimenta almost 6 years
  
  I know I could rearrange them to shrink each partition to the minimum it uses, but then a small variation could put my entire system at risk. That risk would be drastically reduced if they were put together with some free space as a precaution. In the far future I am also planning to turn them into squashfs to prevent these problems.
- doneal24 almost 6 years
  
  @TiagoPimenta In most organizations people cost a lot more than disk space. I don't overallocate space when creating VMs but I do give people enough to get their work done. I'm sorry you're in an organization that reverses these priorities. Good luck.
- Tiago Pimenta almost 6 years
  
  Let us continue this discussion in chat.
Tiago Pimenta over 5 years

I am building a tool to help with this: github.com/tiagoapimenta/online-linux-format