Linux on VMware - why use partitioning?


Solution 1

This is an interesting question...

I don't think there's a definitive answer, but I can give some historical context on how best-practices surrounding this topic may have changed over time.

I've had to support thousands of Linux VMs deployed in various forms across VMware environments since 2007. My approach to deployment has evolved, and I've had the unique (sometimes unfortunate) experience of inheriting and refactoring systems built by other engineers.

The old days...

Back in the day (2007), my early VMware systems were partitioned just like my bare-metal systems. On the VMware side, I was using split 2GB thick files to hold each VM's data, and didn't even think about the notion of multiple VMDKs, because I was just happy that virtualization could work at all!

Virtual Infrastructure...

By ESX 3.5 and the early ESX/ESXi 4.x releases (2009-2011), I was running Linux, partitioned as normal, atop monolithic thick-provisioned VMDK files. Having to preallocate storage forced me to think about Linux design much as I would with real hardware. I was creating 36GB, 72GB, and 146GB VMDKs for the operating system, partitioning out the usual /, /boot, /usr, /var, and /tmp, then adding another VMDK for the "data" or "growth" partition (whether that was /home, /opt or something application-specific). Again, the sweet spot in physical hard disk sizes during this era was 146GB, and since preallocation was a requirement (unless using NFS), I needed to be conservative with space.

The advent of thin provisioning

VMware developed better thin-provisioning features in later ESXi 4.x releases, and this changed how I began to install new systems. With the full feature set arriving in 5.0/5.1, a new type of flexibility allowed more creative designs. Mind you, this was keeping pace with the increased capabilities of virtual machines, in terms of how many vCPUs and how much RAM could be committed to an individual VM. More types of servers and applications could be virtualized than in the past. This was right as computing environments were starting to go completely virtual.

LVM is awful...

By the time full hot-add functionality at the VM level was in place and common (2011-2012), I was working with a firm that strove to maintain uptime for their clients' VMs at any cost (stupid). So this included online VMware CPU/RAM increases and risky LVM disk resizing on existing VMDKs. Most Linux systems in this environment were single VMDK setups with ext3 partitions on top of LVM. This was terrible because the LVM layer added complexity and unnecessary risk to operations. Running out of space in /usr, for instance, could result in a chain of bad decisions that eventually meant restoring a system from backups... This was partially process and culture-related, but still...

Partition snobbery...

I took this opportunity to try to change things. I'm a bit of a partition snob in Linux and feel that filesystems should be separated for monitoring and operational needs. I also dislike LVM, especially with VMware and the ability to do what you're asking about. So I extended the use of additional VMDK files to any partition that could potentially grow: /opt, /var, /home could get their own virtual machine disk files if needed. And those would be raw disks -- no partition table, just a filesystem on the whole device. Sometimes this was an easier way to expand a particular undersized partition on the fly.
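
As a minimal sketch of that raw-disk approach -- assuming the hot-added VMDK shows up as /dev/sdb and is destined for /opt (both hypothetical) -- the filesystem goes straight onto the whole device:

    # Assumption: the new VMDK appears as /dev/sdb (the device name will vary per system).
    mkfs.ext4 /dev/sdb                                        # filesystem on the whole disk, no partition table
    echo '/dev/sdb  /opt  ext4  defaults  0 2' >> /etc/fstab  # or reference the filesystem UUID instead
    mount /opt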

Obamacare...

With the onboarding of a very high-profile client, I was tasked with designing the Linux VM reference template that would be used to build their extremely visible application environment. The security requirements of the application called for a unique set of mounts, so I worked with the developers to cram the non-growth partitions onto one VMDK, and then added separate VMDKs for each mount that had growth potential or specific requirements (encryption, auditing, etc.). So, in the end, these VMs consisted of 5 or more VMDKs, but provided the best flexibility for future resizing and protection of the data.

What I do today...

Today, my general design for Linux and traditional filesystems is OS on one thin VMDK (partitioned), and discrete VMDKs for anything else. I'll hot-add as necessary. For advanced filesystems like ZFS, it's one VMDK for the OS, and another VMDK that serves as a ZFS zpool and can be resized, carved into additional ZFS filesystems, etc.
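
As a rough sketch of the ZFS variant (the pool name, filesystem names, device, and rescan path are placeholders that depend on your layout):

    # Assumption: the second VMDK appears as /dev/sdb and the pool is named "data".
    zpool create data /dev/sdb              # hand the whole disk to ZFS
    zfs create data/app                     # carve out additional filesystems as needed
    zfs create data/home

    # After growing that VMDK in vCenter, pick up the new size and expand the pool online:
    echo 1 > /sys/class/block/sdb/device/rescan
    zpool online -e data /dev/sdb           # expand the vdev into the new space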

Solution 2

When I worked in infrastructure at a particular "large virtualization software company," we often needed to increase the size of a VM's filesystem. We used ext3/4 at the time.

Increasing the virtual disk was very easy, picking up the new device size in a live OS was relatively easy (poke around in /sys), and resizing the ext3/4 filesystem live was easy, but what always seemed impossible to do live was resizing the partition.

You had to use gparted or rewrite/resize the partition table using fdisk -- but it was always locked by the kernel and required a reboot to get the kernel to pick up the new layout (partprobe didn't do it either).

I moved many systems to LVM and resizing filesystems became an easy, almost pleasant, experience!

  • Increase the virtual disk image outside the VM
  • In the VM,
    • Poke /sys to rescan the disk metrics (echo "1" > /sys/class/scsi_device//device/rescan)
    • pvresize /dev/sdX (resize the physical volume in LVM)
    • lvresize --extents +100%FREE /dev/VG/lvolXX (resize the logical volume in LVM)
    • resize2fs (resize the filesystem)

All of this could be done safely on a live system -- and no reboot required!
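
Put together as shell commands, the sequence might look like this (the SCSI address, device name, and VG/LV names are placeholders and will differ on your system):

    # Assumptions: the grown VMDK is /dev/sdb at SCSI address 0:0:1:0; VG/lvol0 are example LVM names.
    echo 1 > /sys/class/scsi_device/0:0:1:0/device/rescan   # pick up the new device size
    pvresize /dev/sdb                                        # grow the LVM physical volume
    lvresize --extents +100%FREE /dev/VG/lvol0               # grow the logical volume into the free space
    resize2fs /dev/VG/lvol0                                  # grow the ext3/4 filesystem online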

Why not a bare disk? It makes me nervous - I don't feel that bare disks are widely accepted enough yet, but I think we're on the verge of much wider acceptance. There was a thread on the btrfs mailing list related to this:

http://www.spinics.net/lists/linux-btrfs/msg24730.html

But a bare disk would just need the rescan and resize2fs.
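
In other words, something like this sketch (same placeholder device and SCSI address as above):

    # Assumption: ext4 sits directly on the unpartitioned /dev/sdb.
    echo 1 > /sys/class/scsi_device/0:0:1:0/device/rescan   # pick up the new size
    resize2fs /dev/sdb                                       # grow the filesystem online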

So, in summary, yeah, avoid partition tables if you can.

Solution 3

You're right in many ways; I can see the argument. There is one issue that could prove tricky, though: if you use Resource Pools (and I know I don't, hateful things), then VMs can get more IO time if they have more disks -- in extremely resource-constrained situations, a VM with two disks could get twice as much IO resource as one with a single disk. This may well not be an issue for you, but I thought I'd point it out.

Edit - oh and it would make snapping slightly slower too, but again that might not be an issue.

Solution 4

There is another option: mount the application data on NFS volumes. This requires good filers (not all NFS implementations are the same).

When the NFS volumes fill up, expand the volume on the filer and the Linux client will see the extra space right away.
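
For illustration, a hypothetical /etc/fstab entry for such a volume might look like this (the filer name and export path are placeholders):

    # Assumption: "filer01" exports /vol/appdata over NFSv3.
    filer01:/vol/appdata  /srv/appdata  nfs  rw,hard,vers=3  0 0

    # After the volume is grown on the filer, `df -h /srv/appdata` shows the new size
    # with no client-side action needed.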

Your application and vendor must support having their data on NFS, and you need a careful NAS design, but the same is true of every storage solution for your virtualized environment.

Another bonus of this approach is that if your storage vendor has snapshotting/cloning technology (like ZFS or NetApp), backing the data up and creating test/dev environments is really easy.

Solution 5

Whether it's better to do this or not depends on your system.

There are pros and cons of each setup.

However, the main advantages of a single drive are as follows:

  1. Simplicity: A single drive has a single file, which can be easily distributed and replicated.
  2. Cues to the Host OS: A single file will be treated as a single block of data, and hence the host OS will know that sequences of guest machine access will all be in that one file. This can be achieved on some host OS configurations by simply placing all the drive images in the same file, but it won't necessarily be the case.

However, there are advantages to multi-drive.

  1. Bare-metal affinity / manual placement: With a single drive, everything is tied to the placement of that one drive; with separate drives you can place each one on different underlying storage.
  2. Size limitations: If your system has limits on the size of the drive or on files, you could hit them on very big systems.
  3. Read-only volumes for security: This is the big advantage. If the master volume for the OS is read-only on the VM side, it provides major security benefits, essentially locking out programs inside the VM from editing the guest's base OS. Using a separate data drive allows you to keep the OS drive read-only (booting it read-write only for maintenance and template updates), preventing modification of vital OS directories from within the server altogether (see the sketch below).
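
As a minimal sketch of that read-only idea (device names and mount points are hypothetical, and a distribution may need extra tmpfs mounts for places like /tmp to boot happily with a read-only root):

    # Assumption: /dev/sda1 holds the OS, /dev/sdb is the separate data disk.
    # /etc/fstab -- OS volume read-only, data disk read-write:
    /dev/sda1  /      ext4  ro,defaults        0 1
    /dev/sdb   /data  ext4  rw,defaults,nodev  0 2

    # For a maintenance window, temporarily remount the OS volume writable:
    mount -o remount,rw /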

Comments

  • savoche
    savoche over 1 year

    When installing Linux VMs in a virtualized environment (ESXi in my case), are there any compelling reasons to partition the disks (when using ext4) rather than just adding separate disks for each mount point?

    The only one I can see is that it makes it somewhat easier to see if there's data present on a disk with e.g. fdisk.

    On the other hand, I can see some good reasons for not using partitions (for other than /boot, obviously).

    • Much easier to extend disks. You just increase the disk size for the VM (typically in vCenter), rescan the device in the VM, and resize the file system online.
    • No more issues with aligning partitions with underlying LUNs.

    I have not found much on this topic around. Have I missed something important?

    • Chopper3
      Chopper3 over 9 years
      Oh and I just wanted to comment how impressed myself and some of the other 'high-rep' users of SF were with your first question. We get accused sometimes of beating up new guys but it's really just that a lot of new users don't read what we're about and what we're not - so I thought I should just say thanks for asking an appropriate question in a well written and considered way :)
    • FooBee
      FooBee over 9 years
      I have two remarks though: 1) VMWare is not a product but a company. VMWare ESXi would be a product. 2) I would edit this question to be about virtualized environments in general, as this is equally relevant for e.g. KVM, Xen and HyperV.
    • savoche
      savoche over 9 years
      Thanks. And I have edited the wording to be a bit more general.
    • ewwhite
      ewwhite over 9 years
      @savoche you should mark an answer.
  • c4f4t0r
    c4f4t0r over 9 years
    I don't use partitions except for the root disk.
  • savoche
    savoche over 9 years
    Multi-drive also lets you have (on ESXi at least) some disk files in independent mode. That way you can avoid including e.g. temporary data in snaps and snap based backups.
  • the-wabbit
    the-wabbit over 9 years
    You do not need a reboot to let the kernel re-read the partition table. But you would need to unmount the file system(s) on the resized device (which is tricky if it is the / partition). Other than that, partition tables mostly serve documentary purposes - everybody and his uncle would run fdisk -l (or the corresponding equivalent) to see what an unknown disk is about. If it is unpartitioned, it could easily be mistaken for "empty" and overwritten. This is the reason why I always create a partition table for disks. LVM is evil, though.
  • rrauenza
    rrauenza over 9 years
    That is not my experience on these specific VMs, although it has worked in the past on others. Unmounting the fs didn't free up the lock. Maybe it was just Centos5, I dunno. I was stumped. In a partition world, LVM is awesome. In the new btrfs/zfs world, it is obsolete. IMHO, of course.
  • Clément Perroud
    Clément Perroud over 9 years
    You've spent a lot of time describing what you did in the past (which is irrelevant today), and you forgot to mention whether you partition the extra VMDKs or lay the filesystem directly on top of them. BTW, what's wrong with LVM? It has never failed me; maybe you didn't feel comfortable using it, but it's an awesome addition to Linux (as long as we don't have native ZFS).
  • ewwhite
    ewwhite over 9 years
    Extra VMDKs being added as a mountpoint don't get partitioned.
  • ewwhite
    ewwhite over 9 years
    @JakovSosic Also, with LVM... I've seen a lot of things at scale that many Linux admins haven't. LVM doesn't pass the helpdesk test for me. I've seen enough LVM signature, performance, and misconfiguration issues to convince me not to use it in MY designs. I'm not saying it doesn't work. But my preference is ZFS anyway.
  • Clément Perroud
    Clément Perroud over 9 years
    At scale or not (1000 or 10 machines) - doesn't really have anything to do with LVM. Performance penalty occurs only if you misalign LVM which can be said of simple partitions and mkfs options too. So I really didn't hear any concrete argument against it...
  • ewwhite
    ewwhite over 9 years
    @JakovSosic This isn't the place, dude.
  • Clément Perroud
    Clément Perroud over 9 years
    Place for what? I'm just trying to get arguments against LVM and not personal opinions.
  • ewwhite
    ewwhite over 9 years
    @JakovSosic Please feel free to read LVM Dangers and Caveats
  • Clément Perroud
    Clément Perroud over 9 years
    Every technology has its limits, and nothing on that page supports your claim that LVM is awful. I would recommend that you modify that part of your answer, because it's more FUD than beneficial info. PS. Sorry if any of my comments sounded harsh; I tend to write in between doing actual work, so I don't often think about how my words might sound to others.
  • codenheim
    codenheim over 9 years
    "Back in the day" was 2007? I was a complimentary license recipient at IBM in 1999 when version 1 shipped. I'm a VM dinosaur :D (waves @BrianKnoblauch). Per your LVM comments, sounds like you are judging it in context of Linux. LVM mature technology in commercial UNIX land for years before Linux. If you had managed top-end Solaris/Sparc/EMC Symmetrix, Linux was like a step down (and still is in many ways). In the days of small disks, LVM made multi-terabyte databases manageable. I've never had the problems you describe, which really sound like people problems, though I can certainly relate.
  • codenheim
    codenheim over 9 years
    +1 in spite of the LVM bashing. Rest of the answer is good stuff from obvious experience.
  • GnP
    GnP over 9 years
    It took a while for me to realize you were actually using lvm inside the VM... Is there a reason you don't use LVM on the host and simply give the guest an lv to use as disk? The steps for resizing would be: resize volume in host, rescan on guest, resize2fs on guest.
  • GnP
    GnP over 9 years
    Why would you need a VG on an LV inside the VM? (note I'm rather new to LVM, I'm not judging your way, just trying to grasp the use of such a setup)
  • rrauenza
    rrauenza over 9 years
    Yes, inside the VM. Since this is under ESX, the virtual disk has to be a VMDK file. Yes, theoretically we could have used a raw disk in the guest.
  • RichVel
    RichVel almost 9 years
    Using a bare disk is so much simpler - removes 2 steps out of 5, with no need to know LVM. Resizing an FS in LVM has been risky though it's getting better: LVM dangers and caveats.
  • Mircea Vutcovici
    Mircea Vutcovici over 6 years
    You could use LVM filter on host to filter out nested LVs.
  • user2066657
    user2066657 almost 6 years
    I like how you've specifically partitioned your swap after the /boot, something I only recently figured out (2008 or so). Keeping even one old bloated kernel image around causes modest /boot parts to stretch out, and feeding sda2 to /boot often gives it enough space. Having it where it is means no relocation of the PV holding root, and that saves a tricky operation that sometimes needs to be done remotely. :-)