What are the drawbacks of running a database inside a virtual machine? How do I overcome them?

Solution 1

Though many DB vendors were very slow to do this, nearly all of them now officially support their software running in a virtualized environment.

We run many Oracle 11g instances on Linux on top of ESXi, and it is certainly possible to get very good performance. As with all hardware scaling, you just need to make sure that the virtualization host has plenty of resources (RAM, CPU), and that your disk layer is up to the task of delivering whatever IO performance you require.

Solution 2

As ErikA says, this is becoming more and more common. I'm in the SQL Server camp and don't personally have any production systems running in VMs, but I would not hesitate to (after a little more study on the topic). There are definitely some things to take into consideration before you go down that path, though (at least for SQL Server). Disk IO (as others have mentioned) and memory allocation are just two examples. Things will differ between hypervisors as well.

Brent Ozar is a recognized expert in virtualizing SQL Server, specifically on VMware. I would highly recommend reading through his material.

http://www.brentozar.com/community/virtualization-best-practices/

Solution 3

There is "can" and then there is "should". A Corvette can go 150 mph, but should you drive it that fast on public highways? You can harm yourself unnecessarily.

Databases are guest operating systems. By design, when they start they grab blocks of a resource and manage it directly for performance reasons. As soon as you make the core operating system of the database server a guest in a virtualized hosting environment, you are placing an arbitration layer, the hypervisor, between the block-allocated disk and RAM and the database server. It will slow down. The more inefficient your queries, the more it will slow. These inefficiencies may be masked today on dedicated hardware, but as soon as you introduce arbitration to the resources you depend on, you are going to find out real fast.

What a lot of bean counters who are demanding virtualization fail to recognize is that database servers, as guest operating systems, offer their own consolidation layer. There is no reason why you cannot consolidate multiple logical database instances on one physical server, even to the point of moving IP addresses, setting up additional host names, etc., to allow for this natural coalescing of services to take place. With this model, not only do you retain the cost savings that management is pushing for through a reduced number of physical hosts, but you also retain block access to physical resources without interference from an arbitrary hypervisor, which makes beneficial decisions sometimes and not others.

The same holds true for other guest operating systems, like Java. Virtualization solutions are typically busy environments and the hypervisor has to make lots of decisions on who "gets the token" on a resource. Anytime you can eliminate that layer you are going to be better off.

Coalesce multiple instances using the natural guest operating system layer first. Odds are you will be able to hit your platform consolidation and performance targets more easily.

Solution 4

There are two things to realize here:

  • DB performance per unit of hardware is a bit lower for a virtualized DB. This means you need to buy a little more hardware to get the same level of performance.
  • That doesn't mean the same level, or a desired level, of performance is unobtainable. The gains you get from improved management and other benefits (like easier HA) often more than offset the marginally increased hardware costs.

That said, where I work our SQL Server installation is one of only two servers that I have no intention of virtualizing any time soon (the other is the primary DC).
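The first point above can be turned into rough arithmetic. The sketch below is only an illustration: the function name `hosts_needed` and the 90% efficiency figure are assumptions made up for the example, not measurements from this answer. Measure your own workload before relying on any such number.

```python
import math


def hosts_needed(native_hosts: int, relative_efficiency: float) -> int:
    """Estimate hosts required once virtualized.

    relative_efficiency is the throughput of a virtualized host as a
    fraction of the same host on bare metal (hypothetical input).
    """
    if not 0 < relative_efficiency <= 1:
        raise ValueError("relative_efficiency must be in (0, 1]")
    # Round up: you cannot buy a fraction of a host.
    return math.ceil(native_hosts / relative_efficiency)


# Hypothetical case: 10 bare-metal hosts, VMs deliver 90% of native throughput.
print(hosts_needed(10, 0.90))
```

With these made-up inputs the estimate comes out slightly above the bare-metal count, which is the "little more hardware" the answer refers to.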

Solution 5

Running SQL Server in a VM will be fine, provided that you can give the VM enough resources to run your application. If in the physical world you need 24 cores and 256 GB of RAM, then you need to provide 24 vCPUs and 256 GB of RAM in the virtual world.
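As a minimal sketch of that rule, the check below compares what the workload needs on physical hardware with what a VM has been given. The `Resources` type, the `shortfalls` helper, and the undersized 16-vCPU VM are hypothetical, introduced only for illustration. The 24-core / 256 GB figures come from the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: int    # physical cores, or vCPUs for a VM
    ram_gb: int


def shortfalls(required: Resources, allocated: Resources) -> dict[str, int]:
    """Return how far each resource falls short; an empty dict means the VM is big enough."""
    gaps = {
        "cpus": required.cpus - allocated.cpus,
        "ram_gb": required.ram_gb - allocated.ram_gb,
    }
    return {name: gap for name, gap in gaps.items() if gap > 0}


physical_need = Resources(cpus=24, ram_gb=256)  # figures from the answer
vm = Resources(cpus=16, ram_gb=256)             # hypothetical undersized VM
print(shortfalls(physical_need, vm))            # {'cpus': 8}
```

This only checks allocation on paper. It says nothing about whether the host is oversubscribed, which the comments below identify as a separate problem.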

I just wrote an article in last month's SQL Server Magazine all about running SQL Server under VMware's vSphere.

Russ
Author by

Russ

Updated on September 18, 2022

Comments

  • Russ
    Russ over 1 year

    Running anything inside a virtual machine will have some level of performance hit, but how much does it really impact the performance of a database system?

    I found this academic reference paper with some interesting benchmarks, but it was a limited test using Xen and PostgreSQL only. The conclusion was that using a VM does "not come at a high cost in performance" (although you may think the actual data says otherwise).

    What are the technical, administrative, and other drawbacks associated with running a database within a virtual machine?

    Please post answers that can be backed up by objective facts, I'm not interested in speculation or any other semi-religious argument (geek passion is good in many ways, but that won't help us here).

    That being said,

    • What issues show up when running database in a Virtual Machine? (please post references)
    • Are those issues significant?
      • Are they only significant under certain scenarios?
    • What are the workarounds?
    • Juanjo Daza
      Juanjo Daza over 12 years
      +1 I'm primarily interested in hearing feedback about SQL Server and Windows 2008 R2 scenarios
    • Russ
      Russ over 12 years
      @Shane Madden - Can you please explain the closure a bit? I expect that the motivation was driven by one non-specific answer (which then got derailed in the comments), not the question itself. Regarding the question, 44 votes and 12 favorites within roughly one day of pre-closure existence implies to me that it was a good question with useful answers/information (especially compared to what seems to be typical for ServerFault question traffic). This is what the various SE sites are aiming at. Would you have preferred a more specific question phrasing, vs the loose "how bad is it?".
    • Juanjo Daza
      Juanjo Daza over 12 years
      @ErikA, Shane, Womble, mikeyb, Ben - I made a community edit that may make this question more constructive. Do consider reopening this, or posting a similar question as a new/clean question.
    • Alex
      Alex almost 4 years
      At almost 10 years later, I'm curious about how things look today.
  • Dave M
    Dave M over 12 years
    +1 As noted, it is critical that resources be up to the task. Disk has been the big bottleneck for us, and careful planning is needed.
  • Rajshree Gupta
    Rajshree Gupta over 12 years
    +1 You need to do your homework on the database usage ahead of time. If your physical box is getting hammered above 40% utilization, then the advantages of virtualizing it start to dissolve. That being said, we have tons of small, application-specific, isolated SQL Server instances running on VMs with no problem. But our large heavy-usage machines have dedicated hardware because of the lack of advantage.
  • lynxman
    lynxman over 12 years
    Definitely Disk IO is the big culprit, and what virtualised environments tend to be flaky at.
  • EEAA
    EEAA over 12 years
    @lynxman - Agreed. We run all of our Oracle instances on our Tier 1 SAN disks, which are 15k SAS. From what I can tell, we get very close to native performance.
  • AngerClown
    AngerClown over 12 years
    I assume you need to dedicate SAN LUNs to each VM and potentially a single fibre port to each VM as well? VT-d is probably key too, correct?
  • EEAA
    EEAA over 12 years
    Our database servers typically get their own VMFS lun, but the FC ports are shared with all the other VMs that happen to be on the host. Single LUN-per-VM doesn't matter much in our case anyway, though, as our SAN (Compellent) stripes LUNs across all the disks, so they're all shared anyway.
  • Chris B. Behrens
    Chris B. Behrens over 12 years
    "An ounce of test is worth a pound of guess."
  • ravi yarlagadda
    ravi yarlagadda over 12 years
    Interesting definition of "guest operating system." While your point is taken with regard to pure, unadulterated performance, how often do your databases really bottleneck at the CPU? I/O is much more likely, and for higher performance applications you're already sharing I/O time at a SAN. I would hope that you'd reconsider your virtualization philosophy when a security issue with one application compromises all of your consolidated databases' password hashes, or when one process running within your JVM consumes every byte of available heap space.
  • Jake Oshins
    Jake Oshins over 12 years
    I disagree with your point about always going to the existing consolidation layers first. Sometimes that makes sense. But look, for instance, at the cost tradeoff in rebalancing resources between consolidating multiple databases on a single OS and consolidating multiple database/OS combinations on top of a hypervisor. The first is more efficient. The second is much easier to rebalance. Migrating an OS/database to a new host is much less disruptive than migrating a database to a new OS.
  • James Pulley
    James Pulley over 12 years
    My comments come from direct in-the-field observations of successful and failed migrations to virtualization solutions over the past decade as a performance engineer. There are tons of bad database apps out there whose promiscuous use of hardware masks performance issues. Add virtualization and those issues come to light. If you have an app which demands a precise clock for timing or audit purposes, then with the clock float in software virtualization you are out of the hunt.
  • James Pulley
    James Pulley over 12 years
    With the bean counters making the push, the trend is oversubscription on the virtual machine hosts, which makes the hypervisor's resource allocation decisions almost universally poor for all of the guests. The hypervisor layer is also not as robust on the throughput front as the standard OS drivers, so you do suffer a loss in maximal throughput vs the standard non-virtualized interface.
  • James Pulley
    James Pulley over 12 years
    Databases, JVMs, etc. are defined as guest operating systems because they provide their own namespace for access to resources, they manage resources directly in blocks, and they can run software written for these environments. Databases also tend to have their own file systems for the storage of data. I am not totally virtualization averse, as I use the technology every day as part of a services delivery practice, but where performance is the primary concern (as it is in my field) I do not recommend or deploy virtualized solutions.
  • James Pulley
    James Pulley over 12 years
    For many organizations the use of virtualization, particularly in Microsoft environments, is a lazy solution for platform consolidation. They do not see a clear path to retain the domain or internet namespaces resident on client computers for access to services on remote hosts and so virtualization is an easy solution. Where you have the right knowledge you can easily roll up dozens of individual computers to a single host without virtualization, retain the namespaces, even retain the IP addresses if you wish, and keep client computers blissfully ignorant of the change.
  • James Pulley
    James Pulley over 12 years
    Unfortunately, neither Microsoft nor Novell fully embraced the directory model for both administrator and user namespaces that Banyan pioneered; if they had, platform consolidation would have been very easy. An admin could then simply migrate a service from one host to another and retain the same logical namespace as resolved by the directory server, without users having any knowledge of systems being changed in the background.
  • EEAA
    EEAA over 12 years
    Wow, just wow James. I don't have the time nor patience to trash all of the points you made in your answer and subsequent comments, but I just felt I needed to drop a comment here for anyone that might happen upon this answer. James's views are, well, his own, and don't reflect what is truly possible. If you're oversubscribed then of course you're going to have poor performance. So don't oversubscribe. It's perfectly possible to have a very high-performing virtualization environment. It's folly to make a blanket recommendation against it because it "performs poorly".
  • James Pulley
    James Pulley over 12 years
    "I am not totally virtualization averse as I use the technology every day as a part of a services delivery practice, but where performance is the primary concern (as it is in my field) I do not recommend or deploy virtualized solutions"
  • ravi yarlagadda
    ravi yarlagadda over 12 years
    @James But you're making the generalization that every virtualized environment is oversubscribed due to overzealous bean-counters, and making the assertion that the few-percentage-points difference from native performance is a deal-breaker for most database loads. I understand where you're coming from, but your assertions don't apply well to the modern IT industry as a whole.
  • James Pulley
    James Pulley over 12 years
    I only have a decade's worth of in-the-field observations to draw from. The cases I am called into fall primarily into two categories of poor performance: environments that are horribly oversubscribed, with IT decisions driven by accounting managers unfamiliar with the technology, and applications that are poorly designed or have high clock dependencies and are not well sorted for performance. In all cases the delta from physical to virtual is more than a few percent. In some cases both are present. My observations may be biased by spending most of my time fixing these issues.
  • James Pulley
    James Pulley over 12 years
    Do not get me wrong, I am not trying to "blame" virtualization here. I am a very happy user of the same technology in specific areas of my IT infrastructure. There are bad IT decisions chasing cost savings, led by non-IT folks, and bad applications, and both are leading the way to poor performance in virtualized/cloud environments. These bad decisions keep my organization very busy.