Optimizing for Virtualization, Part 2
by Liz van Dijk on June 29, 2009 12:00 AM EST - Posted in IT Computing
Layout changes: noticing a drop in sequential read performance?
When considering virtualization of a storage system, an important step is researching how the actual LUN layout changes when migrating to a virtualized environment. Adding an extra layer of abstraction and consolidating everything into VMDK files can make it complicated to track how it all maps to the actual physical storage.
By consolidating previously separate storage systems into a single array servicing multiple VMs, we notice a change in access patterns. Because of the queueing systems at every level (guest, ESX, and the array itself), reads that are supposed to be sequential within a single system get interleaved with the reads of other VMs, resulting in a random access pattern at the array. Keep this in mind when you notice a drop in otherwise solidly performing sequential read operations.
An important aspect of these queues is that they can be tuned. If, for example, one would like to run a test on a single VM that really needs to get as much out of the LUN as possible, ESX can temporarily resize its queues to fit the requirements. By default, the queue depth is 32 outstanding I/Os per VM, which should be optimal for a standard VM layout.
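For those who want to experiment, the snippet below is a minimal sketch of how these knobs can be inspected and changed from the ESX service console. The value 64 is purely illustrative, not a recommendation, and the HBA module name depends on your adapter:

    # Check how many outstanding I/Os ESX allows per VM on a shared LUN
    esxcfg-advcfg -g /Disk/SchedNumReqOutstanding

    # Temporarily raise it, e.g. for a single-VM benchmark run
    esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

    # The HBA queue depth should be raised to match; the module name
    # depends on the HBA (here assuming a QLogic adapter)
    esxcfg-module -s ql2xmaxqdepth=64 qla2xxx

Remember to set these back to their defaults afterwards, as raising them for one VM can starve the others sharing the LUN.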
Linux database considerations
For Linux database machines specifically, there is another important factor to consider (Windows takes care of this automatically). It is something we need to be extra careful about during the development of vApus Mark II, the Linux version of our benchmark suite, but anyone running databases under Linux should be familiar with it. It is generally recommended to cache as much of the database as possible in the database's own cache, rather than letting the more general OS buffer cache take care of it. This recommendation matters even more when virtualizing the workload, as managing the file system buffer pages is costlier under a hypervisor.
This has to be configured inside the database system itself, however, and usually comes down to choosing O_DIRECT mode as the preferred method of accessing storage (in MySQL, this means setting innodb_flush_method to O_DIRECT).
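As a concrete illustration, here is a minimal my.cnf sketch; the buffer pool size is an arbitrary example and should be sized to the VM's memory rather than copied verbatim:

    [mysqld]
    # Bypass the OS buffer cache for InnoDB data files
    innodb_flush_method = O_DIRECT
    # Give that memory to InnoDB's own cache instead (example value)
    innodb_buffer_pool_size = 2G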
13 Comments
zdzichu - Tuesday, June 30, 2009
True, for quite some time Linux has been tickless and doesn't generate unneeded timer interrupts. This change went into 2.6.21, which was released TWO YEARS ago. http://kernelnewbies.org/Linux_2_6_21#head-8547911...

yknott - Tuesday, June 30, 2009
Technically Linux is NOT tickless; dynticks only means that when no interrupts are occurring and the CPU is idle, no timer interrupts are fired. When the CPU is in use, tick interrupts are still fired at 1000Hz.

To your point, this is still a huge advantage when it comes to virtualization. Most of the time CPUs are idle, and not having the underlying hypervisor process ticks from each idle VM leaves more processing power for the VMs that DO need the CPU time.
I also agree that Red Hat definitely needs to keep up with the kernel patches. I understand that there is some lag due to regression testing and such, but two years seems a bit much.
yknott - Monday, June 29, 2009
Thornburg, I think what Liz was talking about has to do with the tick interrupt under Linux. Since the 2.6.x kernels, this has defaulted to 1000Hz, or 1000 times a second.
I don't believe that means you shouldn't use Linux, as you can change this tick rate either in the kernel or at boot time. For example, under RHEL 5, just set divider=10 in your boot options to get a 100Hz tick rate.
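For what it's worth, here's a sketch of what that looks like on the kernel line in a RHEL 5 grub.conf; the kernel version and volume names are just placeholders from a typical install:

    # /boot/grub/grub.conf (RHEL 5): append divider=10 to the kernel line
    kernel /vmlinuz-2.6.18-128.el5 ro root=/dev/VolGroup00/LogVol00 divider=10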
You can read more about this in VMware's timekeeping article here: http://www.vmware.com/pdf/vmware_timekeeping.pdf

Check out pages 11-12 for more info.
Liz, while that paragraph makes sense, perhaps it doesn't tell the whole story about tick rate and interrupts under VMware. While I agree that running at a lower tick rate is ideal, it's perhaps worth mentioning that the interrupt rate is adjustable on most OSes.