Hyper-V thin provisioning at Virtual and Physical Layers

Thin provisioning makes a host think it has more storage capacity than is physically available by allocating disk space on demand. This mechanism uses storage space more efficiently and helps avoid wasting resources on reservations that might never be used.

The benefits of thin provisioning are clear, but there are some downsides as well. Let’s take a look at a real-life example. We have an application that requires 200 GB of capacity. Within Hyper-V, we create the VM with its operating system disk and then attach a second data disk in VHDX format. When adding the disk, you have the option of creating either a fixed-size disk or a dynamically expanding disk. The dynamically expanding disk lets you present 200 GB of capacity to the guest VM and its operating system, but the underlying file only grows as the application writes data.

A fixed disk, by contrast, writes zeroes across the full 200 GB when it is created, so the space is consumed immediately. The dynamically expanding disk is where the challenges come in: you might fill the maximum size allocated to the VHDX, or you might run out of capacity on the storage array itself. Under the hood, thin provisioning makes the operating system think it has capacity available, so the guest has no way of knowing when the underlying storage is completely exhausted.
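The behavior described above can be sketched with a toy model. This is only an illustration of allocate-on-write, not how Hyper-V or any storage array is implemented; the `ThinPool` class, its method names and the 300 GB pool size are all hypothetical.

```python
class PoolExhausted(Exception):
    """Raised when the physical pool cannot back another write."""


class ThinPool:
    """Toy model of a thin-provisioned pool; all sizes are in GB."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.disks = {}

    def create_disk(self, name, virtual_gb):
        # Creating a thin disk reserves nothing in the physical pool.
        self.disks[name] = {"virtual_gb": virtual_gb, "written_gb": 0}

    def free_seen_by_guest(self, name):
        # The guest only sees its virtual size minus what it has written.
        disk = self.disks[name]
        return disk["virtual_gb"] - disk["written_gb"]

    def write(self, name, gb):
        disk = self.disks[name]
        if gb > self.free_seen_by_guest(name):
            raise ValueError("write exceeds the disk's virtual size")
        if self.used_gb + gb > self.physical_gb:
            # The guest still believes it has room, but the pool is empty.
            raise PoolExhausted(f"pool cannot back {gb} GB for {name}")
        disk["written_gb"] += gb
        self.used_gb += gb


# Hypothetical scenario: two 200 GB disks on a 300 GB pool.
pool = ThinPool(physical_gb=300)
pool.create_disk("app-data", 200)
pool.create_disk("other-data", 200)
pool.write("app-data", 150)
pool.write("other-data", 100)

try:
    pool.write("other-data", 80)  # guest sees 100 GB free, pool has 50 GB
    outcome = "ok"
except PoolExhausted:
    outcome = "pool exhausted"

print(pool.free_seen_by_guest("other-data"), outcome)
```

The point of the sketch is the mismatch in the last write: the check against the disk's virtual size passes, and only the pool-level check fails.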

 

Figure 1. New Virtual Hard Disk Wizard in Hyper-V Manager

 

Once your virtual hard disks outgrow the space available, your LUN won’t have any operating room left and your VMs will suffer downtime. Likewise, when you create a 50 TB LUN on a SAN that only has 30 TB in total capacity, you’ll be in trouble when you actually try to copy that much data onto the LUN.
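A quick way to reason about this risk is to track how far the provisioned capacity exceeds the physical capacity. The sketch below uses the 50 TB and 30 TB figures from the example; the function names and the 80% alert threshold are assumptions chosen for illustration, not vendor guidance.

```python
def overcommit_ratio(provisioned_tb, physical_tb):
    """Total provisioned capacity divided by physical capacity."""
    if physical_tb <= 0:
        raise ValueError("physical capacity must be positive")
    return sum(provisioned_tb) / physical_tb


def needs_attention(consumed_tb, physical_tb, threshold=0.8):
    """True once actual consumption crosses the (assumed) alert threshold."""
    return consumed_tb / physical_tb >= threshold


# One 50 TB LUN carved out of a SAN with 30 TB of real capacity.
ratio = overcommit_ratio([50], 30)
print(f"overcommit ratio: {ratio:.2f}")

# Hypothetical consumption levels on that 30 TB SAN.
print(needs_attention(10, 30))
print(needs_attention(25, 30))
```

A ratio above 1 is not a problem in itself; it only means that actual consumption has to be monitored so that capacity can be added before the physical pool fills up.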

Not all storage devices support thin provisioning, so check with your vendor to confirm that your device is supported. Now let’s take a look at a 10.5 TB LUN on a SAN where we first disabled snapshots at the storage level. The actual space we are using on the SAN is 402 GB; the rest of the 10.5 TB is not allocated.

 

Figure 2. A 10.5 TB LUN with 402.01 GB of actual storage consumed on the SAN

 

We’ll delete about 200GB of files and see that amount of space reclaimed on the SAN.

 

Figure 3. The space reclaimed after UNMAP operation

 

Notice that your SAN is often better than the operating system at identifying how much storage space is actually used. The OS is not aware that a fixed VHDX of 50 GB or 127 GB really holds only 10 GB of data, but the SAN knows. The same applies to dynamically expanding VHDX files whose space has been reclaimed on the SAN but whose file size hasn’t shrunk yet. As a result, on a CSV LUN full of VMs with fixed or dynamically expanding VHDX files, the OS reports much higher storage use than the SAN does. As an extreme example, take a look at the same LUN as above.

 

Figure 4. The OS size reported by CSV vs the actual space consumed on the SAN

 

There’s a nice collection of fixed VHDX files on this CSV that are filled with just an OS and nothing more. So, they only consume the needed space on the thin provisioned LUN within the SAN, but Windows doesn’t know any better and reports the actual fixed VHDX file sizes.
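To put numbers on that gap, the short calculation below compares what Windows reports for this CSV (2.59 TB) with what the SAN actually consumes (402 GB). The helper name is hypothetical, and the conversion assumes binary units (1 TB = 1024 GB), which may not match how the array reports its figures.

```python
GB_PER_TB = 1024  # assumption: binary units


def thin_savings(reported_gb, consumed_gb):
    """Return (space not actually consumed, fraction really consumed)."""
    if reported_gb <= 0:
        raise ValueError("reported size must be positive")
    return reported_gb - consumed_gb, consumed_gb / reported_gb


reported_gb = 2.59 * GB_PER_TB  # usage as reported by Windows on the CSV
consumed_gb = 402               # usage as reported by the SAN

saved_gb, fraction = thin_savings(reported_gb, consumed_gb)
print(f"reported by Windows: {reported_gb:.2f} GB")
print(f"not consumed on the SAN: {saved_gb:.2f} GB")
print(f"fraction really consumed: {fraction:.1%}")
```

Under these assumptions, only about 15% of the space Windows reports as used is backed by real blocks on the array.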

 

Figure 5. Windows reporting 2.59 TB in use while in reality only 402 GB is consumed on the SAN

 

With Windows Server 2012, Microsoft introduced the VHDX file format, which brings better performance, resilience and data protection. Even if your storage array doesn’t offer thin provisioning, you can enjoy similar benefits thanks to the dynamically expanding VHDX format. No storage capacity is consumed until you actually put data on these virtual disks. The virtual disk then claims blocks from the underlying storage as needed to hold the data that is written.

 

Figure 6. The Hyper-V host reports the real size of the dynamically expanding virtual disks consumed

 

The difference in size between an empty VHD and an empty VHDX comes from the new internal structure of the VHDX, which by default grows in blocks of 32 MB versus 2 MB for a VHD. It also allocates blocks as a buffer, so it can serve requests quickly while it expands with new blocks. The file sizes on the Hyper-V host reflect this when we write data to the dynamically expanding virtual disks.
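The effect of the block size can be approximated with a simple rounding model. This is a rough sketch only: it assumes data is written contiguously and ignores file headers, metadata and the buffer blocks mentioned above, so real file sizes on the host will differ. The function name is hypothetical.

```python
import math

VHDX_BLOCK_MB = 32  # default block size of a dynamically expanding VHDX
VHD_BLOCK_MB = 2    # default block size of a dynamically expanding VHD


def allocated_mb(written_mb, block_mb):
    """Payload space allocated for contiguous data, rounded up to whole blocks."""
    if written_mb < 0:
        raise ValueError("written size cannot be negative")
    return math.ceil(written_mb / block_mb) * block_mb


for written in (1, 100):
    print(
        f"{written} MB written -> "
        f"VHDX {allocated_mb(written, VHDX_BLOCK_MB)} MB, "
        f"VHD {allocated_mb(written, VHD_BLOCK_MB)} MB"
    )
```

In this model, a small write costs a full 32 MB block on a VHDX, which is one reason a lightly used VHDX file looks larger on the host than a comparable VHD.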

 

Figure 7. The real size of the dynamically expanding virtual disks on the Hyper-V host has grown

 

An important point to note here is that fixed VHDX files residing on a thin-provisioned LUN on a storage array do not consume their full size on the array either. Space is consumed only when data is written to them.

Inside the VM, the operating system reports the size of the LUN as assigned, not the real size of the dynamically expanding virtual disk. The free space shown inside the VM is not consumed on the host. This will not cause any issues as long as the assigned space is available when it is needed.

 

Figure 8. The OS reports the size of the LUNs that are assigned and that are free space inside the VM

 

Speaking of virtual disks, I recommend the VHDX format over VHD unless you need compatibility with hosts older than Windows Server 2012, which do not support VHDX. Overall performance is better, and the format provides consistency and protection against power failures. The VHDX format also supports virtual disks of up to 64 TB, while the VHD format is limited to 2 TB. Moreover, shared VHDX lets you share a virtual hard disk among multiple VMs to support a virtualized Windows Server Failover Cluster.

Technology is developing rapidly, and today most vendors are perfectly fine with combining thin provisioning at the physical and virtual layers. Dynamically expanding virtual hard disks and thin-provisioned LUNs on a SAN complement each other and provide many benefits, such as lower storage costs through better resource utilization, flexible storage planning and fast recovery points.

To find out more about thin provisioning, I recommend the whitepaper, Storage Efficiencies with Hyper-V at the Virtual and Physical Layer.

About the author
Cristian-Antonio is a Veeam Specialist certified in Oracle database design and programming with SQL. His background in IT and web development, along with his passion for new technologies, has driven his interest in virtualization and cloud computing. Cristian-Antonio is enthusiastic about sharing his experience and new findings with the IT community.