VMware vSAN addresses the many challenges in modern data centers.
In this blog post, I would like to introduce you to the components under the hood and show you how write and read operations are performed.
vSAN is an object-based storage system that manages or allocates data in the form of individual objects. The various object types that vSAN relies on include the following:
- VM home namespace (VMX, NVRAM, log files)
- VM swap object(s)
- Virtual hard disk(s) (VMDK)
Looking at how vSAN stores this hierarchy of objects brings up another term: component. What is a component? Each object in a vSAN cluster is made up of one or more components. To visualize this, the figure below shows a VM with several objects, each of which consists of several components.
Components can grow up to 255 GB in size. This means that if a virtual machine has a disk (VMDK) of 300 GB, it consists of at least two components.
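The arithmetic behind the 300 GB example can be sketched in a few lines. This is only a lower bound for a single copy of an object; the function name `min_components` is made up for illustration, and the sketch ignores any additional components that a storage policy may create.

```python
import math

MAX_COMPONENT_GB = 255  # maximum component size stated above


def min_components(object_size_gb: float) -> int:
    """Lower bound on the number of components for one copy of an object."""
    if object_size_gb <= 0:
        raise ValueError("object size must be positive")
    return math.ceil(object_size_gb / MAX_COMPONENT_GB)


print(min_components(300))   # 2
print(min_components(255))   # 1
print(min_components(1024))  # 5
```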
The distribution of components across the physical backend storage media can also be controlled in a fine-grained way via storage policies.
A standard vSAN deployment requires at least three hosts in the vSphere cluster. In certain scenarios, typically ROBO (remote office/branch office) environments, two hosts are sufficient, but a so-called witness appliance is then required at a third location. The local storage of each host is organized into disk groups. A disk group must consist of at least 2 disks and can scale up to 8 disks. Each host in the vSAN cluster can have up to 5 disk groups.
The vSAN objects, and the components they consist of, are stored on these disk groups.
For backend communication, each host in the cluster requires a VMkernel interface with the vSAN service enabled.
To meet the latency and bandwidth requirements of vSAN, a 10 Gbit/s network connection between the hosts is strongly recommended. For hybrid vSAN deployments, where disk groups consist of SSDs and HDDs, a 1 Gbit/s connection can also be used.
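Licensing, as is typical with VMware, differentiates between editions such as Standard and Enterprise.
In summary, the minimum requirements are:
- At least 3 hosts (or 2 hosts plus a witness appliance)
- A VMkernel interface with the vSAN service enabled
- A corresponding license
- Network bandwidth of at least 1 Gbit/s; 10 Gbit/s is recommended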
- Within each disk group, you must have at least one SSD for the cache tier and one or more SSD/HDD disk(s) as the capacity tier.
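The requirements listed above can be written down as a simple checklist. The following is a minimal sketch, not a VMware tool or API: the `Host` and `DiskGroup` classes, their field names, and `check_cluster` are all invented for illustration, and the limits are only the ones mentioned in this post.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiskGroup:
    cache_ssds: int      # flash devices in the cache tier
    capacity_disks: int  # SSDs or HDDs in the capacity tier


@dataclass
class Host:
    name: str
    vsan_vmkernel: bool  # VMkernel interface with the vSAN service enabled
    nic_gbits: float     # bandwidth of the vSAN network link
    disk_groups: List[DiskGroup] = field(default_factory=list)


def check_cluster(hosts: List[Host], licensed: bool) -> List[str]:
    """Return the violated requirements (an empty list means all checks pass)."""
    problems = []
    if len(hosts) < 3:
        problems.append("fewer than 3 hosts")
    if not licensed:
        problems.append("no vSAN license")
    for host in hosts:
        if not host.vsan_vmkernel:
            problems.append(f"{host.name}: vSAN VMkernel interface missing")
        if host.nic_gbits < 1:
            problems.append(f"{host.name}: network below 1 Gbit/s")
        if len(host.disk_groups) > 5:
            problems.append(f"{host.name}: more than 5 disk groups")
        for i, dg in enumerate(host.disk_groups):
            total = dg.cache_ssds + dg.capacity_disks
            if dg.cache_ssds < 1 or dg.capacity_disks < 1 or total > 8:
                problems.append(f"{host.name}: disk group {i} is invalid")
    return problems


hosts = [Host(f"esx{n}", True, 10, [DiskGroup(1, 3)]) for n in range(1, 4)]
print(check_cluster(hosts, licensed=True))      # []
print(check_cluster(hosts[:2], licensed=True))  # ['fewer than 3 hosts']
```

The two-host case is reported as a problem here because the sketch does not model the witness appliance.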
Because every disk group needs at least one SSD for the cache tier, while the capacity tier can use either SSDs or HDDs, there are two ways to deploy vSAN:
Hybrid & All-Flash
In the hybrid deployment option, the flash disk in the caching layer is used for both read caching and write buffering. To understand how caching works, you need to know that 70% of the cache capacity is used for read caching and 30% for write buffering. Writing data to the vSAN datastore involves the following steps:
Write Flow in Hybrid Deployment
- The guest OS within the VM issues a SCSI write command.
- The write first goes to the caching layer, and vSAN immediately sends an acknowledgement back to the guest OS. This caching approach is called write-back caching. As a result, the VM experiences flash disk performance when writing to the datastore.
- The last step is a process called destaging. The data in the cache tier must be transferred to the back-end capacity disks to avoid data loss in the event of a power failure or cache device failure. Does this happen immediately? No. vSAN uses its own algorithms to decide when to destage.
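The write steps above can be modeled with a toy write-back buffer. This is a minimal sketch and not how vSAN is implemented: the class `WriteBackBuffer` and its methods are hypothetical, the tiers are plain dictionaries, and destaging is triggered by hand because the post does not describe the algorithms vSAN uses to schedule it.

```python
class WriteBackBuffer:
    """Toy model of write-back caching with deferred destaging."""

    def __init__(self):
        self.buffer = {}    # block address -> data held on the cache device
        self.capacity = {}  # block address -> data on the capacity disks

    def write(self, lba, data):
        # The write lands in the cache tier and is acknowledged right away.
        self.buffer[lba] = data
        return "ACK"

    def destage(self):
        # Later, buffered data is moved to the capacity tier.
        moved = len(self.buffer)
        self.capacity.update(self.buffer)
        self.buffer.clear()
        return moved


disk_group = WriteBackBuffer()
print(disk_group.write(42, b"hello"))  # ACK
print(disk_group.capacity)             # {} (nothing destaged yet)
print(disk_group.destage())            # 1
print(disk_group.capacity)             # {42: b'hello'}
```

The point of the model is the ordering: the acknowledgement is returned before the data reaches the capacity disks.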
Read Flow in Hybrid Deployment
- The guest operating system issues a SCSI read command.
- vSAN first checks the cache tier. If the data is cached (a cache hit), the request is served from the cache and the data is sent to the VM. A high cache hit rate therefore means better read performance.
- If the data is not in the cache (a cache miss), vSAN requests it from the capacity disks.
- vSAN then copies the data from the capacity disk to the cache disk.
- Finally, the data is sent to the VM.
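The hit and miss paths of the hybrid read flow can be illustrated with a small model. This is a sketch under simplifying assumptions: `HybridReadPath` is a hypothetical name, the read cache is an unbounded dictionary, and eviction, which a real cache of limited size needs, is left out.

```python
class HybridReadPath:
    """Toy model of a read cache in front of capacity disks."""

    def __init__(self, capacity_blocks):
        self.capacity = dict(capacity_blocks)  # data on the capacity disks
        self.read_cache = {}                   # data held on the cache device
        self.hits = 0
        self.misses = 0

    def read(self, lba):
        if lba in self.read_cache:
            # Cache hit: serve the request from the cache tier.
            self.hits += 1
            return self.read_cache[lba]
        # Cache miss: fetch from capacity, populate the cache, then return.
        self.misses += 1
        data = self.capacity[lba]
        self.read_cache[lba] = data
        return data

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


path = HybridReadPath({1: b"x", 2: b"y"})
path.read(1)  # miss, block 1 is copied into the cache
path.read(1)  # hit
path.read(2)  # miss
print(path.hits, path.misses)  # 1 2
```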
In an all-flash deployment, both the cache and capacity tiers use flash disks, with one difference from the hybrid option: no space is reserved for a read cache. The entire capacity of the cache device is used for write buffering, while read requests are normally served directly by the capacity disks.
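To compare how the two deployment types use a cache device, here is a small sketch based only on the percentages given in this post (70% read cache and 30% write buffer for hybrid, 100% write buffer for all-flash). The function `cache_split` and its return keys are invented for illustration.

```python
def cache_split(cache_gb: float, deployment: str) -> dict:
    """Split a cache device into read cache and write buffer, in GB."""
    if deployment == "hybrid":
        read_percent = 70
    elif deployment == "all-flash":
        read_percent = 0
    else:
        raise ValueError(f"unknown deployment type: {deployment}")
    read_gb = cache_gb * read_percent / 100
    return {"read_cache_gb": read_gb, "write_buffer_gb": cache_gb - read_gb}


print(cache_split(400, "hybrid"))
# {'read_cache_gb': 280.0, 'write_buffer_gb': 120.0}
print(cache_split(400, "all-flash"))
# {'read_cache_gb': 0.0, 'write_buffer_gb': 400.0}
```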
Write Flow in All-Flash Deployments
- The guest OS issues a SCSI write command.
- The write hits the cache device, and vSAN sends an acknowledgement to the guest OS.
Read Flow in All-Flash Deployments
- The guest OS issues a SCSI read command.
- vSAN looks for the data in the cache. Why, if the caching tier is only used for write buffering? Because the requested data may have been written recently and not yet destaged, in which case it still resides on the cache tier. If so, vSAN serves the request from the cache.
- If the data is not on the cache device, vSAN reads it from the capacity device and sends it to the VM.
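The reason the all-flash read path still checks the cache becomes clearer in a toy model. As before, this is a sketch with hypothetical names (`AllFlashDiskGroup`, `destage`), the tiers are plain dictionaries, and destaging is triggered manually as an assumption for the demo.

```python
class AllFlashDiskGroup:
    """Toy model: cache device as write buffer only, reads from capacity."""

    def __init__(self):
        self.write_buffer = {}  # not-yet-destaged writes on the cache device
        self.capacity = {}      # data on the capacity flash devices

    def write(self, lba, data):
        self.write_buffer[lba] = data
        return "ACK"

    def destage(self):
        self.capacity.update(self.write_buffer)
        self.write_buffer.clear()

    def read(self, lba):
        # Not-yet-destaged data exists only in the write buffer, so look there first.
        if lba in self.write_buffer:
            return self.write_buffer[lba], "cache"
        return self.capacity[lba], "capacity"


dg = AllFlashDiskGroup()
dg.write(7, b"v1")
dg.destage()
dg.write(7, b"v2")  # newer version, not destaged yet
print(dg.read(7))   # (b'v2', 'cache')
dg.destage()
print(dg.read(7))   # (b'v2', 'capacity')
```

Without the buffer check, the first read would return the stale `b"v1"` from the capacity tier.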
In my next posts, I will cover more details of the vSAN architecture. I hope this has been informative for you. If you would like to see more vSAN-related content, feel free to leave a comment.