Build a Microsoft Storage Spaces Direct cluster using VMware virtual machines

I’ve always been a fan of scale-out storage architecture. I’ve always said that the future of storage is scale-out, and I’ve spent a fair amount of time studying software-only solutions like Ceph. The new solution from Microsoft, Storage Spaces Direct, seems like another great technology that will soon be available to us, so I decided to test it in my lab.

Storage Spaces Direct

Storage Spaces Direct is a new shared-nothing scale-out storage solution developed by Microsoft that will soon be available as part of Windows Server 2016. If you want to learn more about it, this one is a really good starting page. To test this solution using Windows Server 2016 Technology Preview 5, I’ve decided to run it entirely nested in my VMware lab. Some configurations and steps are needed to make it run inside virtual machines. The final result is going to be a 4-node cluster, as this is the minimum number of nodes required.

Storage Spaces Direct is commonly shortened to S2D, and this is the name you will see in the rest of this article.

The virtual machines

For my lab, I’ve built 4 different virtual machines, with these hardware specifications:

4 vCPU
System disk 40 GB
Two network connections

In addition to this, I’ve added 4 hard disks. They are all connected to a dedicated SCSI controller, that is configured with Physical SCSI Bus Sharing:

SCSI controller set to Physical SCSI Bus sharing

This is paramount to guarantee the correct identification of the disks by the Storage Spaces wizards.
The 4 disks connected to this controller must be created as Thick Provision Eager Zeroed, otherwise Physical SCSI Bus Sharing cannot be used. So, be careful about storage consumption, as these disks are fully allocated from the beginning.

Then, I needed to configure the 30 GB disk as an SSD. On a virtual machine, this can be done by adding a parameter for that disk to the VMX configuration file. With this parameter, the guest OS recognizes the disk as an SSD and can use it later as the caching tier for Storage Spaces Direct. Windows 2016 also properly identifies the other disks.
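As a sketch of what such a VMX entry looks like: the `virtualSSD` flag is set per virtual disk and is keyed by SCSI controller and slot. The `scsi1:0` address below is an assumption (the first disk on the second SCSI controller); it has to match the address of the disk you want to flag, and the VM should be powered off while the file is edited.

```text
scsi1:0.virtualSSD = "1"
```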

Without Physical SCSI Bus Sharing, the validation of a new node in the cluster reports an error for each disk in the Storage Spaces Direct configuration.

By using the proper bus sharing, the disks are correctly identified:

Disks are correctly identified

The difference can also be seen using PowerShell. With regular disks and no bus sharing, this is the output:


With thick disks and Physical Bus Sharing, this is the output:
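A query along these lines is one way to compare the two cases. The choice of columns is an assumption and not necessarily the exact command used here; all of them are standard properties returned by `Get-PhysicalDisk`.

```powershell
# List every physical disk the node can see, with the properties that
# matter for Storage Spaces: media type, bus type and pool eligibility.
Get-PhysicalDisk |
    Sort-Object FriendlyName |
    Format-Table FriendlyName, SerialNumber, MediaType, BusType, CanPool, CannotPoolReason, Size -AutoSize
```

`CannotPoolReason` is the column to look at when a disk shows `CanPool` as False.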

Windows 2016 TP5 is installed on all four nodes, and they are all joined to my domain. There are two networks on each node, and the final configuration is like this:

We are now ready to build the cluster.

Build the cluster using PowerShell

To speed things up and get a consistent result, I’ve decided to build the new cluster using PowerShell. Also, as you will see later, some steps need specific options that may not be available via the graphical interface.

First, on each of the four nodes we install the needed components:
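A minimal sketch of this step, assuming PowerShell remoting is available between the nodes. The node names are partly an assumption: only ssd2 and ssd3 are named in this post, so ssd1 and ssd4 are inferred.

```powershell
# Assumed node names; adjust to your own lab.
$nodes = "ssd1", "ssd2", "ssd3", "ssd4"

# Install the file server and failover clustering features,
# together with their management tools, on every node.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Install-WindowsFeature -Name File-Services, Failover-Clustering -IncludeManagementTools
}
```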

Then, we create the new cluster:
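A sketch of the cluster creation under a few assumptions: the cluster name `S2D`, the node names and the static address are all placeholders chosen for illustration (the address comes from a documentation-only range).

```powershell
# Assumed node names; adjust to your own lab.
$nodes = "ssd1", "ssd2", "ssd3", "ssd4"

# -NoStorage prevents the cluster from claiming any disk as a classic
# cluster disk, since the disks are meant for the S2D pool built later.
# 192.0.2.50 is a hypothetical address: replace it with a free one in your network.
New-Cluster -Name S2D -Node $nodes -NoStorage -StaticAddress 192.0.2.50
```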

If we then validate the cluster, either via the graphical Failover Cluster Manager or again from PowerShell with the Test-Cluster cmdlet, we will notice that the “Storage Spaces Direct” section has a result of Failed. The reason is this one:

Virtualized HDD are recognized as Unspecified
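For reference, the validation mentioned above can be started from PowerShell roughly like this. The node names are assumptions, and the list of test categories is one reasonable selection rather than a required set.

```powershell
# Assumed node names; adjust to your own lab.
$nodes = "ssd1", "ssd2", "ssd3", "ssd4"

# Run only the validation categories relevant to an S2D cluster.
Test-Cluster -Node $nodes -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"
```

The cmdlet writes an HTML report whose path is shown at the end of the run; the failed Storage Spaces Direct section is listed there.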

But as I said, there is a workaround.

As the next step, we check the cluster networks, and we configure the two available networks so that one serves clients (frontend) and the other carries cluster internal communications (backend):

Cluster networks
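The same configuration can be sketched in PowerShell. The network names below are the defaults a new cluster usually assigns, and which of the two is the frontend is an assumption; check the output of the first command before changing anything.

```powershell
# Show the networks the cluster has discovered, with their current role.
Get-ClusterNetwork | Format-Table Name, Address, Role

# Role 3 = cluster and client traffic, Role 1 = cluster traffic only.
(Get-ClusterNetwork -Name "Cluster Network 1").Role = 3   # frontend (assumed)
(Get-ClusterNetwork -Name "Cluster Network 2").Role = 1   # backend (assumed)
```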

The validation tests added in TP5 run some SCSI commands that fail on a virtual disk. We can work around this by turning off automatic configuration and skipping the eligibility checks when enabling S2D, and then manually creating the storage pool and storage tiers afterwards:
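With the TP5 syntax, the command looks like the one below. Note that `-CacheMode` seems to be specific to the preview builds: as reported in the comments to this post, the released version of Windows Server 2016 rejects it, and the closest equivalent there appears to be `-CacheState`, so treat this as a sketch tied to TP5.

```powershell
# -AutoConfig:0 stops the cmdlet from building the pool and tiers on its own.
# -SkipEligibilityChecks bypasses the disk checks that fail on virtual disks.
# -CacheMode Disabled sets the cache mode to Disabled (TP5 parameter).
Enable-ClusterS2D -CacheMode Disabled -AutoConfig:0 -SkipEligibilityChecks
```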

Then, we create a new storage pool:
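A sketch of the pool creation, assuming the clustered storage subsystem can be matched with the `*Cluster*` wildcard and that every disk still eligible for pooling should go into the pool. The pool name `S2D` matches the name used later in this post.

```powershell
# Collect every disk in the clustered subsystem that can still be pooled.
$disks = Get-StorageSubSystem -FriendlyName "*Cluster*" |
    Get-PhysicalDisk |
    Where-Object CanPool -eq $true

# Create the pool with fixed provisioning on top of those disks.
New-StoragePool -StorageSubSystemFriendlyName "*Cluster*" `
    -FriendlyName S2D `
    -ProvisioningTypeDefault Fixed `
    -PhysicalDisks $disks
```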

We then configure all the virtual disks so that they appear as proper HDDs since, as seen in the error above, Storage Spaces Direct accepts only SSD and HDD. Before the change, we have this situation:

With this command we force these disks to be marked as HDD:
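A command of this kind does the job; the `*Cluster*` wildcard used to find the clustered storage subsystem is an assumption about how it is named in your environment.

```powershell
# Take every disk in the clustered subsystem whose media type was not
# detected and mark it explicitly as HDD.
Get-StorageSubSystem -FriendlyName "*Cluster*" |
    Get-PhysicalDisk |
    Where-Object MediaType -eq "Unspecified" |
    Set-PhysicalDisk -MediaType HDD
```

Disks already reported as SSD are left untouched, because the filter only matches the Unspecified ones.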

If we check again the available disks, we have this new situation:
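One way to run this check only against the disks that ended up in the pool, assuming the pool is named `S2D`:

```powershell
# List the pool members sorted by media type, to confirm that no disk
# is still reported as Unspecified.
Get-StoragePool -FriendlyName S2D |
    Get-PhysicalDisk |
    Sort-Object MediaType |
    Format-Table FriendlyName, MediaType, Usage, Size -AutoSize
```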

The final result is the pool correctly created and ready to be consumed:

The new storage pool

Virtual disks and volumes

Now that the cluster is created, it’s time to create our first volume and use it. For this part of the post, I’ll go back to the graphical interface, so I can briefly explain the different options available. To start, with the pool selected, we launch the wizard to create a New Virtual Disk. After selecting S2D as the storage pool to be used, we give the virtual disk a name and choose to use tiers:

Enable tiers for the new virtual disk

We accept enclosure awareness, and we configure the storage layout as follows: mirror for the Faster Tier and parity for the Standard Tier. For the resiliency settings, we choose two-way mirror for the Faster Tier and single parity for the Standard Tier.

Then, we set the Faster Tier size to 50 GB and the Standard Tier size to 500 GB, and we disable the Read Cache. We confirm all the selections and the disk is created. Before closing the wizard, we select the option to immediately create a volume: the file system will be ReFS and it will use the entire size of the virtual disk:

Virtual disk and volume are created

As the last step of this part, we select the virtual disk and use the command “Add to Cluster Shared Volumes”.
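The wizard steps above have a rough PowerShell equivalent. This is an untested sketch, not the procedure followed in this post: the tier names `Performance` and `Capacity` and the volume name `Volume1` are assumptions, and the Read Cache setting from the wizard is not covered.

```powershell
# One tier per media type: two-way mirror on SSD, single parity on HDD.
# -PhysicalDiskRedundancy 1 means one disk failure is tolerated.
New-StorageTier -StoragePoolFriendlyName S2D -FriendlyName Performance `
    -MediaType SSD -ResiliencySettingName Mirror -PhysicalDiskRedundancy 1
New-StorageTier -StoragePoolFriendlyName S2D -FriendlyName Capacity `
    -MediaType HDD -ResiliencySettingName Parity -PhysicalDiskRedundancy 1

# CSVFS_ReFS formats the volume with ReFS and adds it to
# Cluster Shared Volumes in one step.
New-Volume -StoragePoolFriendlyName S2D -FriendlyName Volume1 `
    -FileSystem CSVFS_ReFS `
    -StorageTierFriendlyNames Performance, Capacity `
    -StorageTierSizes 50GB, 500GB
```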

File server and shares

Now, as we want to end the test with a working file share where we can drop our files, we need to create a role in the cluster, in this case a File Server. A simple PowerShell one-liner is all we need:
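One cmdlet that creates a clustered file server role in a single line is shown below. It is an assumption that this matches the one-liner originally used here; the name `SOFS` is taken from the UNC path used later in this post.

```powershell
# Create a Scale-Out File Server role named SOFS on the current cluster.
Add-ClusterScaleOutFileServerRole -Name SOFS
```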

Then, we create the share. On each node of the cluster, there is a mount point for the newly created volume, in C:\ClusterStorage\Volume1. We will use this location to create our new share:
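A sketch of the share creation. The share name `Repository` and the scope name `SOFS` come from the UNC path used in this post; the folder name and the account granted full access are placeholders.

```powershell
# Folder name is an assumption; the volume path comes from the text above.
$path = "C:\ClusterStorage\Volume1\Repository"
New-Item -Path $path -ItemType Directory

# -ScopeName ties the share to the clustered file server name.
# "LAB\Domain Admins" is a hypothetical account: use one from your domain.
New-SmbShare -Name Repository -Path $path -ScopeName SOFS -FullAccess "LAB\Domain Admins"
```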

The share can be reached now over the network using the UNC path \\SOFS\Repository, and we can read and write data to it.

To test the resiliency of S2D, I ran a simple test. I started to copy some large ISO files to the share, and while the copy was running I powered off the node ssd3, at that time the owner of the File Server role, directly from vSphere. The role immediately moved to ssd2, and the file copy went on without any interruption.
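To follow the failover from one of the surviving nodes, the owner of the role can be queried before and after powering off the node. The group name `SOFS` is an assumption, based on the name of the file server role.

```powershell
# Show which node currently owns the file server role and its state.
Get-ClusterGroup -Name SOFS | Format-Table Name, OwnerNode, State
```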

Final notes

I’ve really enjoyed the time I’ve spent playing with Storage Spaces Direct. The issues I had to solve to make it work in a virtualized environment are not important, as in a production environment I expect people to use physical servers among those listed in the hardware compatibility list that Microsoft is preparing. The configuration of the solution is really simple, and the failover capabilities are really reliable. When Windows Server 2016 becomes generally available later this year, I expect many IT admins to start considering it as a new option for scale-out storage, especially in situations where SMB3 is the needed protocol.

10 thoughts on “Build a Microsoft Storage Spaces Direct cluster using VMware virtual machines”

  1. is that 4 disks per node or 4 disks shared across the 4 nodes?

    • It’s 4 disks per node, as you can see from the virtual hardware of each node. There’s no shared disk in this kind of design.

      • Thx Luca for the response. Now I am getting this: it doesn’t know the -CacheMode parameter. Have you seen this before?

        Enable-ClusterStorageSpacesDirect : A parameter cannot be found that matches parameter name ‘CacheMode’.
        At line:1 char:19
        + Enable-ClusterS2D -CacheMode Disabled -AutoConfig:0 -SkipEligibilityC …
        + ~~~~~~~~~~
        + CategoryInfo : InvalidArgument: (:) [Enable-ClusterStorageSpacesDirect], ParameterBindingException
        + FullyQualifiedErrorId : NamedParameterNotFound,Enable-ClusterStorageSpacesDirect

        • Actually, I found you don’t need to use the -CacheMode parameter with the most recent Windows 2016 version. They say it is bundled into the Enable-ClusterS2D command, so I just ran it with the autoconfig and skip check options. I just have an issue now: it says “No disks found to be used for cache”.

          • It may be. My tests were done with TP5 and I never updated them with the 2016 GA version, so some commands may indeed have changed. Thanks for pointing this out.

          • Luca – looks like I can’t get past that “no disks found to be used for cache”. It seems I do need to disable cache mode, but from what I read the updated Enable-ClusterS2D cmdlet no longer accepts -CacheMode for that. Any ideas?

          • Luca,

            One more question. In Failover Cluster Manager, I can see the physical disks as SSDs in the physical disks tab when I select the Nodes section. But it won’t find any disks when I try to add disks from the Storage folder:

            “no disks suitable for cluster disks were found. run validation wizard again”.