An introduction to Microsoft Scale Out File Server


Last week I attended the E2EVC Conference in Brussels. It’s an independent conference about virtualization, filled with technical sessions covering many different technologies and platforms: there were some VMware sessions, but the majority were about Citrix and Microsoft. It was a great opportunity for me, as a “VMware guy”, to learn about “the other sides” of the virtualization world. Among the available sessions, Microsoft Scale-Out File Server (SOFS) seemed from the beginning like an interesting topic, and maybe a good fit for some environments, not only those related to Hyper-V. So I went to listen to two sessions; the speakers at the conference were really good, especially Aidan Finn, and I was finally able to gain a better understanding of SOFS. I highly recommend this conference to all of you; they run many events per year all around Europe, so check their website and look for the next one near you.

How SOFS works

Aidan drew a really clear and simple scheme to explain the design. Instead of posting an ugly photo of it, however, I searched the Internet for a better picture and found this:

Microsoft SOFS

Thanks to this, after months of reading around without success, I was finally able to understand what SOFS is and how it works.

The basic concept is really interesting: a set of front-end nodes running Windows Server 2012 R2 (up to 8 nodes are supported) each have multiple SAS connectors, thanks to the installation of several SAS cards. Through them, each node connects simultaneously to all the available JBOD storage. The design can also be realized with iSCSI or FC connections, especially if someone wants to place a SOFS cluster in front of an old storage array. Either way, the final result is a “full mesh” where every component, whether it is a front-end server or a back-end JBOD, is connected to every other one.
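To make the building blocks concrete, here is a minimal PowerShell sketch of how such a cluster is assembled. The node names, cluster name, and IP address are hypothetical, and the commands assume the file server and failover clustering features are already installed on each node:

```powershell
# Hypothetical node names, cluster name, and IP address -- adjust for your environment.
# Build the failover cluster from the front-end Windows Server 2012 R2 nodes
New-Cluster -Name "SOFS-CLU" -Node "FS01","FS02","FS03" -StaticAddress 10.0.0.10

# Add the Scale-Out File Server role on top of the cluster; its name becomes
# the single access point SMB 3.0 clients use, regardless of which node serves them
Add-ClusterScaleOutFileServerRole -Cluster "SOFS-CLU" -Name "SOFS"
```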

When I looked at this design, all of a sudden I realized I had already seen it!

3PAR Active Mesh

Yes, this is the high-level architecture of an HP 3PAR Active Mesh. Again, all nodes are active at the same time, and all of them are always connected to all the other nodes and to ALL the disks available in the cluster. This is, at a high level, the same design you can create with Microsoft SOFS. The nice addition of SOFS is that it can be built using standard servers and Microsoft software.

Don’t call it scale-out!

The name “scale-out” here is what creates confusion. To me, it does not really fit the definition of a scale-out storage architecture, the one I explained, for example, in a post I wrote some months ago. And this was the reason why I was previously unable to understand how SOFS works: one of the major scale-out concepts is share-nothing: each component is an independent element that does not share anything with the others; rather, it replicates data and metadata to the other nodes to guarantee data redundancy and performance, and to confine the failure domain to a single node. The design of SOFS is different: as I showed before, it is an active mesh more than a scale-out system.

Nonetheless, it has some really interesting features, and as I said, it can be a great solution for any SMB 3.0-based storage: certainly for any Hyper-V cluster, but also as a file server with some “scale-out” capability.
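As a sketch of the file-server use case, a continuously available SMB 3.0 share is created on a Cluster Shared Volume roughly like this (the path and the access group are made up for illustration):

```powershell
# Hypothetical path and group -- shares on a SOFS live on Cluster Shared Volumes
New-Item -ItemType Directory -Path "C:\ClusterStorage\Volume1\VMs"

# A continuously available share stays online across node failures
New-SmbShare -Name "VMs" -Path "C:\ClusterStorage\Volume1\VMs" `
    -FullAccess "CONTOSO\HyperV-Hosts" -ContinuouslyAvailable $true

# Copy the share permissions down to the NTFS ACL on the folder
Set-SmbPathAcl -ShareName "VMs"
```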

For example, you can mix SSDs and HDDs and enable tiering. Tiering is really granular, since it is based on 1 MB blocks, although as of today it is not real-time: it can only run on a schedule. I expect a proper real-time solution in the future. Another feature is the use of part of the SSDs (if present) as a write cache to speed up the storage. I don’t trust the published performance numbers, since as far as I understood they were produced with unrealistic I/O patterns like 4k blocks and 100% reads, but they nonetheless prove the solution can compete with other well-known vendors.
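To give an idea of how this is configured, here is a hedged PowerShell sketch of a tiered virtual disk with an SSD write-back cache. The pool name, tier names, disk name, sizes, and drive letter are all assumptions, not values from a real deployment:

```powershell
# Hypothetical pool, tier, and disk names and sizes -- a sketch, not a tested deployment
$pool = "SOFS-Pool"
$ssdTier = New-StorageTier -StoragePoolFriendlyName $pool -FriendlyName "SSD-Tier" -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName $pool -FriendlyName "HDD-Tier" -MediaType HDD

# Tiered, mirrored virtual disk with a 1 GB SSD write-back cache
New-VirtualDisk -StoragePoolFriendlyName $pool -FriendlyName "VMs01" `
    -ResiliencySettingName Mirror `
    -StorageTiers $ssdTier,$hddTier -StorageTierSizes 100GB,900GB `
    -WriteCacheSize 1GB

# Tier optimization normally runs as a scheduled task,
# but it can also be triggered manually on a volume:
Optimize-Volume -DriveLetter E -TierOptimize
```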

An interesting part of the architecture is the internal data protection: there is no RAID, and all disks are directly managed by the system. Using Storage Spaces (which actually does NOT want any RAID underneath to work properly), every piece of data is replicated 2 or 3 times across the storage (it’s configurable; there is also a parity configuration, similar to RAID5, but it is only suggested for archival). This obviously reminds me of other solutions using an object store in the backend: no RAID, data split into chunks, and the chunks replicated multiple times across the array to guarantee availability and data protection. Storage Spaces has a modern design for sure.
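In PowerShell terms, the pool and the number of data copies are expressed roughly like this (the pool name, disk name, size, and subsystem name pattern are hypothetical):

```powershell
# Hypothetical names and size. Pool every JBOD disk that is eligible for pooling
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "SOFS-Pool" `
    -StorageSubSystemFriendlyName "Clustered Storage Spaces*" `
    -PhysicalDisks $disks

# Three-way mirror: each chunk of data is kept on three different disks
New-VirtualDisk -StoragePoolFriendlyName "SOFS-Pool" -FriendlyName "Data01" `
    -ResiliencySettingName Mirror -NumberOfDataCopies 3 -Size 500GB

# -ResiliencySettingName Parity would give the RAID5-like layout instead,
# suggested mainly for archival workloads
```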

Another awesome feature is the automatic rebalancing of clients. SMB client connections are tracked per file share, and clients are then redirected to the cluster node with the best access to the volume used by the file share. This improves efficiency by reducing redirection traffic between file server nodes. Clients are redirected following an initial connection and when cluster storage is reconfigured. Think about a large infrastructure with many users connecting to file shares, and this technology can really become a killer feature (or a huge problem for vendors of NAS filers…).
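This tracking can be observed, and a redirection even forced by hand, through the SMB witness cmdlets; the client and node names below are hypothetical:

```powershell
# List the SMB clients currently tracked by the witness service
Get-SmbWitnessClient

# Manually redirect one tracked client to another cluster node
# (normally the automatic rebalancing takes care of this)
Move-SmbWitnessClient -ClientName "HV01" -DestinationNode "FS02"
```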

Final notes

Even if at first I was a little upset by the totally wrong name of the product, SOFS is a nice solution. By using commodity hardware and an operating system that many people already know how to use, building a cheap and fast storage array can become a really easy task. This is probably, together with some features of the solution, the biggest selling point for Microsoft: it only takes a few clicks in the interface or some easy PowerShell scripts (widely available on the Internet, by the way) to configure and manage the cluster; when I compare it with some competitors using complex Linux command-line tools to do the same, I see a real opportunity for Microsoft to enter this market.

The product is young and it lacks some important features (real-time tiering, a proper scale-out architecture…), but Microsoft has entered the storage market with a good solution that, thanks to the power of its vendor, can really become a serious contender. We will see in the coming months and years what happens.