Ever since it came out of stealth mode in 2009, I have looked with interest at SolidFire, because they offer a truly unique storage product, quite different from every other vendor's, and certainly interesting for those who, like me, work at service providers. A few weeks ago I was able to sit down for a nice chat with some of the SolidFire folks and finally deep dive into their product and technology.
Born for Service Providers
The SolidFire story starts with Dave Wright, who years earlier founded Jungle Disk, a sync & share service I used myself for several years, acquired by Rackspace in October 2008. Working inside a huge service provider like Rackspace, he had the opportunity to see how storage was used and, above all, what was missing: a storage platform specifically designed for service providers.
SolidFire was born with this exact focus: it is a scale-out storage system, specifically designed for service providers.
At its base it is block storage, even if the way it works is reminiscent of object storage. Each block is saved twice in different parts of the cluster, using a protection scheme called Double Helix, and there is a complete separation between data and the metadata describing it. Starting from a minimum cluster of 5 nodes, the architecture scales linearly to impressive figures in terms of disk space and IOPS: 3.4 PB and 7.5 million IOPS for the biggest model, the SF9010, in a 100-node cluster. Each node is completely independent from the others, following another object storage design principle: “shared nothing”.
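To make the idea more concrete, here is a minimal conceptual sketch of content-addressed placement with two copies on distinct nodes. This is my own illustration of the principle, not SolidFire code; the node names and hashing scheme are invented for the example:

```python
import hashlib

NODES = [f"node-{i}" for i in range(5)]  # minimum cluster size is 5 nodes

def place_block(data: bytes) -> dict:
    """Hash the block and pick two *different* nodes for the two copies.

    The content hash doubles as the block ID (which is also what enables
    deduplication later on), while the metadata layer stores only pointers,
    never the data itself.
    """
    block_id = hashlib.sha256(data).hexdigest()
    primary = int(block_id, 16) % len(NODES)
    # The offset is always between 1 and len(NODES)-1, so the replica
    # can never land on the same node as the primary copy.
    offset = 1 + int(block_id[:8], 16) % (len(NODES) - 1)
    replica = (primary + offset) % len(NODES)
    return {"block_id": block_id, "copies": [NODES[primary], NODES[replica]]}

meta = place_block(b"4 KB of VM data...")
assert meta["copies"][0] != meta["copies"][1]
print(meta)
```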
Each node is a simple Dell 1U server, loaded with 10 SSDs and two 10 Gb Ethernet connections. By choosing Dell, SolidFire removed any problem related to hardware support and was able to focus on the core of its solution, which (as usual) lies in the software. On those servers they install their own operating system, called Element OS, which runs all their software. Finally, the cluster exposes LUNs to servers via the iSCSI protocol.
There are at the moment three models, called SF3010, SF6010 and SF9010. They are basically the same, but each has a different SSD size, 300, 600 or 960 GB respectively; the SF9010 also has a performance improvement over the other two. Today it is not possible to mix different models in the same cluster, but this capability should arrive later this year. It is a nice-to-have addition, because it protects a customer's investment in those cases where you started with the smaller model and now want to add one of the bigger ones.
It’s all about OPEX…
There is no doubt that a service provider, while offering services to its customers, has to be really focused on cost optimization. SolidFire, as said, has been designed precisely for those customers, and many of its optimizations are conceived for exactly those use cases.
First of all, the huge amount of I/O offered by SSDs is not used as a mere performance benefit. Each node does not have particularly impressive IOPS figures: the numbers are between 50,000 and 75,000 IOPS. With 10 SSDs, this honestly sounds like a modest value. However, SSD performance is used to guarantee predictable and steady IOPS and latency in any situation, even when a component fails.
Economic efficiency is then guaranteed by a full set of data reduction technologies. We all know SSDs have a much higher price per GB than mechanical disks, and that extending their life requires heavily optimized write activity. First there is deduplication: each block is written only once, and metadata takes care of pointing to it every time a virtual machine uses that block, so on disk there is only one copy (plus its replica). Then there is compression, and finally all LUNs are thin provisioned.
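As a rough illustration of how the three techniques combine on the write path, here is a short sketch. It is only a conceptual model of dedup + compression + thin provisioning, not how Element OS is actually implemented:

```python
import hashlib
import zlib

block_store = {}  # block_id -> compressed bytes (the single physical copy)
lun_map = {}      # (lun, lba) -> block_id; a missing key means "never written",
                  # which is all thin provisioning really is at this level

def write_block(lun: str, lba: int, data: bytes) -> None:
    """Deduplicate by content hash, compress only new blocks, then map the LBA."""
    block_id = hashlib.sha256(data).hexdigest()
    if block_id not in block_store:       # dedup: store each unique block once
        block_store[block_id] = zlib.compress(data)
    lun_map[(lun, lba)] = block_id        # metadata points at the shared copy

# Two VMs cloned from the same template write identical blocks:
write_block("vm1-lun", 0, b"\x00" * 4096)
write_block("vm2-lun", 0, b"\x00" * 4096)
print(len(block_store))  # 1 -> only one physical (compressed) copy on flash
```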
SolidFire claims each of these optimizations offers a 2:1 saving, and that this is a conservative figure. Multiplying the three 2:1 factors gives 8:1, which the second Double Helix copy then halves, so the net saving is 4:1 compared to raw space; there are situations where this value is much higher, for example vCloud environments relying heavily on templates and fast provisioning.
Another saving comes from the smaller footprint in the datacenter: a single SF3010 node has 3 TB raw, which becomes 12 TB once optimized; the SF9010 offers up to 38 TB. This means a 42U rack completely filled with SolidFire (supposing you can cool everything at this density and, above all, have enough power per rack; SolidFire told me a fully populated 40-node cluster draws 12 kW for the 3010/6010 models and 18 kW for the 9010) gives you between 500 TB and 1,600 TB of disk space. These are values no solution based on mechanical disks can match in the same number of rack units. Also, SSDs make the storage highly efficient from a thermal point of view, so they could offer further savings on cooling costs.
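The arithmetic behind those rack figures is easy to verify, assuming the 4:1 effective ratio above and a 42U rack fully packed with 1U nodes:

```python
# Effective capacity of a 42U rack of 1U nodes at the claimed 4:1 reduction
RATIO = 4
NODES_PER_RACK = 42

for model, raw_tb in {"SF3010": 3.0, "SF6010": 6.0, "SF9010": 9.6}.items():
    effective_tb = raw_tb * RATIO  # e.g. SF3010: 3 TB raw -> 12 TB effective
    print(f"{model}: {effective_tb:5.1f} TB/node, "
          f"{effective_tb * NODES_PER_RACK:6.0f} TB per rack")

# SF3010:  12.0 TB/node,    504 TB per rack  (~ the quoted 500 TB)
# SF9010:  38.4 TB/node,   1613 TB per rack  (~ the quoted 1600 TB)
```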
I have only one doubt, and it is about connectivity costs: two 10G connections per 1U node are a lot, and 10G ports are not cheap. For sure, the choice of such a small fault domain helps scale-out performance, but on the other hand the price of connecting all those Ethernet cards is an issue. Probably, even if the acquisition cost is higher, the best choice is to start directly with the bigger SF9010 model, since it offers the highest density for the same rack usage and number of Ethernet ports.
In the end, as you can see, these are not the kinds of considerations that usually come up with “traditional” storage systems, but for a service provider they are really important.
A quick look at prices. There is no public list price, but SolidFire's general statement is “less than 3 USD per GB”. Based on this value, I did a quick simulation that SolidFire itself confirmed. The smallest cluster you can start with is made of 5 SF3010 nodes. With 10 × 300 GB SSDs each, the total raw space is 15 TB. Applying the 4:1 savings ratio described before, the total usable space is 60 TB. At 3 USD per GB, the total acquisition cost is then around 180,000 USD. Real prices may differ, and they are lower if you start with bigger configurations.
In OPEX terms, over a 3-year life cycle, the price per GB per month comes to 0.0833 USD (8.33 cents). In comparison, AWS offers “Provisioned IOPS” volumes at 0.125 USD per GB per month; what's more, those are limited to 4,000 max IOPS and have no QoS or minimum guaranteed value.
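For completeness, the whole simulation fits in a few lines of arithmetic (using the quoted 3 USD/GB and the 4:1 ratio):

```python
# Entry-level cluster simulation from the figures above
nodes = 5
raw_gb = nodes * 10 * 300                  # 5 nodes x 10 SSDs x 300 GB = 15,000 GB
usable_gb = raw_gb * 4                     # 4:1 effective ratio -> 60,000 GB
capex_usd = usable_gb * 3                  # "less than 3 USD per GB" -> ~180,000 USD
per_gb_month = capex_usd / usable_gb / 36  # 3-year life cycle

print(f"raw: {raw_gb / 1000} TB, usable: {usable_gb / 1000} TB")
print(f"CAPEX: {capex_usd:,} USD -> {per_gb_month:.4f} USD per GB per month")
# Prints 0.0833 USD, against AWS Provisioned IOPS at 0.125 USD per GB per month
```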
QoS: an architecture, not a feature
There is no doubt that the main feature of the SolidFire solution is QoS (Quality of Service): unlike other storage systems, which have only some IOPS management features, this solution was designed from the beginning with QoS in mind. The starting idea was to remove one of the main obstacles to bringing enterprise workloads into “cloud computing”: the lack of guaranteed storage performance.
QoS settings are applied at the LUN level, and any LUN can have its own profile. Besides the usual size, a policy can be configured with:
– minimum guaranteed IOPS
– maximum allowed IOPS
– burst IOPS
This last parameter is interesting: if a customer buys (as in the screenshot) a LUN with 11,000 max IOPS but uses much less for a long time, the LUN gains “credits” that can then be “spent” to sustain a burst above the maximum for longer.
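Conceptually this behaves like a credit (token-bucket) scheme. Here is a minimal toy model of the behavior; it is my own illustration of the idea, with made-up numbers, not SolidFire's actual algorithm:

```python
class BurstingLimiter:
    """Toy min/max/burst QoS model: unused headroom below max_iops accrues
    as credits, which let the volume exceed max_iops (up to burst_iops)
    until the credits run out."""

    def __init__(self, max_iops: int, burst_iops: int, max_credits: int):
        self.max_iops = max_iops
        self.burst_iops = burst_iops
        self.max_credits = max_credits
        self.credits = 0

    def allowed_this_second(self, requested_iops: int) -> int:
        if requested_iops <= self.max_iops:
            # Quiet period: bank the unused headroom as burst credits.
            self.credits = min(self.max_credits,
                               self.credits + self.max_iops - requested_iops)
            return requested_iops
        # Bursting: spend credits to go above max, but never above burst.
        granted = min(requested_iops, self.burst_iops,
                      self.max_iops + self.credits)
        self.credits -= granted - self.max_iops
        return granted

lun = BurstingLimiter(max_iops=11_000, burst_iops=15_000, max_credits=60_000)
for _ in range(30):                      # 30 quiet seconds at 5,000 IOPS...
    lun.allowed_this_second(5_000)
print(lun.allowed_this_second(15_000))   # ...then a burst above max is granted
```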
In my experience, this is the only existing storage solution that offers this kind of configuration. Many others have a configurable upper limit and some sort of priority among LUNs, but none of them has this degree of control. For a service provider, these features make it possible to protect customers from the so-called “noisy neighbors” thanks to max IOPS, but above all to “sell” different storage tiers based on min IOPS. If you think about a solution based on VMware vCloud, you can create different storage profiles, each configured with different min IOPS, and sell these storage tiers to customers at different prices. And all of this without the need to create and manage different storage “silos”, instead gaining economic benefits from one large shared storage pool.
As of today, QoS granularity stops at the single LUN, so there is still a potential noisy-neighbor problem among different VMs running in the same LUN. If you are using VMware vCloud under a VSPP agreement, you are probably using Enterprise Plus licenses, so you can leverage SIOC (Storage I/O Control) and run it alongside SolidFire QoS.
As for other VMware integrations, the storage currently supports some VAAI primitives, but it does not have, for example, a proprietary SATP (Storage Array Type Plugin); instead it relies on the native ESXi Round Robin path selection.
SRM, as of today, is not supported. SolidFire is planning to introduce asynchronous replication, which will be the basis for future SRM support.
Management? API!
The foundation of every management option they offer is the API. It is a wise choice if you keep in mind that the ideal customer is a service provider: providers want to automate whatever they can, and once the number of managed objects becomes considerable, they cannot afford to waste hours interacting with a GUI; an API is a much better fit for automation. On top of the APIs, they “eat their own dog food”: both the web interface and the vCenter plugin talk to the one “real” interface, the RESTful API.
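To give an idea of what provisioning through that interface looks like, here is a hedged sketch of creating a LUN with its QoS profile in a single JSON call over HTTPS. The endpoint, method and parameter names reflect my understanding of the Element API and should be treated as illustrative, not authoritative:

```python
import requests

# Hypothetical management virtual IP of the cluster
ENDPOINT = "https://mvip.example.com/json-rpc/5.0"

# One call creates the volume and attaches the three QoS settings
# described above: minimum, maximum and burst IOPS.
payload = {
    "method": "CreateVolume",
    "params": {
        "name": "customer42-gold",
        "accountID": 42,
        "totalSize": 100 * 1024**3,  # 100 GiB, expressed in bytes
        "enable512e": True,
        "qos": {"minIOPS": 3_000, "maxIOPS": 11_000, "burstIOPS": 15_000},
    },
    "id": 1,
}
resp = requests.post(ENDPOINT, json=payload,
                     auth=("admin", "password"),  # cluster admin credentials
                     verify=False)                # lab only: self-signed cert
print(resp.json())
```

This is the pattern that makes it easy to script, say, the creation of an entire tier of volumes behind a vCloud storage profile.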
Final notes
SolidFire is a really interesting storage solution, highly focused on service providers, offering them guaranteed, measurable (and billable!) performance. They already have several customers, some publicly listed on their website, and some enterprise customers as well.
Looking ahead, some limits, like QoS applied only at the LUN level, could be removed in favor of finer granularity, thanks mainly to VMware and its vVols technology, once it becomes available. With vVols you will be able to manage individual VMs on block storage rather than whole LUNs, just as you already can with NFS. SolidFire has published a demo video showing the integration of its storage with vVols. Have you spotted ESXi 6.0 in the video? 🙂