In a previous post, I talked about the evolution of the Flash memory market, and how some software solutions are starting to change the way we consume storage. Whenever a new hardware technology comes to market, the previous ones become commodities (think SSDs), but software always has the advantage of being able to leverage any improvement in the underlying hardware, and it often reinvents itself along the way. Lately, the common idea in at least two solutions I’ve seen is the new storage tier they are offering us: your servers’ memory.
This time, software solutions are not using a new component, but one that has been inside your servers for years: the memory. What is different from before? Two things, mainly. First, memory is becoming a cheap component. Not as cheap as disks, obviously, but its price per I/O is incredible. Nothing is as fast as memory, not even Flash DIMMs or Flash PCIe devices. Second, thanks to the latest CPU technologies, software storage solutions can now optimize the usage of memory to make it even cheaper: compression and deduplication have an impact on performance when they are applied to slow storage devices like spinning disks, but memory is so fast that these optimizations can be done in real time with almost no impact. The final result will probably be slower than an unmanaged use of memory alone, but still faster than flash.
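To make the compression and deduplication idea concrete, here is a minimal Python sketch of an in-memory block cache that stores each unique block only once, and stores it compressed. It is only an illustration under my own assumptions: the class name, the use of SHA-256 for fingerprints and zlib for compression are hypothetical choices, not a description of how any vendor's product works.

```python
import hashlib
import zlib


class DedupCompressedCache:
    """Toy in-memory cache: identical blocks are stored once, compressed."""

    def __init__(self):
        self.chunks = {}  # content fingerprint -> compressed block
        self.index = {}   # block id -> content fingerprint
        self.sizes = {}   # block id -> original (logical) size in bytes

    def put(self, block_id, data):
        digest = hashlib.sha256(data).digest()
        # Deduplication: compress and store the content only if it is new.
        if digest not in self.chunks:
            self.chunks[digest] = zlib.compress(data)
        # Note: overwriting a block does not free its old chunk here;
        # a real cache would need reference counting and eviction.
        self.index[block_id] = digest
        self.sizes[block_id] = len(data)

    def get(self, block_id):
        return zlib.decompress(self.chunks[self.index[block_id]])

    def logical_bytes(self):
        return sum(self.sizes.values())

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


# Three identical 4 KB blocks, as you might see across cloned VMs.
cache = DedupCompressedCache()
block = b"\x00" * 4096
for block_id in range(3):
    cache.put(block_id, block)
```

Comparing `cache.logical_bytes()` with `cache.stored_bytes()` shows how much memory the optimization saves for a given data set; the ratio depends entirely on the content.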
I know what you are thinking, and you are right. The volatile nature of memory makes it a risk when used as a storage tier: once we power down a system, the memory content is lost and no data survives. Sure, volatile memory cannot be a primary storage tier, and that is not the use case even in these new solutions. What’s new, however, is that these two solutions are starting to use it as a write cache, and not only as a read cache. The solutions are PernixData FVP (which announced memory support in the next release) and Atlantis USX (I talked about it in a previous post), and both promise a software solution to safely use memory as a write cache.
In this situation, even if only for a small fraction of time, data is stored only in memory before being flushed to a lower tier (SSD or spinning disks), so there has to be a solid technology protecting its content until the flush happens. Both solutions do so by creating redundant, protected copies of the data stored in memory. Memory is fast and network latency is really low, so synchronously copying data between at least two servers has a really small impact on performance.
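The general pattern of a mirrored write cache can be sketched in a few lines of Python. This is a simplified model built on my own assumptions, not the actual design of FVP or USX: `HostMemory`, `MirroredWriteCache` and the `min_copies` parameter are hypothetical names, the "network" is just a method call, and the lower tier is a plain dictionary. A write is kept as write-back only when it lives in memory on at least two hosts; otherwise it falls back to write-through.

```python
class HostMemory:
    """Hypothetical stand-in for the RAM cache of one server."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.alive = True

    def store(self, block_id, data):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        self.blocks[block_id] = data


class MirroredWriteCache:
    """Write-back cache that keeps dirty data on several hosts."""

    def __init__(self, local, peers, backing_store, min_copies=2):
        self.local = local
        self.peers = peers
        self.backing_store = backing_store  # the lower tier (SSD or disk)
        self.min_copies = min_copies
        self.dirty = set()  # blocks not yet flushed to the lower tier

    def write(self, block_id, data):
        self.local.store(block_id, data)
        copies = 1
        # Synchronous mirroring: wait for each peer before acknowledging.
        for peer in self.peers:
            try:
                peer.store(block_id, data)
                copies += 1
            except ConnectionError:
                continue
        if copies < self.min_copies:
            # Not enough protected copies: persist immediately instead.
            self.backing_store[block_id] = data
            self.dirty.discard(block_id)
            return "write-through"
        self.dirty.add(block_id)
        return "write-back"

    def flush(self):
        # Destage every dirty block to the lower tier.
        for block_id in sorted(self.dirty):
            self.backing_store[block_id] = self.local.blocks[block_id]
        self.dirty.clear()


disk = {}
host_a, host_b = HostMemory("host-a"), HostMemory("host-b")
write_cache = MirroredWriteCache(host_a, [host_b], disk)
first_result = write_cache.write(1, b"payload")
```

In this model the data is lost only if every host holding a copy fails before `flush()` runs, which is exactly the window the redundant copies are meant to cover.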
And there is another advantage in using memory for caching: new servers are often equipped with a good amount of memory. It’s common to see machines with 128 GB of memory or even more, and not all of it is usually consumed by the hypervisor or the virtual machines. If you can reserve a portion of it for the caching solution, you can then apply compression and deduplication and quickly obtain a large pool of cache. I don’t have details about the space savings, and as usual I think it all depends on the type of content you have in the running VMs, but say you have a server with 128 GB of memory, and you use 24 GB of it for memory caching. You still have 104 GB for running virtual machines. And if space optimization is around 4 times, you immediately have 96 GB of caching space, almost as much as the memory left to the hypervisor.
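The arithmetic of this example is easy to replay with other numbers. A tiny Python sketch, where the function name is mine and the 4x reduction ratio is only the assumption used above, not a measured figure:

```python
def effective_cache_gb(reserved_gb, reduction_ratio):
    """Logical cache capacity after compression and deduplication."""
    return reserved_gb * reduction_ratio


total_gb = 128    # memory installed in the server
reserved_gb = 24  # portion handed to the caching software

left_for_vms_gb = total_gb - reserved_gb
cache_at_4x_gb = effective_cache_gb(reserved_gb, 4)
cache_at_2x_gb = effective_cache_gb(reserved_gb, 2)

print(left_for_vms_gb, cache_at_4x_gb, cache_at_2x_gb)
```

With the numbers above this gives 104 GB for the virtual machines and 96 GB of cache at 4x; with a more modest 2x ratio the same 24 GB would behave like a 48 GB cache.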
There is an additional advantage, too: with SSD caching, you need to design your server solutions carefully. Blade servers, for example, and also some 1U rack servers, do not have many bays for installing disks (a few months ago I wrote a blog post about this problem arising with server-side SAN solutions). But almost all modern servers have many memory slots available, even blade servers; in fact, almost all of them can go up to 256 GB, and more and more can now hold 512 GB or even 1 TB of memory. This memory can be used directly by the caching solution, regardless of the server form factor.
Think also about a POC (proof of concept) scenario: with SSDs, you need to plan the installation of some SSDs into the servers with the customer before testing the caching software. With memory, you can use a small amount of it to create the caching tier without even touching the server. Only when the POC is completed do you decide whether to order additional memory (if needed) and plan its installation.
Memory will probably never become a pure storage tier, where data is stored ONLY in memory without an additional copy in another (non-volatile) storage tier. Using memory as storage is surely fascinating, but also really scary. But Atlantis in particular (PernixData has only just announced its new solution) has proven over the past years that a well-designed product can be reliable, even when managing a volatile tier like memory.
We’ll see in the upcoming months and years if the new fastest tier will be memory, while we wait for non-volatile pseudo memory solutions like ultradimms or the (unicorn) HP memristors. For sure, these software solutions will be able to adapt themselves once again…
2 thoughts on “Your next storage tier? Your volatile memory”
I have very serious qualms about using volatile memory as a true storage tier that does anything other than write-through caching. Any time information is only stored in volatile memory, there’s a chance for it to get lost if the machine powers down, and if caching changes in memory in a write-back scenario is commonplace, then an unplanned reboot /will/ lose data.
The possibility exists to mirror the data to multiple systems, all of which are doing ram caching, but then it’s just a numbers game. What failure mode does the machine/rack/datacenter have to hit before you lose data, and how important is that data?
I do agree with your last paragraph, though. The next generation of nv ram is exciting. I’m looking forward to seeing what companies can do with memristors and the like!
Hi Luca – I enjoyed reading your post. At Infinio, we agree with a lot of your points.
We built our server-side caching layer exclusively on RAM for many of the reasons you mention in your post – it’s “wicked” fast (as we say in Boston 🙂 ), even over a network. It’s already idle in lots of servers, and this helps make for a super-easy POC and production deployment.
Our cache is optimized for RAM, in fact, by being globally deduplicated, which means it’s *very* efficient with even small amounts of RAM, giving customers SSD-like performance.
I look forward to your future posts about the direction of the industry. It’s an exciting time to see all the innovations around memory these days, like in-memory database processing and how it’s being used in cloud apps.