One of the biggest problems when designing a backup infrastructure with Veeam has always been sizing the storage for the backup files correctly.
The compromises of a Veeam backup storage
Before the upcoming version 7, a typical design was usually built around a single backup storage. This storage, however, had to strike a balance (ultimately, a compromise) between speed and size in order to keep the desired retention. I often saw customers choose a NAS or a physical server with local storage as a repository, because price and capacity were the most important drivers of the design. And obviously, speed was sacrificed.
Unfortunately, a Veeam repository has some unique requirements that cannot be easily satisfied with a simple backup storage:
– backups are not 100% sequential writes: since files are compressed and deduplicated, there is always a certain percentage of read activity on the storage, and some of it is totally random. Blocks already written to the storage are compared with the new ones during deduplication, and most of these blocks sit in different areas of the storage itself. This is even more evident when using the Reverse Incremental mode, as I explained in a previous white paper I wrote.
– restores have the same random access patterns described above for backups
– Instant VM recovery is the peak of these “problems”. In order to run a VM directly from a deduplicated and compressed storage, that storage needs to be really fast. If you think about it, real-time compression and deduplication are not widely used even on production storage, so you can understand why this can be a real problem on the storage used for Veeam backups
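To see why deduplication turns an apparently sequential backup stream into random I/O, here is a minimal sketch of block-level deduplication (plain illustrative Python, not Veeam's actual code): each incoming block's hash is looked up in an index, and a hit forces a read of a previously written block at an arbitrary earlier offset to verify the match.

```python
import hashlib

BLOCK_SIZE = 4  # tiny blocks to keep the demo readable; real jobs use KB-sized blocks

class DedupStore:
    """Toy block store showing how dedup forces random reads of earlier blocks."""
    def __init__(self):
        self.data = bytearray()   # the "backup file" on disk
        self.index = {}           # block hash -> offset of first occurrence
        self.random_reads = 0     # reads at arbitrary offsets (non-sequential)

    def write_block(self, block: bytes) -> int:
        h = hashlib.sha256(block).digest()
        if h in self.index:
            off = self.index[h]
            # verify the match: this is a read at an arbitrary earlier offset
            self.random_reads += 1
            if self.data[off:off + BLOCK_SIZE] == block:
                return off        # store only a reference, no new write
        off = len(self.data)
        self.index[h] = off
        self.data.extend(block)   # sequential append for new blocks
        return off

store = DedupStore()
stream = [b"AAAA", b"BBBB", b"AAAA", b"CCCC", b"BBBB"]
offsets = [store.write_block(b) for b in stream]
print(offsets)             # duplicate blocks map back to earlier offsets
print(store.random_reads)  # each duplicate cost one non-sequential read
```

On slow storage, those verification reads (and the equivalent random reads during restores) are exactly what drags throughput down.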
Ultimately, what happened before was an upfront choice between speed and restore points, since a storage that was both fast and large was too expensive. Paraphrasing a well-known motto: “Large, Fast, Cheap: you can only pick two!”
BackupCopy, and a new design for backups
With Veeam Backup & Replication 7, a single feature is going to completely change this way of designing backup architectures: BackupCopy.
If you want additional information about this feature, you can read a post I wrote, or the official announcement from Veeam. Veeam has placed a huge emphasis on WAN Acceleration, but that is an “add-on” to BackupCopy itself, which remains the foundation of the new solution. To understand how it works, here is a quick design; please ignore WAN Acceleration for now and put your focus on BackupCopy:
Thanks to the fact that you can now copy the content of a primary backup into a secondary backup, without running two backups against the production storage and without simply duplicating the backup files of the primary backup, there are completely new scenarios to consider (and to put into practice!).
First of all, from a design standpoint, we now have a two-layer layout with two storages, a primary and a secondary one, with completely different characteristics. First, their position in the infrastructure: the primary storage is responsible for receiving backups from the production storage, while the secondary storage receives backup copies from the primary backup storage via the BackupCopy feature.
Primary storage is designed mainly to satisfy strict RPO and RTO values, even at the expense of capacity. It is going to be really fast (using SSDs if necessary) in order to guarantee quick backups and restores, including Instant VM Recovery activities; it is going to be small to keep its price down, holding only a few restore points. Statistically, the vast majority of restores are done from the most recent restore points, so a storage able to keep 2-3 restore points will be more than enough. And if further space is needed, you can reserve it for the most important backups and save the others in an additional repository, slower but cheaper.
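A quick back-of-the-envelope calculation shows how small a 2-3 restore point primary tier can be. All figures here are illustrative assumptions (source data size, change rate, compression ratio), not Veeam sizing guidance:

```python
# Hypothetical sizing for a small, fast primary repository.
source_tb      = 10.0   # protected VM data (assumption)
compression    = 0.5    # ~2:1 compression+dedup ratio (assumption)
daily_change   = 0.05   # 5% daily change rate drives incremental size (assumption)
restore_points = 3      # primary keeps only the most recent points

full_tb        = source_tb * compression
incremental_tb = source_tb * daily_change * compression
# one full plus (n - 1) incrementals
primary_tb     = full_tb + (restore_points - 1) * incremental_tb

print(f"full: {full_tb:.2f} TB, per incremental: {incremental_tb:.2f} TB")
print(f"primary repository: {primary_tb:.2f} TB")
```

Under these assumptions the fast tier needs little more than the size of one compressed full backup, which makes SSD-backed storage affordable.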
Secondary storage is designed purely around capacity and price per GB. Using a dedicated network between primary and secondary storage, you can run the BackupCopy jobs during the day, reading the data saved to the primary storage during the previous night. The design for this kind of storage can vary: a large NAS, an object storage, or a dedup appliance; these have all shown their limits when used as a primary storage for Veeam (I talked about it here, for example), but they can now be an almost perfect target for this scenario.
Another alternative, if you think older backups will seldom if ever be accessed, could be to use tape libraries as secondary storage, thanks to the tape support arriving in Veeam version 7.
Backup tiering is the solution
During some of the talks I have given lately about Data Protection, I discussed Backup Tiering; it was also the topic of my submission for VMworld 2013, though my session was not selected (you can read the abstract I submitted).
The idea of Backup Tiering is to use different backup levels working together for complete data protection, combining the pros of each layer and canceling out each other's shortcomings. Veeam has introduced these same concepts in its new version 7, confirming that monolithic backups are not a good fit for virtualized environments, just as they were not for the physical world years ago.
If you are about to start a new data protection design based on Veeam Backup, or you are going to refresh an existing one, even if you will still use version 6.5 for some months, it is better to start designing the new infrastructure in preparation for the new version 7, keeping in mind how to use these new concepts at their best.