My adventures with Ceph Storage. Part 1: Introduction


Before joining Veeam, I worked in a datacenter completely based on VMware vSphere / vCloud. As I already explained in a previous post, service providers ARE NOT large companies: their needs are sometimes quite different from those of a large enterprise, and so we ended up using different technologies. One of the last projects I looked at was Ceph. We were searching for a scale-out storage system, able to expand linearly without the need for painful forklift upgrades. The idea of DIY (do-it-yourself) storage did not scare us, since we had the internal IT skills to handle it. After leaving, I kept my knowledge up to date, and I continued reading about and playing with Ceph.

This series of posts is not focused only on Ceph itself, but most of all on what you can do with it. Ceph is not (officially) supported by VMware at the moment, even if there are plans for it in their roadmap, so you cannot use it as a block storage device for your virtual machines, even if we tested it and it worked quite well with a Linux iSCSI machine in between. There are however several other use cases, and one is using Ceph as general-purpose storage, where you can drop whatever you have around in your datacenter; in my case, it's going to be the Veeam repository for all my backups. At the end of this series, I will show you how to create a scale-out and redundant Veeam repository using Ceph.

Also available in this series:
Part 2: Architecture for Dummies
Part 3: Design the nodes
Part 4: Deploy the nodes in the lab
Part 5: Install Ceph in the lab
Part 6: Mount Ceph as a block device on linux machines
Part 7: Add a node and expand the cluster storage
Part 8: Veeam clustered repository
Part 9: Failover scenarios during Veeam backups
Part 10: Upgrade the cluster

 

Why Ceph?


Because it’s free and open source, it can be used in every lab, even at home. You only need 3 servers to start: they can be 3 spare servers you have around, 3 computers, or even 3 virtual machines all running on your laptop. Ceph is a great “learning platform” to improve your knowledge about object storage and scale-out systems in general, even if in your production environment you are going to use something else.

Before starting though, I’d like to give you some warnings:

– I work for Veeam, and as a data protection solution for virtualized environments, we deal with a long list of storage vendors. We DO NOT prefer any storage solution over the others, and these articles ARE NOT suggesting this solution rather than commercial systems. As always, it all comes down to your environment and your business needs: you need to analyze requirements, limits, constraints and assumptions, and choose (for yourself or your customer) the best solution. Ceph is “simply” one of the few large-scale storage solutions based on open source software, so it’s easy to study even in your home lab. Think about it as an educational effort.

– Ceph, as said, is an open source software solution. It requires some Linux skills, and if you need commercial support your only options are to get in touch with Inktank, the company behind Ceph, with an integrator, or with Red Hat, since Inktank has now been acquired by them. If you don’t feel at ease with a MAKE solution, look around and BUY a commercial solution (read more about Make or Buy decisions). There are many of them around, and some of them are damn good.

What is Ceph storage

First things first, a super quick introduction about Ceph.

Ceph is an open source distributed storage system, built on top of commodity components, delegating reliability to the software layer.
A buzzword version of its description would be “scale-out software-defined object storage built on commodity hardware”. Yeah, buzzword bingo!

Ceph was originally designed by Sage Weil during his PhD, and afterwards managed and distributed by Inktank, a company created specifically to offer commercial services for Ceph, where Sage had the CTO role. In April 2014, Inktank (and with it Ceph) was acquired by Red Hat.

Ceph is scale-out: it is designed to have no single point of failure, it can scale to a virtually infinite number of nodes, and nodes are not coupled with each other (a shared-nothing architecture), while traditional storage systems instead have some components shared between controllers (cache, disks…). I already explained in a detailed analysis why I think the future of storage is scale-out, and Ross Turk, one of the Ceph guys, has explained these concepts in a short five-minute video, using an awesome comparison with hotels. Hotels? Right, hotels; have a look at the video:

As you will learn from the video, Ceph is built to organize data automatically using CRUSH, the algorithm responsible for the intelligent distribution of objects inside the cluster, and then uses the nodes of the cluster as the managers of those data. I’m not going to describe in further detail how CRUSH works and which configuration options are available; I’m not a Ceph guru, and my study is aimed at having a small Ceph cluster for my needs. But if you want, you can have CRUSH take into account and manage fault domains like racks and even entire datacenters, and thus create a geo-cluster that can protect itself even from huge disasters. You can get an idea of what CRUSH can do, for example, in this article.
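To give you a feeling for the key idea (real CRUSH is far more sophisticated, with weighted buckets and configurable fault domains), here is a toy sketch of computable placement using rendezvous hashing: any client can work out where an object lives just by hashing, with no central lookup table to ask. All names here are made up for illustration.

```python
import hashlib

def place(object_id: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Rank every node by a hash of (object, node) and keep the top
    `replicas`. The ranking is deterministic, so every client computes
    the same locations independently -- the core idea behind CRUSH."""
    ranked = sorted(
        nodes,
        key=lambda n: hashlib.sha256(f"{object_id}/{n}".encode()).hexdigest(),
        reverse=True,
    )
    return ranked[:replicas]

# the same object always maps to the same two nodes
print(place("backup-2014-07-01.vbk", ["node1", "node2", "node3"]))
```

Note that nothing in this sketch stores a map of objects to nodes: placement is a pure function of the object name and the cluster membership, which is what lets the system scale without a central metadata bottleneck.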

The other pillars are the nodes. Ceph is built using simple servers, each with some amount of local storage, replicating to each other via network connections. There is no shared component between servers, even if some roles like Monitors are created only on some servers and are accessed by all the nodes. Ceph does not use technologies like RAID or parity: redundancy is guaranteed by replicating objects, that is, any object in the cluster is stored at least twice in two different places in the cluster. If a node fails, the cluster identifies the objects that are left with only one copy, and creates a second copy somewhere else in the cluster. The latest versions of Ceph can also use erasure coding, saving even more space at the expense of performance (read more on Erasure Coding: the best data protection for scaling-out?).
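That self-healing behavior can be sketched in a few lines. This is not how Ceph itself is implemented, just a minimal illustration of the logic: after a node dies, find every object left below its target replica count and give it a new copy on the least-loaded survivor. The node and object names are invented for the example.

```python
from collections import Counter

# toy cluster state: each object is replicated on two nodes
placement = {
    "obj1": {"node1", "node2"},
    "obj2": {"node2", "node3"},
    "obj3": {"node1", "node3"},
}

def heal(placement, failed, nodes, size=2):
    """After `failed` dies, re-replicate any object left below `size`
    copies, preferring the surviving node holding the fewest objects."""
    survivors = [n for n in nodes if n != failed]
    load = Counter(n for reps in placement.values() for n in reps if n != failed)
    for obj, reps in placement.items():
        reps.discard(failed)
        while len(reps) < size:
            target = min((n for n in survivors if n not in reps),
                         key=lambda n: load[n])
            reps.add(target)
            load[target] += 1
    return placement

heal(placement, failed="node2", nodes=["node1", "node2", "node3"])
# every object is back to two copies, all on surviving nodes
```

The key point the sketch tries to capture is that recovery is many-to-many: every surviving node can contribute, so the more nodes you have, the faster a failed node's data is rebuilt.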

Ceph can be dynamically expanded or shrunk by adding or removing nodes in the cluster, and letting the CRUSH algorithm rebalance objects.

I have already used the term “objects” at least twice. Ceph is indeed an object storage system. Data is not stored as files in a file system hierarchy, nor as blocks within sectors and tracks. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier. Each file entering the cluster is saved in one or more objects (depending on its size), some metadata referring to the objects is created, a unique identifier is assigned, and the object is saved multiple times in the cluster. The process is reversed when data needs to be accessed.
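To make the “data + metadata + unique identifier” structure concrete, here is a minimal sketch of what such an object could look like. This is a simplified illustration of the concept, not Ceph's actual on-disk format; the field names and the `make_object` helper are my own inventions.

```python
import hashlib
import uuid

def make_object(data: bytes, **metadata):
    """Wrap raw bytes as a self-describing object: the payload itself,
    arbitrary metadata (plus size and checksum), and a globally unique ID."""
    return {
        "id": str(uuid.uuid4()),          # globally unique identifier
        "data": data,                     # the payload itself
        "metadata": {
            **metadata,                   # caller-supplied attributes
            "size": len(data),
            "checksum": hashlib.sha256(data).hexdigest(),
        },
    }

obj = make_object(b"hello ceph", content_type="text/plain", owner="backups")
```

Because the identifier is globally unique and the object is self-describing, it can live anywhere in a flat namespace: there is no directory tree to traverse, which is exactly what makes object storage so easy to scale.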

The advantage over file or block storage is mainly in size: the architecture of an object storage system can easily scale to massive sizes; in fact, it’s used in solutions that need to deal with incredible amounts of objects. To name a few, Dropbox and Facebook are built on top of object storage systems, since it’s the best way to manage such amounts of files.

Other resources

That’s it for now. While you wait for the next chapters, you can use the same resources I used to learn more about Ceph myself:

Ceph official website, and specifically their documentation

The Ceph Blog

The website of Sebastien Han, who is for sure a Ceph guru.

  • Alex

    Excellent, thank you very much for the tutorial. The effort shows; you have made Ceph catch my attention.

    Regards.

  • Jayshri Nikam

    Very informative…Thanks for your hard work on putting up all these things together 🙂

  • Hassan Almuhana

    Thanks for your wonderful tutorial, it’s very useful. I was looking for exactly this kind of training, and I finally found it in this tutorial.

  • Virusgunz

    hi, did you ever do a Ceph integration with OpenStack?