Netflix runs 100% on AWS now


In 2008 Netflix decided that its new business model would rely entirely on the public cloud, specifically AWS. It took the leader in video streaming 7 years to complete the migration of its services to AWS, and Netflix now runs no significant workload on its own premises.

A long journey

In a post on its official blog, Netflix announced on 11 February 2016 that none of the streaming services it offers is hosted in its own data centers any more; they all run on Amazon Web Services. It’s another interesting consequence of this age, in which companies offer services without owning the infrastructure: Netflix is probably the world leader in video streaming and doesn’t own a single server to do so, just as Uber is probably the biggest car service in the world without owning a single car, and so on. The beauty (or the horror) of the era of services.

Back to our topic: the post is worth reading for some additional information. First, cost reduction was not the primary reason for Netflix to migrate to AWS. This alone is valuable information for anyone looking at the cloud as the solution to all their needs: you do not move to the cloud just to save money. Savings can happen because you leverage the provider’s technologies and improve your own processes along the way, but moving your workloads as they are to the cloud will not save you money.

The two main drivers were the rapid growth they were facing (“we simply could not have racked the servers fast enough”) and the outages they were having in their own infrastructure. Moving to AWS solved both problems and allowed them to grow massively. I can only imagine how proud AWS must be of this success story! Still, the giant cloud provider has no magic formula that lets it do what others cannot; it uses more or less the same components found in every data center: servers with CPU, memory and disks, networking gear, racks, power units, and so on. So what lets them do what companies like Netflix cannot? Again, it’s all about scale (you can read more of my opinions on this in my previous article, The war for the public cloud is claiming its victims):

– scale allows you to reduce costs, since you can get better deals from suppliers, and even ask for custom components that are 100% designed to fit your own needs
– scale forces you to push automation to the limit, because there’s no other way to manage such a large environment. In return, automation improves your efficiency even further, in a virtuous circle
– scale allows you to grow even faster, because automation lets you rack and stack servers at an insane pace, as they are automatically configured within minutes of being connected, and the agreements you have in place with ODMs (original design manufacturers, the companies creating custom gear for you) say that they will ship you tons of servers whenever needed

Embrace the cloud, completely. Is it good or bad?


Netflix architecture

Reading the blog post in more detail, we learn that Netflix did not simply move its workloads to AWS as they were, but slowly re-designed all of them to leverage the specific capabilities available in AWS; this is the main reason why it took 7 years to complete the move. The message to readers is clear: in order to benefit from the public cloud, you need to embrace it completely. This is certainly true: simply moving your own virtual machines to AWS (or any other public cloud, it doesn’t matter) will not improve your experience or your expenses, as these workloads are not optimized at all. In the article, Netflix explains how they leveraged micro-services, NoSQL and cloud-native apps to build their new platform.

I have no detailed information on how the final service is built, but some interesting considerations can be drawn from this news:

– migrating to a Public Cloud is not an easy task: maybe your environment is not as massive as Netflix’s, and it may take you a lot less time to do the same, but moving to the public cloud is not a short operation done overnight. It requires a lot of planning, testing and multiple partial iterations. At least if you want to use the public cloud to its full potential (read the next point);

– you don’t migrate “as is” to the Public Cloud. If your IT model is based on the traditional paradigms of monolithic applications and macro services, it makes no sense to migrate them as they are to the public cloud. Yes, you could simply migrate all your virtual machines to AWS after converting them from VMware or Hyper-V to Xen for example, but the final solution would be extremely inefficient, and you would suffer from the same limits that your architecture had on your premises. If your services can be clustered up to only 8 nodes for example, the same limit would follow you into the cloud. If your management component is a single point of failure, it would still be the weakest point once moved to the cloud. And the simple yet powerful reason is that Public Cloud has NOT been designed to be a replacement for on-premises virtualized datacenters. Even if virtual machines are still at the core of the compute layer in Public Cloud, higher efficiencies can only be obtained if you fully leverage the concepts of cloud-native applications: micro-services, containers, DevOps practices.
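To make the point about architectural limits concrete, here is a minimal sketch of the arithmetic. The 8-node cap comes from the example above; the demand and per-node throughput figures, and the function names, are made up purely for illustration.

```python
import math


def nodes_needed(demand_rps: int, per_node_rps: int) -> int:
    """Number of nodes required to serve the whole demand."""
    return math.ceil(demand_rps / per_node_rps)


def served_rps(demand_rps: int, per_node_rps: int, max_nodes=None) -> int:
    """Requests per second actually served, given an optional cluster size cap."""
    nodes = nodes_needed(demand_rps, per_node_rps)
    if max_nodes is not None:
        # The architectural limit applies wherever the cluster runs.
        nodes = min(nodes, max_nodes)
    return min(demand_rps, nodes * per_node_rps)


# Hypothetical numbers: 10,000 requests/s of demand, 500 requests/s per node.
capped = served_rps(10_000, 500, max_nodes=8)   # service clustered up to 8 nodes
uncapped = served_rps(10_000, 500)              # service that scales horizontally
```

With these assumed numbers the capped service tops out at 4,000 requests per second no matter how many instances the provider could give you, while the horizontally scalable one serves the full 10,000.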

This brings us to an important decision every IT manager has to face: is fully embracing the Public Cloud model the way to go? The more you leverage the powerful yet custom services the cloud gives you, the more efficient your application will be in “that” cloud, but the harder it will be to move it to another location. If you rely heavily on a service that only AWS has, you will have to re-design your application before moving it to another provider. How much are you willing to get “locked in” to the specific technology of the public cloud you choose, or how much “acceptable inefficiency” do you want to keep (and pay for) to be able to eventually migrate somewhere else? Virtual machines have lately become an “inefficiency element” that is too big, and that’s one of the reasons why containers are gaining a lot of traction. Are containers the solution to leverage the public cloud and still avoid a complete lock-in?
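One common way to keep some of that freedom is to hide provider-specific services behind a small interface of your own. The sketch below is only an illustration of the idea, not how Netflix does it: `BlobStore`, `InMemoryBlobStore` and `save_report` are hypothetical names, and the in-memory class stands in for whatever storage service a given provider offers.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical provider-neutral interface the application depends on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """Stand-in backend; a real one would wrap a specific provider's storage."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_report(store: BlobStore, name: str, text: str) -> str:
    """Application code: it only knows the interface, never the provider."""
    key = f"reports/{name}.txt"
    store.put(key, text.encode("utf-8"))
    return key


store = InMemoryBlobStore()
report_key = save_report(store, "q1", "hello")
```

Moving to another provider then means writing one new backend class instead of touching every caller. The price is the “acceptable inefficiency” mentioned above: the interface can only expose what all your candidate providers have in common.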

Time will tell, as always.