We have all heard that scalability is important, but what exactly is scalability, and how do we apply it to our architecture when we are building an application? There is a lot to tell when it comes to scalability. Maybe you already know a lot about it, maybe you don’t, so let’s start with the most generic explanation of scalability and dive a little deeper from there. The definition that I came across, and which I think covers most of it, is this:
The ability of something (a system) to adapt to increased demand.
Pure and simple, this means that your system should be able to handle a certain amount of traffic, and when that traffic grows, your system should respond by adapting to it instead of breaking down.
In this article we cannot cover every aspect of scalability, so we will stick to vertical and horizontal scaling. We will clarify the difference between the two and give some examples of both methods.
Vertical versus horizontal scaling
Some time ago I heard a nice example that visualizes the difference between the two of them very clearly.
Imagine you go to a party and 20 friends want to join you. Five people fit into your family sedan, but to take the rest you would have to buy a bigger car. For 20 people you would have to buy a bus. This is what we mean by vertical scaling. When your service receives more and more traffic and your server cannot handle it anymore, you can upgrade the server with more RAM or CPU, but where does it end?
Now imagine you divide your 20 friends over 4 cars with 5 people each. This is what we mean by horizontal scaling. It is far more flexible, because as traffic grows you can add more servers next to each other and spread the traffic over them with load balancers.
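To make the car example a bit more concrete, here is a minimal sketch of how a load balancer could spread requests over a pool of servers in round-robin order. The class name, the method names and the "car-1" style server names are made up for this illustration. A real load balancer does far more than this (health checks, weighting, connection handling).

```python
from collections import Counter


class RoundRobinBalancer:
    """Hands out servers from the pool in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)
        self._next = 0  # position of the next server in the rotation

    def add_server(self, server):
        # Scaling out: one more machine joins the pool.
        self.servers.append(server)

    def next_server(self):
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server


# 20 "friends" (requests) divided over 4 "cars" (servers).
balancer = RoundRobinBalancer(["car-1", "car-2", "car-3", "car-4"])
assignments = [balancer.next_server() for _ in range(20)]
load = Counter(assignments)  # every car ends up with 5 passengers
```

When traffic grows, calling add_server puts another machine into the rotation without touching the ones that are already there, which is exactly the flexibility that vertical scaling lacks.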
Scaling up or scaling out?
When we say vertical scaling, we are talking about scaling up. When we say horizontal scaling, we are talking about scaling out. Scaling out means that we add more machines to our pool of resources. Scaling up means that we add more power (CPU, RAM) to an existing machine. In terms of databases, horizontal scaling is based on partitioning the data: each node contains only part of the data. When we scale vertically, the data resides on a single node, and scaling is done through multi-core, that is, by spreading the load over the CPU and RAM resources of that one machine.
Below is an illustration of vertical and horizontal scaling:
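The idea that each node contains only part of the data can be sketched as follows. This is a simplified illustration that assumes hash-based partitioning, which is only one of several possible strategies. The names node_for and PartitionedStore and the node names are invented for the example, and the "nodes" are plain in-memory dictionaries rather than real database servers.

```python
import hashlib


def node_for(key, nodes):
    """Pick the node that owns a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class PartitionedStore:
    """A toy key-value store whose data is split over several nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # Each node holds its own, separate part of the data.
        self.data = {node: {} for node in self.nodes}

    def put(self, key, value):
        self.data[node_for(key, self.nodes)][key] = value

    def get(self, key):
        return self.data[node_for(key, self.nodes)].get(key)


store = PartitionedStore(["db-1", "db-2", "db-3"])
for i in range(100):
    store.put(f"user:{i}", {"id": i})

# Number of keys stored on each node; together they hold all 100 keys.
sizes = {node: len(rows) for node, rows in store.data.items()}
```

Note that with this simple modulo scheme, adding a node changes the owner of most keys. Real systems often use something like consistent hashing to limit how much data has to move.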
Imagine you have an app server and a DB server both running on the same machine.
As traffic grows, you can only make that machine bigger by adding more RAM or CPU. A further disadvantage of this setup is that the app server and the DB server depend on the same machine: when that machine goes down, both of them go down with it.
The load balancer distributes the workload over the machines.
In the picture you see multiple servers in a cluster. If one of the servers fails, another one takes over. But what if the load balancer fails? It is still a single point of failure, so there are still too many dependencies in this setup.
Load balanced app server cluster
If one load balancer fails, the other one can take over and spread the requests over the servers. If one server fails, there are still two servers left to handle the requests.
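The failover behaviour described here can be sketched in a few lines. This is a toy model under our own assumptions: the LoadBalancer class, the handle function and the "lb-1" and "app-1" names are hypothetical, and the health flags are set by hand, whereas a real setup would detect failures with health checks.

```python
class LoadBalancer:
    """Routes requests round-robin over the servers that are still healthy."""

    def __init__(self, name, servers):
        self.name = name
        self.servers = servers  # shared dict: server name -> is it healthy?
        self.alive = True
        self._next = 0

    def route(self, request):
        healthy = [name for name, ok in self.servers.items() if ok]
        if not healthy:
            raise RuntimeError("no healthy servers left")
        server = healthy[self._next % len(healthy)]
        self._next += 1
        return f"{request} -> {server} (via {self.name})"


def handle(request, balancers):
    # Use the first load balancer that is alive; the next one is the backup.
    for balancer in balancers:
        if balancer.alive:
            return balancer.route(request)
    raise RuntimeError("no load balancer available")


servers = {"app-1": True, "app-2": True, "app-3": True}
balancers = [LoadBalancer("lb-1", servers), LoadBalancer("lb-2", servers)]

first = handle("req-1", balancers)   # everything is up

servers["app-1"] = False             # one app server fails
balancers[0].alive = False           # and the first load balancer fails too
second = handle("req-2", balancers)  # still served, by lb-2 and app-2
```

Even with one server and one load balancer down, requests are still answered, because no single component is the only one of its kind.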
These are the basic principles of scalability. Setting up a scalable architecture costs time and money, but in the end it is worthwhile: it will save you a lot of headaches as your app grows. But if you don’t want to worry about scalability at all, you can also use Jexia and leave all the headaches to us.