Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged in order to accommodate that growth.
Looking at scalability is very relevant in Cloud environment, which provide a high level of on demand elasticity and thus allow us to easily implement the scalability patterns most relevant to our applications to cater for our particular business needs, whichever they may be.
App Service Plan Provisioning
Let first have a look at what happen when you provision a new App Service Plan in Azure.
When you request the creation of an Azure Web App onto a new App Service Plan (aka. Serverfarm), Azure is not provisioning a brand new VM from scratch. Instead, I believe, they assign you a recycled VM of the chosen size, for example B1, from a pool of existing VM.
Indeed you can see that by looking at the uptime of the provisioned VM instance (use Kudu advanced tools): uptime is quite random, and not near 0. A near 0 uptime would indicate a recently created VM. So, in my understanding, Azure keeps a pool of VMs just ready for when a new customer is requesting a new service plan, or when a scaling operation is needed.
That make a lot of sense. Provisioning a new VM each time would take quite a lot of time, while a new Web App deployment actually happens in about 15 to 20 seconds. For that purpose, I assume they will always keep a certain number of VMs ready in the pool, enough to respond to customer’s needs at all time in a timely manner.
Horizontal scalability (Scale out / Scale in)
Horizontal scalability is when you change the number of VM instances supporting your App Service Plan. When load on your apps grows, you will likely scale-out (add more VM instances). When the loads decrease, you will scale back in, reducing the number of VM instances.
As I understand it, when scaling-out, an operation similar to the provisioning will happen: Azure will assign you one new VM of the same size, from its pool of available VMs:
When the plan scales back in again, the VM is removed from the Plan and placed back in the correspondin pool. As I understand it, it is recycled so it can be reused again, possibly by another customer.
Vertical scalability (Scale up/down)
Vertical scalability is about changing the App Service Plan instance size. If you need more compute power to run you apps, you can choose to scale-up your plan to bigger VM(s). What happen in that case though?
AFAIK, Azure runs on Hyper-V virtualization platform, and at the time of this writing, Hyper-V doesn’t allow for dynamic CPU/RAM resizing of VMs. So how do they do it?
Well, I did the test on a Plan with a single small instance. I noticed the instance hostname and uptime before scaling up. I then scaled the plan up to medium, and guess what: the hostname and uptime of the only instance in the plan was totally different!
Here’s my educated guess: Azure assigned a medium VM from the pool of medium VM and added it to the Plan (1). All the Apps running in the plan started to run in that new VM as well. Only then, the small VM got removed from the Plan (2), leaving the plan with a single Medium VM.
We can also easily figure how it would happen with more than one VM. Scaling back in would also happen similarly.