
VM scale sets
Virtual machine scale sets (VMSSes) are Azure compute resources that you can use to deploy and manage a set of identical VMs. Because all VMs are configured in the same way, scale sets are designed to support true autoscaling, and no pre-provisioning of VMs is required. A scale set helps in provisioning multiple identical VMs that are connected to each other through a virtual network and subnet.
A VMSS consists of multiple VMs, but they are managed at the VMSS level. All VMs are part of this unit, and any changes are made to the unit, which, in turn, applies them to its VMs using a predetermined algorithm:

Figure 2.12: A VM scale set
Managing the VMs as a unit enables them to be load balanced using an Azure load balancer or an application gateway. The VMs can be either Windows or Linux VMs. They can run automated scripts using a PowerShell extension, and they can be managed centrally using state configuration. They can be monitored as a unit, or individually, using Log Analytics.
VMSSes can be provisioned from the Azure portal, the Azure CLI, Azure Resource Manager templates, REST APIs, and PowerShell cmdlets. It is possible to invoke REST APIs and the Azure CLI from any platform, environment, or OS, and in any language.
Many of Azure's services already use VMSSes as their underlying architecture. Among them are Azure Batch, Azure Service Fabric, and Azure Container Service. Azure Container Service, in turn, provisions Kubernetes and DC/OS on these VMSSes.
VMSS architecture
VMSSes allow the creation of up to 1,000 VMs in a scale set when using a platform image, and 100 VMs if using a custom image. If a scale set contains 100 VMs or fewer, they are placed in a single availability set; if it contains more than 100, multiple availability sets (known as placement groups) are created, and the VMs are distributed among them. We know from Chapter 1, Getting started with Azure, that VMs in an availability set are placed on separate fault and update domains. Availability sets related to VMSSes have five fault and update domains by default.
VMSSes provide a model that holds metadata for the entire set. Changing this model and applying the changes affects all VM instances. The metadata includes the maximum and minimum number of VM instances, the OS SKU and version, the current number of VMs, fault and update domains, and more. This is demonstrated in Figure 2.13:

Figure 2.13: VMs in an availability set
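The placement arithmetic described above can be sketched in a few lines of Python. This is only an illustration, assuming that a placement group holds at most 100 VMs and that VMs are spread evenly across the five default fault domains. The function names are hypothetical, and the even spread is a stand-in for whatever distribution algorithm Azure actually uses.

```python
import math
from typing import List

# Assumption taken from the text: one placement group (availability set)
# holds at most 100 VMs.
MAX_VMS_PER_PLACEMENT_GROUP = 100


def placement_groups_needed(vm_count: int) -> int:
    """Return how many placement groups a scale set of vm_count VMs would span."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    return math.ceil(vm_count / MAX_VMS_PER_PLACEMENT_GROUP)


def spread_across_fault_domains(vm_count: int, fault_domains: int = 5) -> List[int]:
    """Illustrative even spread: return the number of VMs in each fault domain."""
    base, extra = divmod(vm_count, fault_domains)
    # The first `extra` domains receive one additional VM each.
    return [base + (1 if i < extra else 0) for i in range(fault_domains)]


print(placement_groups_needed(100))     # 1
print(placement_groups_needed(101))     # 2
print(spread_across_fault_domains(12))  # [3, 3, 2, 2, 2]
```

Under these assumptions, a scale set of 1,000 VMs would span 10 placement groups.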
VMSS scaling
Scaling refers to increasing or decreasing compute and storage resources. A VMSS is a feature-rich resource that makes scaling easy and efficient. It provides autoscaling, which helps in scaling up or down based on external events and data such as CPU and memory usage. Some of the VMSS scaling features are given here.
Horizontal versus vertical scaling
Scaling can be horizontal, vertical, or both. Horizontal scaling is another name for scaling out and in, that is, adding or removing VM instances, while vertical scaling refers to scaling up and down, that is, changing the size of existing VMs.
Capacity
VMSSes have a capacity property that determines the number of VMs in a scale set. A VMSS can be deployed with zero as the value for this property, in which case no VMs are created. If you provision a VMSS with a non-zero capacity, that number of VMs is created.
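As a rough illustration, the capacity value typically appears in the sku section of a scale set definition in an Azure Resource Manager template. The sketch below builds such a fragment as a Python dictionary. The helper name vmss_sku and the VM size are assumptions chosen for the example, and the fragment is not a complete template.

```python
import json


def vmss_sku(vm_size: str, capacity: int) -> dict:
    """Build the sku fragment of a scale set definition (illustrative only)."""
    if capacity < 0:
        raise ValueError("capacity cannot be negative")
    return {"name": vm_size, "tier": "Standard", "capacity": capacity}


# A scale set that is deployed without any VM instances.
empty_set = vmss_sku("Standard_DS1_v2", 0)

# A scale set that starts with three identical VM instances.
three_vms = vmss_sku("Standard_DS1_v2", 3)

print(json.dumps(three_vms, indent=2))
```

Changing only the capacity value and reapplying the definition is what adds or removes instances in this model.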
Autoscaling
The autoscaling of VMs in a VMSS refers to the addition or removal of VM instances based on the configured environment in order to meet the performance and scalability demands of an application. Generally, in the absence of a VMSS, this is achieved using automation scripts and runbooks.
VMSSes replace this scripting with configuration: instead of writing scripts, you configure a VMSS to scale automatically.
Autoscaling uses multiple integrated components to achieve its end goal. Autoscaling entails continuously monitoring VMs and collecting telemetry data about them. This data is stored, combined, and then evaluated against a set of rules to determine whether autoscaling should be triggered. The trigger could be to scale out or scale in. It could also be to scale up or down.
The autoscaling mechanism uses diagnostic logs for collecting telemetry data from VMs. These logs are stored in storage accounts as diagnostic metrics. The autoscaling mechanism also uses the Application Insights monitoring service, which reads these metrics, combines them, and stores them in a storage account.
Background autoscaling jobs run continually to read Application Insights' storage data, evaluate it based on all the rules configured for autoscaling, and, if any of the rules or combination of rules are met, run the process of autoscaling. The rules can take into consideration the metrics from guest VMs and the host server.
The properties used to define these rules are described at https://docs.microsoft.com/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview.
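The evaluation step performed by the background jobs can be pictured with a small sketch. This is not Azure's implementation: the Rule class, its field names, and the metric name cpu_percent are hypothetical, and the sketch simply averages collected samples and applies the first rule that fires, keeping the result within the configured minimum and maximum.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Sequence


@dataclass
class Rule:
    metric: str       # hypothetical metric name, for example "cpu_percent"
    operator: str     # ">" or "<"
    threshold: float  # value the averaged metric is compared against
    change: int       # instances to add (positive) or remove (negative)


def evaluate(rules: Sequence[Rule], samples: Dict[str, List[float]],
             capacity: int, minimum: int, maximum: int) -> int:
    """Return the new capacity after applying the first rule that fires."""
    for rule in rules:
        values = samples.get(rule.metric)
        if not values:
            continue  # no telemetry collected for this metric
        average = mean(values)
        if rule.operator == ">":
            fired = average > rule.threshold
        else:
            fired = average < rule.threshold
        if fired:
            # Clamp the result to the configured instance limits.
            return max(minimum, min(maximum, capacity + rule.change))
    return capacity


rules = [
    Rule("cpu_percent", ">", 75, +1),  # scale out under load
    Rule("cpu_percent", "<", 25, -1),  # scale in when idle
]
print(evaluate(rules, {"cpu_percent": [80, 90, 85]}, capacity=2, minimum=1, maximum=5))  # 3
```

A real rule also carries a time window and a cool-down period, which this sketch leaves out for brevity.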
The VMSS autoscale architecture is shown in Figure 2.14:

Figure 2.14: VMSS autoscale architecture
Autoscaling can be configured for scenarios that are more complex than general metrics available from environments. For example, scaling could be based on any of the following:
A specific day
A recurring schedule such as weekends
Weekdays versus weekends
Holidays and one-off events
Multiple resource metrics
These can be configured using the schedule property of Application Insights resources, which helps in registering rules.
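One way to picture schedule-based scaling is as a set of profiles, where the profile that matches the current date decides the instance limits. The sketch below is a simplified model under that assumption. The Profile class, its fields, and the rule that a fixed date takes precedence over a recurring schedule are illustrative choices, not the actual schema of the schedule property.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Optional, Sequence, Set


@dataclass
class Profile:
    name: str
    min_instances: int
    max_instances: int
    weekdays: Set[int] = field(default_factory=set)  # 0 = Monday ... 6 = Sunday
    fixed_date: Optional[date] = None                 # one-off event, if any


def active_profile(profiles: Sequence[Profile], default: Profile, now: datetime) -> Profile:
    """Pick the profile for `now`: fixed dates first, then recurring days."""
    for profile in profiles:
        if profile.fixed_date is not None and profile.fixed_date == now.date():
            return profile
    for profile in profiles:
        if now.weekday() in profile.weekdays:
            return profile
    return default


default = Profile("default", min_instances=2, max_instances=10)
weekend = Profile("weekend", min_instances=1, max_instances=4, weekdays={5, 6})
sale = Profile("sale", min_instances=10, max_instances=50, fixed_date=date(2024, 11, 29))

print(active_profile([weekend, sale], default, datetime(2024, 11, 30, 9, 0)).name)  # weekend
```

Metric-based rules would then be evaluated within the limits of whichever profile is active.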
Architects should ensure that at least two actions, scale out and scale in, are configured together. Configuring only one of them will not deliver the scaling benefits provided by VMSSes.
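A short simulation shows why the two actions belong together. It is a toy model with made-up thresholds and a hypothetical simulate function: each CPU reading either adds an instance, removes one, or leaves the capacity unchanged.

```python
from typing import List, Optional, Sequence


def simulate(cpu_series: Sequence[float], scale_out_at: float = 75,
             scale_in_at: Optional[float] = None, start: int = 2,
             minimum: int = 2, maximum: int = 10) -> List[int]:
    """Return the capacity after each CPU reading (illustrative model)."""
    capacity = start
    history = []
    for cpu in cpu_series:
        if cpu > scale_out_at:
            capacity = min(maximum, capacity + 1)
        elif scale_in_at is not None and cpu < scale_in_at:
            capacity = max(minimum, capacity - 1)
        history.append(capacity)
    return history


load = [80, 90, 85, 20, 15, 10, 10]
print(simulate(load))                  # [3, 4, 5, 5, 5, 5, 5]
print(simulate(load, scale_in_at=25))  # [3, 4, 5, 4, 3, 2, 2]
```

With only the scale-out action, the set grows during the spike and then stays at five instances after the load has gone; with both actions, it returns to its minimum.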
To summarize, we have covered the scalability options in Azure and the detailed scaling features in the case of IaaS and PaaS to meet your business requirements. If you recall the shared responsibility model, you'll remember that platform upgrades and maintenance should be done by the cloud provider. In this case, Microsoft takes care of upgrades and maintenance related to the platform. Let's see how this is achieved in the next section.