Space Cloud automatically scales your services to match the load on the system. Each service is scaled independently of the others.
Currently, autoscaling works for HTTP-based workloads only.
How it works
The following parameters are required for autoscaling to work.
- Min: The minimum number of instances for the service. This value can be zero.
- Max: The maximum number of instances for the service.
- Concurrency: The desired number of requests per second intended for this service.
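As a rough illustration, the three parameters can be modelled as a small record with basic sanity checks. This is only a sketch: the `AutoscaleParams` class, its field names, and the validation rules (other than Min being allowed to be zero) are assumptions made for this example and are not part of Space Cloud's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscaleParams:
    """Hypothetical container for the three autoscaling parameters."""

    min_replicas: int   # "Min": may be zero
    max_replicas: int   # "Max"
    concurrency: float  # "Concurrency": desired requests per second

    def __post_init__(self) -> None:
        # These checks are assumptions for the sketch, not documented rules.
        if self.min_replicas < 0:
            raise ValueError("min_replicas must be zero or greater")
        if self.max_replicas < max(self.min_replicas, 1):
            raise ValueError("max_replicas must be at least 1 and not below min_replicas")
        if self.concurrency <= 0:
            raise ValueError("concurrency must be positive")


# A service that may scale to zero, up to 5 instances, at 50 requests per second.
params = AutoscaleParams(min_replicas=0, max_replicas=5, concurrency=50)
```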
Some points to remember:
- Each service is autoscaled independently of the others, which keeps the system highly reactive.
- The number of instances is calculated at a global level. Space Cloud sums the requests per second across every instance of a service and uses that total to decide the desired number of replicas.
- Scaling down to zero works for HTTP workloads only.
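One plausible way to picture the calculation described in the points above is sketched below. The exact formula Space Cloud uses is not stated here, so this is an assumption: the total requests per second is divided by the concurrency target, rounded up, and then clamped between Min and Max. The function name `desired_replicas` and its arguments are hypothetical.

```python
import math


def desired_replicas(rps_per_instance, concurrency, min_replicas, max_replicas):
    """Estimate a replica count from per-instance load (illustrative only)."""
    # Sum the load observed on every running instance of the service.
    total_rps = sum(rps_per_instance)
    # Number of instances needed if each one handles `concurrency` requests per second.
    wanted = math.ceil(total_rps / concurrency)
    # Keep the result within the configured Min and Max.
    return max(min_replicas, min(max_replicas, wanted))


# Three instances serving 40, 35 and 45 requests per second, target of 50 each.
print(desired_replicas([40, 35, 45], concurrency=50, min_replicas=0, max_replicas=10))
```

With no traffic and Min set to zero, the same sketch returns zero, which corresponds to scaling an HTTP service down to zero.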
The autoscaling feature is under active development. Currently, it has the following limitations:
- It only works for HTTP workloads. Workloads with a port described as TCP do not get autoscaled. In such cases, the min replica count is used as the desired scale.
- Scaling can only be based on the number of requests per second.