Space Cloud automatically scales your services to match the load on the system. Each service is scaled independently of the others.

Service Autoscaling

Currently, autoscaling works for HTTP-based workloads only.

How it works

The following parameters are required for autoscaling to work.

  • Min: The minimum number of instances for the service. This value can be zero.
  • Max: The maximum number of instances for the service.
  • Concurrency: The desired number of requests per second intended for this service.

Some points to remember:

  • Autoscaling works for each service independently. A change in load on one service does not affect the scale of another, which keeps the system highly reactive.
  • The number of instances is calculated at a global level. Space Cloud sums the requests per second across all instances of a service and uses that total to decide the desired number of replicas for that service.
  • Scaling down to zero works for HTTP workloads only.
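
The calculation described above can be sketched in a few lines of Python. This is only an illustration, not Space Cloud's actual implementation: the names `ScaleConfig` and `desired_replicas` are hypothetical, and the formula (total requests per second divided by the concurrency target, rounded up and clamped between Min and Max) is an assumption about how the three parameters combine.

```python
import math
from dataclasses import dataclass


@dataclass
class ScaleConfig:
    """Hypothetical container for the three autoscaling parameters."""
    min_replicas: int   # "Min" in the text; may be zero
    max_replicas: int   # "Max" in the text
    concurrency: float  # assumed here to be the target requests per second per instance


def desired_replicas(per_instance_rps, cfg):
    """Return the replica count for one service.

    per_instance_rps: observed requests per second, one entry per running instance.
    Assumed formula: ceil(total / concurrency), clamped to [min, max].
    """
    if cfg.concurrency <= 0:
        raise ValueError("concurrency must be positive")
    if cfg.min_replicas > cfg.max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    # Sum the load over every instance of the service (the "global level").
    total = sum(per_instance_rps)
    raw = math.ceil(total / cfg.concurrency) if total > 0 else 0

    # Never go below Min or above Max.
    return max(cfg.min_replicas, min(cfg.max_replicas, raw))
```

For example, three instances serving 40, 35, and 45 requests per second with a concurrency of 50 would give a total of 120, so this sketch asks for 3 replicas. With no traffic and a Min of zero, it returns 0.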


The autoscaling feature is under active development. It currently has the following limitations:

  • It only works for HTTP workloads. Workloads with a port described as TCP do not get autoscaled. For such cases, the min replica count is considered as the desired scale.
  • Scaling can only be based on the number of requests per second.
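
The first limitation amounts to a simple fallback rule, sketched below in Python. The function name `target_scale` and the protocol strings are hypothetical and chosen only for illustration; the one behavior taken from the text is that a non-HTTP workload stays at its min replica count.

```python
def target_scale(protocol, min_replicas, autoscaled_replicas):
    """Pick the scale for a workload.

    protocol: protocol of the workload's port, e.g. "http" or "tcp" (assumed labels).
    min_replicas: the configured Min value.
    autoscaled_replicas: the count the autoscaler would otherwise choose.
    """
    # Non-HTTP workloads are not autoscaled; Min is treated as the desired scale.
    if protocol.lower() != "http":
        return min_replicas
    return autoscaled_replicas
```

Under this rule, a TCP workload with Min set to 2 stays at 2 instances regardless of load, while an HTTP workload follows the autoscaler's decision.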
