Deployment types

Development — designed for development use cases.
Production — designed for production workloads and high-availability.
Multi-cluster — designed for demanding production workloads, high-scalability, high-availability, and advanced multi-tenancy configurations.

Development

Available for free, no credit card required. Your free trial is limited to 2 development deployments and only 1,000 queries per day. Upgrade to any paid plan to unlock all features.

Development deployments are designed for development use cases only. They make it easy to get started with Cube quickly and allow you to build and query pre-aggregations on demand. Development deployments don’t have dedicated refresh workers and, consequently, do not refresh pre-aggregations on schedule. They do not provide high availability, nor do they guarantee fast response times. They also auto-suspend after 30 minutes of inactivity, so the first request after a deployment wakes up can take additional time to process. In addition, they have limits on the maximum number of queries per day and the maximum number of Cube Store Workers.

We strongly advise against using a Development deployment in a production environment: it is intended only for testing and learning about Cube and will not deliver a production-level experience for your users. You can try a Development deployment for free by signing up for Cube (no credit card required).

Production

Available on all paid plans. You can also choose a deployment tier.

Production deployments are designed to support high-availability production workloads. A Production deployment consists of several key components, starting with 2 Cube API instances, 1 Cube Refresh Worker, and 2 Cube Store Routers, all of which run on dedicated infrastructure. The deployment can automatically scale to meet the needs of your workload by adding more components as necessary; check the page on scalability to learn more.

Multi-cluster

Multi-cluster deployments are designed for demanding production workloads, high-scalability, high-availability, and large multi-tenancy configurations, e.g., with more than 100 tenants.

Available on Premium and above plans.

A Multi-cluster deployment provides you with two options:

Scale the number of production deployments serving your workload, which allows requests to be routed across up to 10 production deployments and up to 100 API instances.
Optionally, scale the number of Cube Store routers, allowing for increased Cube Store querying performance.

High-level architecture diagram of a Multi-cluster deployment

Each production deployment is billed separately, and all production deployments can use auto-scaling to match demand.

Configuring Multi-cluster

To switch your deployment to Multi-cluster, navigate to Settings → General, select Multi-cluster under Type, and confirm with ✓.

To set the number of production deployments within your Multi-cluster deployment, navigate to Settings → Configuration and edit Number of clusters.

Routing traffic between production deployments

Cube routes requests between multiple production deployments within a Multi-cluster deployment based on context_to_app_id. In most cases, it should return an identifier that does not change over time for each tenant. The following implementation makes sure that all requests from a particular tenant are always routed to the same production deployment. This approach ensures that only one production deployment keeps the compiled data model cache for each tenant and serves its requests, which reduces the footprint of the compiled data model cache on individual production deployments.

from cube import config

@config('context_to_app_id')
def context_to_app_id(ctx: dict) -> str:
  # Derive the app ID only from the tenant, so it stays the same across requests.
  return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"

If your implementation of context_to_app_id returns identifiers that change over time for each tenant, requests from one tenant will likely hit multiple production deployments, and you will not get the benefit of a reduced memory footprint. You might also see 502 or timeout errors if different deployment nodes return different context_to_app_id results for the same request.
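The difference between a stable and an unstable identifier can be checked outside Cube with plain Python. The sketch below is illustrative only: it omits the `@config` decorator so that it runs without the `cube` package, and the `request_id` field, the tenant names, and the helper `distinct_ids_per_tenant` are hypothetical and not part of Cube's API. It counts how many distinct app IDs each tenant produces over several simulated requests.

```python
from collections import defaultdict


def stable_app_id(ctx: dict) -> str:
    """Return the same identifier for every request from a given tenant."""
    return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"


def unstable_app_id(ctx: dict) -> str:
    """Anti-pattern: the identifier also depends on a per-request value."""
    security_context = ctx["securityContext"]
    # 'request_id' is a hypothetical field used only to simulate a changing value.
    return f"CUBE_APP_{security_context['tenant_id']}_{security_context['request_id']}"


def distinct_ids_per_tenant(app_id_fn, contexts: list) -> dict:
    """Count how many different app IDs each tenant produces."""
    ids = defaultdict(set)
    for ctx in contexts:
        tenant = ctx["securityContext"]["tenant_id"]
        ids[tenant].add(app_id_fn(ctx))
    return {tenant: len(values) for tenant, values in ids.items()}


# Three simulated requests for each of two hypothetical tenants.
contexts = [
    {"securityContext": {"tenant_id": tenant, "request_id": n}}
    for tenant in ("acme", "globex")
    for n in range(3)
]

stable_counts = distinct_ids_per_tenant(stable_app_id, contexts)
unstable_counts = distinct_ids_per_tenant(unstable_app_id, contexts)
```

With the stable function, each tenant maps to exactly one app ID. With the unstable one, every request produces a new ID, which is the situation the paragraph above warns about.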

Switching between deployment types

To switch a deployment’s type, go to the deployment’s Settings screen and select from the available options:

Deployment Settings page showing Development, Production, and Multi-cluster options
