What Are Compute Classes?

A GKE ComputeClass is a Kubernetes custom resource (defined by a CRD) that describes a set of node attributes and autoscaling settings. Compute classes provide a declarative way to configure infrastructure options for workloads, so that GKE cluster autoscaling can create nodes with specific characteristics based on workload requirements.

A ComputeClass is essentially a Kubernetes API object that specifies:

  • Priority-based node pool selection: Define multiple node pools in order of preference
  • Fallback behavior: When preferred resources are unavailable, GKE falls back to the next priority
  • Active migration: Automatically migrate workloads to higher-priority nodes when they become available

How Compute Classes Work

When a Pod selects a ComputeClass (via nodeSelector with cloud.google.com/compute-class), GKE:

  1. Looks at the ComputeClass's priority list
  2. Attempts to schedule the Pod on the highest-priority node pool
  3. Falls back to lower-priority pools if the preferred ones are unavailable
  4. With activeMigration.optimizeRulePriority: true, GKE will eventually migrate Pods to higher-priority nodes when they become available
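The fallback order in steps 1 to 3 can be modelled in a few lines of Python. This is only an illustrative sketch, not GKE's actual algorithm: the function name pick_priority and the has_capacity set are hypothetical, and the node pool names are taken from the citadel-default-cc definition in this document.

```python
from typing import Optional

# Priority list as it appears in the citadel-default-cc ComputeClass.
PRIORITIES = [
    ["s-t2d-32-cc-01-v1", "s-t2d-32-cc-01-cidr2-v1"],  # t2d spot
    ["s-t2d-32-ondemand-cc-01-v1"],                     # t2d on-demand
    ["s-n2d-32-cc-01-v1"],                              # n2d spot
]


def pick_priority(priorities, has_capacity) -> Optional[int]:
    """Return the index of the first priority rule with a usable node pool.

    has_capacity is a hypothetical set of node pool names that can
    currently provide a node. None means every rule is exhausted.
    """
    for index, pools in enumerate(priorities):
        if any(pool in has_capacity for pool in pools):
            return index
    return None


# Spot T2D capacity is gone, so the on-demand rule (index 1) is chosen.
chosen = pick_priority(
    PRIORITIES, {"s-t2d-32-ondemand-cc-01-v1", "s-n2d-32-cc-01-v1"}
)
```

A rule counts as usable as soon as any one of its node pools has capacity, which is why the first rule lists two pools.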

Implementation in Your citadel-dev Cluster

Your cluster citadel-2g-dev-tokyo-01 has three custom ComputeClasses defined:

1. citadel-default-cc (Default/Shared Workloads)

ComputeClass Definition (<ref_snippet file="/home/ubuntu/repos/microservices-kubernetes/manifests/microservices-platform/mercari-citadel-jp/development/citadel-2g-dev-tokyo-01/ComputeClass/citadel-default-cc.yaml" lines="1-14" />):

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: citadel-default-cc
spec:
  activeMigration:
    optimizeRulePriority: true
  priorities:
    - nodepools: ["s-t2d-32-cc-01-v1", "s-t2d-32-cc-01-cidr2-v1"]  # t2d spot (cheapest)
    - nodepools: ["s-t2d-32-ondemand-cc-01-v1"]                     # t2d on-demand (fallback)
    - nodepools: ["s-n2d-32-cc-01-v1"]                              # n2d spot (last resort)

Associated Node Pools (from <ref_snippet file="/home/ubuntu/repos/microservices-terraform/terraform/microservices-platform/development/cluster-citadel-2g/regions/tokyo/cluster/terragrunt.hcl" lines="564-608" />):

Node Pool                    Machine Type     Availability  Max Nodes
s-t2d-32-cc-01-v1            t2d-standard-32  Spot          200
s-t2d-32-cc-01-cidr2-v1      t2d-standard-32  Spot          200
s-t2d-32-ondemand-cc-01-v1   t2d-standard-32  On-demand     200
s-n2d-32-cc-01-v1            n2d-standard-32  Spot          200

Each node pool has:

  • Label: cloud.google.com/compute-class: citadel-default-cc
  • Taint: cloud.google.com/compute-class=citadel-default-cc:NoSchedule

2. citadel-mercari-api-cc (Mercari API Workloads)

ComputeClass Definition (<ref_snippet file="/home/ubuntu/repos/microservices-kubernetes/manifests/microservices-platform/mercari-citadel-jp/development/citadel-2g-dev-tokyo-01/ComputeClass/citadel-mercari-api-cc.yaml" lines="1-14" />):

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: citadel-mercari-api-cc
spec:
  activeMigration:
    optimizeRulePriority: true
  priorities:
    - nodepools: ["d-mercari-api-t2d-16-cc-02", "d-mercari-api-t2d-16-cc-02-cidr2"]  # t2d-16 spot
    - nodepools: ["d-mercari-api-t2d-32-ondemand-cc-01"]                              # t2d-32 on-demand
    - nodepools: ["d-mercari-api-n2d-32-cc-01"]                                       # n2d-32 spot

Associated Node Pools (from <ref_snippet file="/home/ubuntu/repos/microservices-terraform/terraform/microservices-platform/development/cluster-citadel-2g/regions/tokyo/cluster/terragrunt.hcl" lines="1751-1979" />):

Node Pool                             Machine Type     Availability  Max Nodes  Special Features
d-mercari-api-t2d-16-cc-02            t2d-standard-16  Spot          51         DB access tags
d-mercari-api-t2d-16-cc-02-cidr2      t2d-standard-16  Spot          51         Secondary CIDR
d-mercari-api-t2d-32-ondemand-cc-01   t2d-standard-32  On-demand     51         DB access tags
d-mercari-api-n2d-32-cc-01            n2d-standard-32  Spot          51         DB access tags

These dedicated node pools have additional taints for node-pool-id=d-mercari-api and machine-series=t2d.


3. citadel-mercari-eaas-cc (Elasticsearch-as-a-Service Workloads)

ComputeClass Definition (<ref_snippet file="/home/ubuntu/repos/microservices-kubernetes/manifests/microservices-platform/mercari-citadel-jp/development/citadel-2g-dev-tokyo-01/ComputeClass/citadel-mercari-eaas-cc.yaml" lines="1-14" />):

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: citadel-mercari-eaas-cc
spec:
  activeMigration:
    optimizeRulePriority: true
  priorities:
    - nodepools: ["d-mercari-eaas-t2d-32-cc-01"]           # t2d-32 spot
    - nodepools: ["d-mercari-eaas-t2d-32-ondemand-cc-01"]  # t2d-32 on-demand
    - nodepools: ["d-mercari-eaas-n2d-32-cc-01"]           # n2d-64 spot

Associated Node Pools (from <ref_snippet file="/home/ubuntu/repos/microservices-terraform/terraform/microservices-platform/development/cluster-citadel-2g/regions/tokyo/cluster/terragrunt.hcl" lines="1981-2116" />):

Node Pool                              Machine Type     Availability  Max Nodes  Zones
d-mercari-eaas-t2d-32-cc-01            t2d-standard-32  Spot          100        Multi-zone (a,b,c)
d-mercari-eaas-t2d-32-ondemand-cc-01   t2d-standard-32  On-demand     50         Multi-zone (a,b,c)
d-mercari-eaas-n2d-32-cc-01            n2d-standard-64  Spot          50         Multi-zone (a,b,c)

How Workloads Use Compute Classes

Method 1: Explicit Node Selector

Workloads can explicitly select a compute class in their Pod spec:

spec:
  nodeSelector:
    cloud.google.com/compute-class: citadel-mercari-eaas-cc
  tolerations:
    - key: "cloud.google.com/compute-class"
      operator: "Equal"
      value: "citadel-mercari-eaas-cc"
      effect: "NoSchedule"

Example: Elasticsearch workloads (<ref_snippet file="/home/ubuntu/repos/microservices-kubernetes/.hydrated/microservices/mercari-eaas-jp/development/citadel-2g-dev-tokyo-01/elasticsearch-product-search/manifest.yaml" lines="594-602" />)
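To see why the Pod spec needs both the nodeSelector and the toleration, here is a simplified Python model of the two checks. It is a sketch under stated assumptions, not the Kubernetes scheduler: the helper names are hypothetical, only NoSchedule and NoExecute taints are treated as blocking, and the label and taint on the eaas node are assumed to mirror the ones this document lists for the citadel-default-cc pools.

```python
CC_KEY = "cloud.google.com/compute-class"
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


def tolerates(toleration, taint):
    """Simplified match of one toleration against one taint."""
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key")
    if toleration.get("operator", "Equal") == "Exists":
        return not key or key == taint["key"]
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_spec, node_labels, node_taints):
    """True if the nodeSelector matches and every blocking taint is tolerated."""
    selector = pod_spec.get("nodeSelector") or {}
    if any(node_labels.get(k) != v for k, v in selector.items()):
        return False
    tolerations = pod_spec.get("tolerations") or []
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in node_taints
        if taint["effect"] in BLOCKING_EFFECTS
    )


# Pod spec from Method 1 above.
EAAS_POD = {
    "nodeSelector": {CC_KEY: "citadel-mercari-eaas-cc"},
    "tolerations": [{"key": CC_KEY, "operator": "Equal",
                     "value": "citadel-mercari-eaas-cc", "effect": "NoSchedule"}],
}
# Label and taint documented for the citadel-default-cc pools.
DEFAULT_LABELS = {CC_KEY: "citadel-default-cc"}
DEFAULT_TAINTS = [{"key": CC_KEY, "value": "citadel-default-cc", "effect": "NoSchedule"}]
# Assumption: eaas pools carry the analogous label and taint.
EAAS_LABELS = {CC_KEY: "citadel-mercari-eaas-cc"}
EAAS_TAINTS = [{"key": CC_KEY, "value": "citadel-mercari-eaas-cc", "effect": "NoSchedule"}]
```

In this model the selector keeps the Pod off nodes of other compute classes, while the toleration is what lets it onto the tainted eaas nodes at all; a Pod with neither fails the taint check.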


Method 2: Automatic Assignment via KubeMod

Your cluster uses KubeMod ModRules to automatically inject compute class selectors and tolerations into Pods that don't explicitly specify them. This is a key part of the implementation.

ModRules for citadel-default-cc (<ref_file file="/home/ubuntu/repos/microservices-kubernetes/manifests/microservices-platform/kubemod-system/development/citadel-2g-dev-tokyo-01/ModRule" />):

ModRule                                          Condition                                                       Action
citadel-default-cc-no-affinity                   Pod has no affinity AND no nodeSelector                         Add citadel-default-cc selector + toleration
citadel-default-cc-no-node-affinity              Pod has affinity but no nodeAffinity AND no nodeSelector        Add citadel-default-cc selector + toleration
citadel-default-cc-no-required-node-affinity     Pod has nodeAffinity but no required rules AND no nodeSelector  Add citadel-default-cc selector + toleration
citadel-default-cc-has-required-butler-affinity  Pod has only availability affinity (spot/ondemand)              Add citadel-default-cc selector + toleration
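The first three conditions in the table can be read as a decision procedure over the Pod spec. The Python below is a hypothetical model of that logic only; it does not reproduce KubeMod's ModRule syntax, the function names are made up for illustration, and the fourth rule is left out because the affinity key it matches on is not shown in this document. The injected toleration is assumed to mirror the taint listed for the citadel-default-cc pools.

```python
import copy
from typing import Optional

CC_KEY = "cloud.google.com/compute-class"
DEFAULT_CC = "citadel-default-cc"
REQUIRED = "requiredDuringSchedulingIgnoredDuringExecution"


def matching_default_rule(pod_spec: dict) -> Optional[str]:
    """Return the name of the first default-cc rule whose condition holds."""
    if pod_spec.get("nodeSelector"):
        return None  # all three modelled rules require "no nodeSelector"
    affinity = pod_spec.get("affinity")
    if not affinity:
        return "citadel-default-cc-no-affinity"
    node_affinity = affinity.get("nodeAffinity")
    if not node_affinity:
        return "citadel-default-cc-no-node-affinity"
    if not node_affinity.get(REQUIRED):
        return "citadel-default-cc-no-required-node-affinity"
    return None  # required node affinity present: fourth rule not modelled


def inject_default_cc(pod_spec: dict) -> dict:
    """Return a copy with the selector and toleration added when a rule matches."""
    if matching_default_rule(pod_spec) is None:
        return pod_spec
    patched = copy.deepcopy(pod_spec)
    patched["nodeSelector"] = {**(patched.get("nodeSelector") or {}), CC_KEY: DEFAULT_CC}
    patched["tolerations"] = list(patched.get("tolerations") or []) + [
        {"key": CC_KEY, "operator": "Equal", "value": DEFAULT_CC, "effect": "NoSchedule"}
    ]
    return patched


patched = inject_default_cc({"containers": []})
```

The three conditions are mutually exclusive, so at most one rule fires for a given Pod in this model.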

ModRules for citadel-mercari-api-cc:

ModRule                                  Condition                               Action
citadel-mercari-api-cc-has-nodeaffinity  Pod has nodeAffinity for d-mercari-api  Add citadel-mercari-api-cc selector + toleration
citadel-mercari-api-cc-has-nodeselector  Pod has nodeSelector for d-mercari-api  Add citadel-mercari-api-cc selector + toleration

Cost Optimization Strategy

The compute class priority system implements a cost optimization strategy:

  1. First Priority (Cheapest): Spot VMs on the T2D machine series
  2. Second Priority (Fallback): On-demand T2D VMs, used when T2D Spot capacity is unavailable
  3. Third Priority (Last Resort): Spot VMs on the N2D machine series, which draw on a different availability pool

With activeMigration.optimizeRulePriority: true, when Spot VMs become available again, GKE will automatically migrate workloads back to the cheaper option.
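The migration decision can be pictured as a check for any usable rule that ranks above the one a workload currently runs on. The sketch below is a hypothetical Python model, not GKE's implementation: migration_target and has_capacity are illustrative names, and timing and disruption handling are not modelled.

```python
from typing import Optional

# Priority list from the citadel-default-cc ComputeClass in this document.
PRIORITIES = [
    ["s-t2d-32-cc-01-v1", "s-t2d-32-cc-01-cidr2-v1"],  # t2d spot
    ["s-t2d-32-ondemand-cc-01-v1"],                     # t2d on-demand
    ["s-n2d-32-cc-01-v1"],                              # n2d spot
]


def migration_target(current: int, priorities, has_capacity) -> Optional[int]:
    """Return the best higher-priority rule index to move to, or None to stay.

    current is the index of the rule the workload runs on now;
    has_capacity is a hypothetical set of pools that can provide nodes.
    """
    for index in range(current):  # only rules ranked above the current one
        if any(pool in has_capacity for pool in priorities[index]):
            return index
    return None


# Running on on-demand (index 1) while Spot T2D capacity has returned.
target = migration_target(1, PRIORITIES, {"s-t2d-32-cc-01-v1"})
```

A workload already on the first rule never gets a target, since no rule ranks above it.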


Architecture Summary

                         ┌─────────────────────────────────────────┐
                         │         citadel-2g-dev-tokyo-01         │
                         └─────────────────────────────────────────┘
                                              │
             ┌────────────────────────────────┼────────────────────────────────┐
             │                                │                                │
             ▼                                ▼                                ▼
┌─────────────────────────┐      ┌─────────────────────────┐      ┌─────────────────────────┐
│   citadel-default-cc    │      │ citadel-mercari-api-cc  │      │ citadel-mercari-eaas-cc │
│     (Shared Pools)      │      │     (Dedicated API)     │      │     (Elasticsearch)     │
└─────────────────────────┘      └─────────────────────────┘      └─────────────────────────┘
             │                                │                                │
             ▼                                ▼                                ▼
┌─────────────────────────┐      ┌─────────────────────────┐      ┌─────────────────────────┐
│ Priority 1: t2d Spot    │      │ Priority 1: t2d-16      │      │ Priority 1: t2d-32      │
│ Priority 2: t2d OD      │      │ Priority 2: t2d-32      │      │ Priority 2: t2d-32      │
│ Priority 3: n2d Spot    │      │ Priority 3: n2d-32      │      │ Priority 3: n2d-64      │
└─────────────────────────┘      └─────────────────────────┘      └─────────────────────────┘

Key Files Reference

Component                           Location
Node Pool Definitions               <ref_file file="/home/ubuntu/repos/microservices-terraform/terraform/microservices-platform/development/cluster-citadel-2g/regions/tokyo/cluster/terragrunt.hcl" />
ComputeClass CRDs                   <ref_file file="/home/ubuntu/repos/microservices-kubernetes/manifests/microservices-platform/mercari-citadel-jp/development/citadel-2g-dev-tokyo-01/ComputeClass" />
KubeMod ModRules                    <ref_file file="/home/ubuntu/repos/microservices-kubernetes/manifests/microservices-platform/kubemod-system/development/citadel-2g-dev-tokyo-01/ModRule" />
Elasticsearch Compute Class Config  <ref_snippet file="/home/ubuntu/repos/microservices-kubernetes/kit/pkg/elasticsearch/elasticsearch-spec.cue" lines="59-61" />

The microservices-ci repo does not contain any compute-class-specific configuration; its CI/CD pipelines apply the Terraform and Kubernetes manifests that contain the compute class definitions.