Overrides

The Stackable operators configure the Stacklets they are operating with sensible defaults and required settings to enable connectivity and security. Other important settings are usually exposed in the Stacklet resource definition for you to configure directly. In some cases however, you might want to configure certain settings that are not exposed or are set by the operator.

The resource definitions of all Stacklets support overrides, specifically for the product configuration, environment variables, and the PodSpec the operators generate.

Overriding certain configuration properties can lead to faulty clusters. Overrides should only be used as a last resort!

The Stacklet definitions support overriding configuration aspects either per role or role group, where the more specific override (role group) has precedence over the less specific one (role).

Config overrides

For a role or role group, at the same level of config, you can specify configOverrides for any of the configuration files the product uses.

An example for an HDFS cluster looks as follows:

apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs
spec:
  nameNodes:
    config:
      ...
    configOverrides: (1)
      core-site.xml: (2)
        fs.trash.interval: "5" (3)
    roleGroups:
      default:
        config:
          ...
        configOverrides: (4)
          hdfs-site.xml:
            dfs.namenode.num.checkpoints.retained: "3"
        replicas: 1
  ...
1 configOverride on role level.
2 The name of the configuration file that the override should be put into. You can find the available files as keys in the ConfigMaps created by the operator.
3 All keys must be strings, even if they are integers like in the example here.
4 configOverride on role group level, takes precedence over overrides at role level.

The roles, as well as the configuration file and configuration settings available depend on the specific product. All override property values must be strings. The properties will be formatted and escaped correctly into the file format used by the product, which is again product specific. This can be a .properities file, XML, YAML or JSON.

You can also set the property to an empty string (my.property: ""), which effectively disables the property the operator would write out normally. In case of a .properties file, this will show up as my.property= in the .properties file.

Environment variable overrides

For a role or role group, at the same level of config, you can specify envOverrides for any environment variable.

An example for an HDFS cluster looks as follows:

apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs
spec:
  nameNodes:
    config:
      ...
    envOverrides: (1)
      MY_ENV_VAR: "MY_VALUE"
    roleGroups:
      default:
        config:
          ...
        envOverrides: (2)
          MY_ENV_VAR: "MY_VALUE"
        replicas: 1
  ...
1 envOverrides at role level.
2 envOverrides on role group level, takes precedence over the overrides specified at role level.

You can set any environment variable, but every Stacklet supports a different set of environment variables, these are documented in the product documentation. All override property values must be strings.

Pod overrides

For a role or role group, at the same level of config, you can specify podOverrides for any of the attributes you can configure on a Pod. Every Stacklet contains one or more StatefulSets, DaemonSets, or Deployments, which in turn contain a Pod template that is used by Kubernetes to create the Pods that make up the Stacklet. The podOverrides allow you to specify a fragment of a Pod template that is then overlayed over the one created by the operator.

An example for an HDFS cluster looks as follows:

apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs
spec:
  nameNodes:
    config:
      ...
    podOverrides: (1)
      spec:
        tolerations:
          - key: "key1"
            operator: "Equal"
            value: "value1"
            effect: "NoSchedule"
    roleGroups:
      default:
        config:
          ...
        podOverrides: (2)
          metadata:
            labels:
              my-custom-label: super-important-label
        replicas: 1
  ...
1 podOverrides at role level.
2 podOverrides on role group level, takes precedence over the overrides specified at role level.

The podOverrides can be any valid PodTemplateSpec (which means every property that you can set on a regular Kubernetes Pod).

The priority of how to construct the final Pod submitted to Kubernetes looks as follows (low to high):

  1. PodTemplateSpec calculated by operator

  2. PodTemplateSpec given in role level podOverrides

  3. PodTemplateSpec given in rolegroup level podOverrides

Each of these are combined top to bottom using a deep merge. The exact merge algorithm is described in the k8s-openapi docs, which basically tries to mimic the way Kubernetes merges patches onto objects.

The podOverrides will be merged onto the following resources the operators deploy:

  • StatefulSets containing the products (most of the products)

  • DaemonSets containing the products (currently only OPA)

  • Deployments containing the products (currently no product, but there might be Deployments in the future)

They will not be applied to:

  • Jobs, that are used to setup systems the product depends on e.g. create a database schema for Superset or Airflow.

Object overrides

Sometime you need to override Kubernetes objects other than the generated Pods, e.g. ServiceAccounts or the StatefulSet/Deployment/DaemonSet. Object overrides allow you to override any Kubernetes object the operator creates are part of it’s reconciliation loop, which basically means all objects that are created for a given stacklet.

On every Stackable CRD that get’s reconciled into a set of Kubernetes objects we offer a field .spec.objectOverrides. You can set it to a list of arbitrary YAML objects, which need to be valid Kubernetes objects.

For every object it creates, the operator will walk the list of overrides from top to bottom. It will than first check if the override matches the object being created using apiVersion, kind, name and namespace (if set - doesn’t need to be the case for cluster scoped objects). If the override matches, it will be merged in the same way Pod overrides are merged, using the merge algorithm described in the k8s-openapi docs, which basically tries to mimic the way Kubernetes merges patches onto objects.

A consequence of this is, that you can only modify objects the operator creates, and not deploy any additional arbitrary objects.

apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperCluster
metadata:
  name: simple-zk
spec:
  # ...
  objectOverrides:
    - apiVersion: apps/v1
      kind: StatefulSet
      metadata:
        name: simple-zk-server-default
        namespace: default
        labels:
          custom: label
      spec:
        replicas: 2
        podManagementPolicy: Parallel
    - apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: simple-zk-serviceaccount
        namespace: default
        labels:
          im-on: AWS
        annotations:
          custom: AWS
    - apiVersion: policy/v1
      kind: PodDisruptionBudget
      metadata:
        name: simple-zk-server
        namespace: default
      spec:
        maxUnavailable: 42

JVM argument overrides

You can configure the JVM arguments used by JVM based tools. This is often needed to e.g. configure a HTTP proxy or other network settings.

As with other overrides, the operator generates a set of JVM arguments that are needed to run the tool. You can specify additional arguments which are merged on top of the ones the operator generated. As some JVM arguments are mutually exclusive (think of -Xmx123m and -Xmx456m), you also have the option to remove JVM arguments - either by specifying the exact argument or a regex.

The merging mechanism is applied <operator generated> ← <role user specified> ← <rolegroup user specified> and works as follows:

  1. All arguments listed in user specified remove are removed from operator generated

  2. All arguments matching any regex from user removeRegex are removed from operator generated. The regex needs to match the entire argument, not only a substring

  3. All arguments from user specified add are added to operator

You can check the resulting effective JVM arguments by looking at the ConfigMap containing the config for the roleGroup (although some tools read the JVM arguments from environmental variables).

Simple example

One simple usage of this functionality is to add some JVM arguments, in this case needed for a special network setup:

kind: NifiCluster
spec:
  # ...
  nodes:
    jvmArgumentOverrides:
      add: # Add some networking arguments
        - -Dhttps.proxyHost=proxy.my.corp
        - -Dhttps.proxyPort=8080
        - -Djava.net.preferIPv4Stack=true

Advanced example

The following more advanced setups shows how the garbage collector can be changed, the JVM memory configs can be changed and how roleGroups can override roles.

kind: NifiCluster
spec:
  # ...
  nodes:
    config:
      resources:
        memory:
          limit: 42Gi # We define some memory config, so that we can override it further down
    jvmArgumentOverrides:
      remove:
        - -XX:+UseG1GC # Remove argument generated by operator
      add: # Add some networking arguments
        - -Dhttps.proxyHost=proxy.my.corp
        - -Dhttps.proxyPort=8080
        - -Djava.net.preferIPv4Stack=true
    roleGroups:
      default:
        replicas: 1
        jvmArgumentOverrides:
          # We need more memory!
          removeRegex: # They need to match the entire string, not only a part!
            - -Xmx.* # So this will match "-Xmx123m", but not "-foo-Xmx123m"
            - -Dhttps.proxyPort=.* # Remove arguments from the role, so that we can override it
          add: # After we removed some arguments we can add the correct ones
            - -Xmx40000m
            - -Dhttps.proxyPort=1234 # Override arguments from the role