Adding Bare Metal Nodes to our Hardware Inventory

As you already know, when using Hosted Control Planes the control plane components run as pods on the management cluster, while the data plane runs on dedicated nodes. In this lab, we are using bare metal nodes as the backing infrastructure for our data plane.

This section relies on the Assisted Service and the Baremetal Operator; both topics are outside the scope of this lab. You can follow the links to learn more about them.

The Assisted Service allows us to boot our hardware with a discovery ISO that introspects each node and adds it to our hardware inventory. Later on, when we create our first Hosted Cluster, hardware from this inventory will be used to provision the required data plane nodes.

The object used to generate a discovery ISO is called InfraEnv. On top of that, we will create a BareMetalHost object for each node so that it gets booted with this ISO.

The steps below make use of the CLI; you can also use the Web Console to achieve the same result. You can create a new InfraEnv from MCE under Infrastructure → Host inventory → Create infrastructure environment. Once created, you can add BMHs by clicking Add hosts → With BMC form within the InfraEnv view.

Adding Nodes to our Inventory

  1. Let’s create a namespace to store our hardware inventory.

    In this lab, hardware-inventory is the name we chose for the namespace that holds the hardware inventory. This exact name is not required; you can use any namespace name you want.
    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig create \
        namespace hardware-inventory
    namespace/hardware-inventory created
  2. We need a valid pull secret. Run the following command to copy the management cluster pull secret into our hardware inventory namespace.

    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n openshift-config get \
        secret pull-secret -o yaml | grep -vE "uid|resourceVersion|creationTimestamp|namespace" \
        | sed "s/openshift-config/hardware-inventory/g" | oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig \
        -n hardware-inventory apply -f -
    secret/pull-secret created
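    If you prefer to avoid the grep/sed pipeline above, an equivalent approach is to extract the .dockerconfigjson payload and recreate the secret from it. This is just a sketch: it assumes the pull secret is of type kubernetes.io/dockerconfigjson (the default for the cluster pull secret) and uses /tmp/pull-secret.json as a scratch file of our choosing.
    # Dump the pull secret payload from the management cluster
    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n openshift-config get \
        secret pull-secret -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d > /tmp/pull-secret.json
    # Recreate it in the hardware inventory namespace
    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory create \
        secret generic pull-secret \
        --from-file=.dockerconfigjson=/tmp/pull-secret.json \
        --type=kubernetes.io/dockerconfigjson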
  3. Create the InfraEnv.

    The matching private SSH key is provided in the lab materials. You shouldn't need this key for the lab.
    cat <<EOF | oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig apply -f -
    ---
    apiVersion: agent-install.openshift.io/v1beta1
    kind: InfraEnv
    metadata:
      name: hosted
      namespace: hardware-inventory
    spec:
      additionalNTPSources:
      - 192.168.125.1
      pullSecretRef:
        name: pull-secret
      sshAuthorizedKey: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC5pFKFLOuxrd9Q/TRu9sRtwGg2PV+kl2MHzBIGUhCcR0LuBJk62XG9tQWPQYTQ3ZUBKb6pRTqPXg+cDu5FmcpTwAKzqgUb6ArnjECxLJzJvWieBJ7k45QzhlZPeiN2Omik5bo7uM/P1YIo5pTUdVk5wJjaMOb7Xkcmbjc7r22xY54cce2Wb7B1QDtLWJkq++eJHSX2GlEjfxSlEvQzTN7m2N5pmoZtaXpLKcbOqtuSQSVKC4XPgb57hgEs/ZZy/LbGGHZyLAW5Tqfk1JCTFGm6Q+oOd3wAOF1SdUxM7frdrN3UOB12u/E6YuAx3fDvoNZvcrCYEpjkfrsjU91oz78aETZV43hOK9NWCOhdX5djA7G35/EMn1ifanVoHG34GwNuzMdkb7KdYQUztvsXIC792E2XzWfginFZha6kORngokZ2DwrzFj3wgvmVyNXyEOqhwi6LmlsYdKxEvUtiYhdISvh2Y9GPrFcJ5DanXe7NVAKXe5CyERjBnxWktqAPBzXJa36FKIlkeVF5G+NWgufC6ZWkDCD98VZDiPP9sSgqZF8bSR4l4/vxxAW4knKIZv11VX77Sa1qZOR9Ml12t5pNGT7wDlSOiDqr5EWsEexga/2s/t9itvfzhcWKt+k66jd8tdws2dw6+8JYJeiBbU63HBjxCX+vCVZASrNBjiXhFw==
    EOF
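    The sshAuthorizedKey above is the lab's public key. If you were adapting this procedure outside the lab, you could generate your own keypair and paste the public half into spec.sshAuthorizedKey instead. A minimal sketch, where the key path is just an example:
    # Generate a dedicated keypair for SSH access to the discovery nodes
    ssh-keygen -t ed25519 -N '' -f ~/hypershift-lab/discovery_key
    # Paste the contents of ~/hypershift-lab/discovery_key.pub into spec.sshAuthorizedKey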
  4. The InfraEnv should have generated an ISO by now; let's check it.

    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory \
        get infraenv hosted
    It can take up to 1 minute for the ISO to be created.
    NAME     ISO CREATED AT
    hosted   2023-06-07T08:51:24Z
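    The download URL for the generated ISO is published in the InfraEnv status. If you ever need to fetch the ISO manually (for example, to boot hardware that is not managed through a BMC), you can retrieve it like this; isoDownloadURL is the status field exposed by the agent-install API:
    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory \
        get infraenv hosted -o jsonpath='{.status.isoDownloadURL}{"\n"}'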
  5. At this point we are ready to boot our bare metal nodes with this ISO. In order to do so, we need to create a few BareMetalHost objects.

    The command below creates the Secrets with the credentials for our BMCs, the different BareMetalHosts, and a Role that allows the HyperShift operator to manage the agents. Note how each BareMetalHost references our InfraEnv through the label infraenvs.agent-install.openshift.io: hosted; this label ensures the nodes are booted with the generated ISO.
    cat <<EOF | oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig apply -f -
    apiVersion: v1
    kind: Secret
    metadata:
      name: hosted-worker0-bmc-secret
      namespace: hardware-inventory
    data:
      password: YWRtaW4=
      username: YWRtaW4=
    type: Opaque
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: hosted-worker1-bmc-secret
      namespace: hardware-inventory
    data:
      password: YWRtaW4=
      username: YWRtaW4=
    type: Opaque
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: hosted-worker2-bmc-secret
      namespace: hardware-inventory
    data:
      password: YWRtaW4=
      username: YWRtaW4=
    type: Opaque
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: hosted-worker0
      namespace: hardware-inventory
      labels:
        infraenvs.agent-install.openshift.io: hosted
      annotations:
        inspect.metal3.io: disabled
        bmac.agent-install.openshift.io/hostname: hosted-worker0
    spec:
      automatedCleaningMode: metadata
      bmc:
        disableCertificateVerification: True
        address: redfish-virtualmedia://192.168.125.1:9000/redfish/v1/Systems/local/hosted-worker0
        credentialsName: hosted-worker0-bmc-secret
      bootMACAddress: aa:aa:aa:aa:02:01
      online: true
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: hosted-worker1
      namespace: hardware-inventory
      labels:
        infraenvs.agent-install.openshift.io: hosted
      annotations:
        inspect.metal3.io: disabled
        bmac.agent-install.openshift.io/hostname: hosted-worker1
    spec:
      automatedCleaningMode: metadata
      bmc:
        disableCertificateVerification: True
        address: redfish-virtualmedia://192.168.125.1:9000/redfish/v1/Systems/local/hosted-worker1
        credentialsName: hosted-worker1-bmc-secret
      bootMACAddress: aa:aa:aa:aa:02:02
      online: true
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: hosted-worker2
      namespace: hardware-inventory
      labels:
        infraenvs.agent-install.openshift.io: hosted
      annotations:
        inspect.metal3.io: disabled
        bmac.agent-install.openshift.io/hostname: hosted-worker2
    spec:
      automatedCleaningMode: metadata
      bmc:
        disableCertificateVerification: True
        address: redfish-virtualmedia://192.168.125.1:9000/redfish/v1/Systems/local/hosted-worker2
        credentialsName: hosted-worker2-bmc-secret
      bootMACAddress: aa:aa:aa:aa:02:03
      online: true
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: capi-provider-role
      namespace: hardware-inventory
    rules:
    - apiGroups:
      - agent-install.openshift.io
      resources:
      - agents
      verbs:
      - '*'
    EOF
    secret/hosted-worker0-bmc-secret created
    secret/hosted-worker1-bmc-secret created
    secret/hosted-worker2-bmc-secret created
    baremetalhost.metal3.io/hosted-worker0 created
    baremetalhost.metal3.io/hosted-worker1 created
    baremetalhost.metal3.io/hosted-worker2 created
    role.rbac.authorization.k8s.io/capi-provider-role created
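    For reference, the YWRtaW4= values in the secrets above are simply the base64 encoding of admin, which is what the lab BMCs expect. If your BMCs used different credentials, you would encode them the same way:
    echo -n 'admin' | base64
    YWRtaW4=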
  6. After a few moments, the BareMetalHosts should have moved to the Provisioned state.

    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory get bmh
    It can take up to 2 minutes for the BMHs to move to the provisioned state. You may occasionally see one of the BMHs enter the Provisioning error state; it should recover automatically.
    NAME             STATE         CONSUMER   ONLINE   ERROR   AGE
    hosted-worker0   provisioned              true             91s
    hosted-worker1   provisioned              true             90s
    hosted-worker2   provisioned              true             90s
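    If one of the hosts gets stuck, the BareMetalHost status carries the operational status and the last error message, which is usually enough to tell what went wrong. A quick sketch using the standard metal3 status fields:
    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory \
        get bmh hosted-worker0 -o jsonpath='{.status.operationalStatus}{" "}{.status.errorMessage}{"\n"}'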
  7. At this point the bare metal nodes are being booted with the discovery ISO, and once they register we should see some Agents created.

    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory get agent
    It can take up to 10 minutes for the nodes to boot and show up as agents; keep re-running the command from time to time.
    NAME                                   CLUSTER   APPROVED   ROLE          STAGE
    aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaa0201             true       auto-assign
    aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaa0202             true       auto-assign
    aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaa0203             true       auto-assign
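    Each Agent stores the hardware details discovered during introspection in its status. As a quick way to list the agents together with their hostname, CPU count, and memory, something like the following should work; the inventory field paths are assumptions based on the agent-install API, so adjust them if your version differs:
    oc --kubeconfig ~/hypershift-lab/mgmt-kubeconfig -n hardware-inventory get agent \
        -o custom-columns=NAME:.metadata.name,HOSTNAME:.status.inventory.hostname,CORES:.status.inventory.cpu.count,RAM_BYTES:.status.inventory.memory.physicalBytes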

At this point our bare metal nodes are ready to be used as worker nodes by Hosted Control Planes. In the next section, we will create a Hosted Cluster.