OCP Upgrade Process Flow - Continued
Step 9: Upgrade OLM Operators
Check to see which operators need to be upgraded:
$ oc get installplan -A | egrep 'APPROVED|false'
NAMESPACE NAME CSV APPROVAL APPROVED
metallb-system install-nwjnh metallb-operator.v4.13.0-202311031531 Manual false
openshift-nmstate install-5r7wr kubernetes-nmstate-operator.4.13.0-202311021930 Manual false
Then patch the installplans for those operators:
$ oc patch installplan -n metallb-system install-nwjnh --type merge --patch \
'{"spec":{"approved":true}}'
installplan.operators.coreos.com/install-nwjnh patched
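When several operators are waiting for approval, patching each install plan by hand gets repetitive. The following is a minimal sketch, not part of the original procedure, that assumes the five-column table layout shown above (NAMESPACE, NAME, CSV, APPROVAL, APPROVED). The function names `pending_installplans` and `approve_pending_installplans` are hypothetical helpers introduced here for illustration.

```shell
# Print "namespace name" for every install plan whose APPROVED column is false.
# Reads the table produced by `oc get installplan -A` on stdin.
pending_installplans() {
  awk '$5 == "false" { print $1, $2 }'
}

# Approve every pending install plan with the same patch used above.
# Assumption: you really do want to approve all of them; review the list first.
approve_pending_installplans() {
  oc get installplan -A --no-headers | pending_installplans |
  while read -r ns name; do
    oc patch installplan -n "$ns" "$name" --type merge \
      --patch '{"spec":{"approved":true}}' </dev/null
  done
}
```

Running `oc get installplan -A | pending_installplans` on its own only lists the pending plans, so you can review them before calling `approve_pending_installplans`.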
Now monitor the namespace:
Right after the patch:
$ oc get all -n metallb-system
NAME READY STATUS RESTARTS AGE
pod/metallb-operator-controller-manager-69b5f884c-8bp22 0/1 ContainerCreating 0 4s
pod/metallb-operator-controller-manager-77895bdb46-bqjdx 1/1 Running 0 4m1s
pod/metallb-operator-webhook-server-5d9b968896-vnbhk 0/1 ContainerCreating 0 4s
pod/metallb-operator-webhook-server-d76f9c6c8-57r4w 1/1 Running 0 4m1s
…
NAME DESIRED CURRENT READY AGE
replicaset.apps/metallb-operator-controller-manager-69b5f884c 1 1 0 4s
replicaset.apps/metallb-operator-controller-manager-77895bdb46 1 1 1 4m1s
replicaset.apps/metallb-operator-controller-manager-99b76f88 0 0 0 4m40s
replicaset.apps/metallb-operator-webhook-server-5d9b968896 1 1 0 4s
replicaset.apps/metallb-operator-webhook-server-6f7dbfdb88 0 0 0 4m40s
replicaset.apps/metallb-operator-webhook-server-d76f9c6c8 1 1 1 4m1s
Once it is complete it should look like this:
[kni@utility ~]$ oc get all -n metallb-system
NAME READY STATUS RESTARTS AGE
pod/metallb-operator-controller-manager-69b5f884c-8bp22 1/1 Running 0 25s
pod/metallb-operator-webhook-server-5d9b968896-vnbhk 1/1 Running 0 25s
…
NAME DESIRED CURRENT READY AGE
replicaset.apps/metallb-operator-controller-manager-69b5f884c 1 1 1 25s
replicaset.apps/metallb-operator-controller-manager-77895bdb46 0 0 0 4m22s
replicaset.apps/metallb-operator-webhook-server-5d9b968896 1 1 1 25s
replicaset.apps/metallb-operator-webhook-server-d76f9c6c8 0 0 0 4m22s
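Instead of re-running `oc get all` until the old pods are gone, you can let `oc rollout status` block until each deployment has finished rolling out. This is a minimal sketch under the assumption that the operator runs as Deployments in the namespace, as the ReplicaSets above suggest. The function name `wait_for_rollouts` and the 300 second timeout are illustrative choices, not values from this procedure.

```shell
# Wait for every Deployment in the given namespace to finish its rollout.
# $1: namespace (for example metallb-system)
wait_for_rollouts() {
  ns="$1"
  for d in $(oc get deployment -n "$ns" -o name); do
    # Returns non-zero if the rollout does not complete within the timeout.
    oc rollout status -n "$ns" "$d" --timeout=300s || return 1
  done
}
```

For example, `wait_for_rollouts metallb-system` returns once the new controller-manager and webhook-server pods are ready.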
Verify that the operators don’t need to update more than once:
$ oc get installplan -A | egrep 'APPROVED|false'
NAMESPACE NAME CSV APPROVAL APPROVED
| Sometimes you will need to approve an update twice because some operators have interim z-release versions that need to be stepped through. |
If-Then GO TO:
If you are performing Y-stream or Z-stream upgrades, then you can skip to “Un-Pause the worker nodes”.
Step 10: Second Y-stream update
Now we need to upgrade the Y-stream control plane version to the new EUS version.
First, verify that the 4.Y.Z release chosen in Step 1 is still listed as a valid update target:
[cnf@utility ~]$ oc adm upgrade
Cluster version is 4.13.32
Upgradeable=False
Reason: AdminAckRequired
Message: Kubernetes 1.27 and therefore OpenShift 4.14 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6958395 for details and instructions.
Upstream is unset, so the cluster will use an appropriate default.
Channel: eus-4.14 (available channels: candidate-4.13, candidate-4.14, eus-4.14, fast-4.13, fast-4.14, stable-4.13, stable-4.14)
Recommended updates:
VERSION IMAGE
4.13.33 quay.io/openshift-release-dev/ocp-release@sha256:7083519
Additional updates which are not recommended, or where the recommended status is "Unknown", for your cluster configuration are available, to view those re-run the command with --include-not-recommended.
| If you upgrade soon after the initial GA of a new Y-release, you may not see any new Y-releases available when you run the ‘oc adm upgrade’ command. In that case, add the “--include-not-recommended” flag to also list releases that are not recommended. The output will look like the following: |
[cnf@utility ~]$ oc adm upgrade --include-not-recommended
Cluster version is 4.13.32
Upgradeable=False
Reason: AdminAckRequired
Message: Kubernetes 1.27 and therefore OpenShift 4.14 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6958395 for details and instructions.
Upstream is unset, so the cluster will use an appropriate default.
Channel: eus-4.14 (available channels: candidate-4.13, candidate-4.14, eus-4.14, fast-4.13, fast-4.14, stable-4.13, stable-4.14)
Recommended updates:
VERSION IMAGE
4.13.33 quay.io/openshift-release-dev/ocp-release@sha256:7083519fd7
Supported but not recommended updates:
Version: 4.14.12
Image: quay.io/openshift-release-dev/ocp-release@sha256:671bc35e
Recommended: Unknown
Reason: EvaluationFailed
Message: Exposure to AzureRegistryImagePreservation is unknown due to an evaluation failure: invalid PromQL result length must be one, but is 0
In Azure clusters, the in-cluster image registry may fail to preserve images on update. https://issues.redhat.com/browse/IR-461
As the output shows, the flagged risk applies to Azure clusters; no risk is listed for a bare-metal cluster. Therefore, unless you are running on Azure, this particular risk should not prevent you from upgrading.
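If you want to script this check, the text output of `oc adm upgrade` can be scanned for the target version. The sketch below assumes the layout shown above, where version lines follow a "Recommended updates:" heading. The function name `is_recommended` is a hypothetical helper, and the parsing is tied to this text format, which may differ between `oc` versions.

```shell
# Succeed (exit 0) if the target version appears under "Recommended updates:".
# $1: target version, e.g. 4.13.33
# stdin: text output of `oc adm upgrade`
is_recommended() {
  awk -v target="$1" '
    /^Recommended updates:/                      { in_list = 1; next }
    in_list && $1 == "VERSION"                   { next }
    in_list && $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+$/   { if ($1 == target) found = 1; next }
    in_list && NF > 0                            { in_list = 0 }  # next section started
    END { exit (found ? 0 : 1) }
  '
}
```

A possible use is `oc adm upgrade | is_recommended 4.13.33 && echo "recommended"`. A version that is absent from the recommended list may still appear when you re-run with --include-not-recommended.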
Admin Acknowledge
When moving between Y-stream releases, you will need to run a patch command to acknowledge that you are willing to move. The output of the “oc adm upgrade” command shows a URL (https://access.redhat.com/articles/6958395) that gives the specific command to run.
[cnf@utility ~]$ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.13-kube-1.27-api-removals-in-4.14":"true"}}' --type=merge
configmap/admin-acks patched
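To confirm that the acknowledgement was recorded, you can read the ConfigMap back and look for the key. This is a rough sketch that assumes the pretty-printed `-o json` output places each key on its own line. `ack_is_set` is a hypothetical helper name, and the check is a plain text match rather than a real JSON parse.

```shell
# Succeed if the given ack key is present with the value "true".
# $1: ack key, e.g. ack-4.13-kube-1.27-api-removals-in-4.14
# stdin: output of `oc -n openshift-config get cm admin-acks -o json`
ack_is_set() {
  grep -F "\"$1\"" | grep -q '"true"'
}
```

For example: `oc -n openshift-config get cm admin-acks -o json | ack_is_set ack-4.13-kube-1.27-api-removals-in-4.14`. After the acknowledgement is in place, re-running `oc adm upgrade` should eventually stop reporting the AdminAckRequired reason.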
Start Y-stream Control Plane Upgrade
Once you have determined the full new release that you are moving to (from the above commands), you can run the “oc adm upgrade --to=x.y.z” command.
[cnf@utility ~]$ oc adm upgrade --to=4.14.11
Requested update to 4.14.11
| You may be moving to a z-release that (as stated above) has potential issues on platforms other than the one you are running on. Here is an example of the output and how to work with it: |
[cnf@utility ~]$ oc adm upgrade --to=4.14.11
error: the update 4.14.11 is not one of the recommended updates, but is available as a conditional update. To accept the Recommended=Unknown risk and to proceed with update use --allow-not-recommended.
Reason: EvaluationFailed
Message: Exposure to AzureRegistryImagePreservation is unknown due to an evaluation failure: invalid PromQL result length must be one, but is 0
In Azure clusters, the in-cluster image registry may fail to preserve images on update. https://issues.redhat.com/browse/IR-461
[cnf@utility ~]$ oc adm upgrade --to=4.14.11 --allow-not-recommended
warning: with --allow-not-recommended you have accepted the risks with 4.14.11 and bypassed Recommended=Unknown EvaluationFailed: Exposure to AzureRegistryImagePreservation is unknown due to an evaluation failure: invalid PromQL result length must be one, but is 0
In Azure clusters, the in-cluster image registry may fail to preserve images on update. https://issues.redhat.com/browse/IR-461
Requested update to 4.14.11
Step 11: Monitor the upgrade
Monitor the progress of the upgrade by checking the cluster version, the cluster operators, the nodes, and any pods that are not yet running. The output will look similar to this:
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.13.32 True True 9m48s Working towards 4.14.11: 118 of 860 done (13% complete), waiting on kube-apiserver
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.13.32 True False False 3d15h
baremetal 4.13.32 True False False 35d
cloud-controller-manager 4.13.32 True False False 35d
cloud-credential 4.13.32 True False False 35d
cluster-autoscaler 4.13.32 True False False 35d
console 4.13.32 True False False 34d
...
config-operator 4.14.11 True False False 35d
etcd 4.14.11 True False False 35d
kube-apiserver 4.14.11 True True False 35d NodeInstallerProgressing: 1 nodes are at revision 21; 2 nodes are at revision 23
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 35d v1.26.13+77e61a2
ctrl-plane-1 Ready control-plane,master 35d v1.26.13+77e61a2
ctrl-plane-2 Ready control-plane,master 35d v1.26.13+77e61a2
worker-0 Ready mcp-1,worker 35d v1.25.14+a52e8df
worker-1 Ready mcp-2,worker 35d v1.25.14+a52e8df
NAMESPACE NAME READY STATUS RESTARTS AGE
openshift-kube-apiserver kube-apiserver-ctrl-plane-0 0/5 Pending 0 <invalid>
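The command that produced the output above is not shown here. Judging by the column headers, it combines the cluster version, cluster operator, node, and pod listings. The sketch below is one way to reproduce that view. The function name `upgrade_status` and the filter on "running" and "complete" are assumptions, not the original command.

```shell
# Print a combined upgrade status view: cluster version, cluster operators,
# nodes, and any pods whose line does not mention Running or Completed.
upgrade_status() {
  oc get clusterversion
  oc get co
  oc get nodes
  oc get pods -A | grep -E -iv 'running|complete'
}
```

To refresh the view periodically, you could run something like `while true; do clear; upgrade_status; sleep 10; done` and stop it with Ctrl-C once the upgrade completes.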
During the upgrade, the cluster cycles through one or several of the cluster operators at a time and reports the status of each operator upgrade in the “MESSAGE” column of ‘oc get co’.
Once all of the cluster operators are done, each of the control plane nodes is rebooted (one at a time). During this part of the upgrade, many of the cluster operators will report messages that look like they are upgrading again or are in a degraded state. This happens because a control plane node is offline.
As soon as the last control plane node is complete, the cluster version will show as upgraded to the new EUS release.
It should look like this:
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.14.11 True False 39m Cluster version is 4.14.11
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.14.11 True False False 3d17h
baremetal 4.14.11 True False False 36d
cloud-controller-manager 4.14.11 True False False 36d
cloud-credential 4.14.11 True False False 36d
cluster-autoscaler 4.14.11 True False False 36d
config-operator 4.14.11 True False False 36d
console 4.14.11 True False False 35d
...
operator-lifecycle-manager-packageserver 4.14.11 True False False 35d
service-ca 4.14.11 True False False 36d
storage 4.14.11 True False False 36d
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 35d v1.27.10+28ed2d7
ctrl-plane-1 Ready control-plane,master 36d v1.27.10+28ed2d7
ctrl-plane-2 Ready control-plane,master 36d v1.27.10+28ed2d7
worker-0 Ready mcp-1,worker 35d v1.25.14+a52e8df
worker-1 Ready mcp-2,worker 35d v1.25.14+a52e8df
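If you would rather wait for completion in a script than watch the tables, the `oc get clusterversion` output can be checked for the target version with PROGRESSING set to False. This is a minimal sketch that assumes the six-column layout shown above. `cluster_version_is` is a hypothetical helper name.

```shell
# Succeed if the cluster version table shows the target version and
# PROGRESSING=False (fourth column).
# $1: target version, e.g. 4.14.11
# stdin: output of `oc get clusterversion`
cluster_version_is() {
  awk -v target="$1" '
    NR > 1 && $2 == target && $4 == "False" { ok = 1 }
    END { exit (ok ? 0 : 1) }
  '
}
```

One possible use is `until oc get clusterversion | cluster_version_is 4.14.11; do sleep 60; done`. While the upgrade is still running, the VERSION column keeps showing the old release, so the check fails until the control plane upgrade has finished.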
Step 12: Upgrade All of the OLM Operators
This time, in addition to approving all of the operators as before, we also need to add install plans for any other operators that we want to upgrade. The list of operators and the specific changes to them are not in scope for this revision but will be added soon.
Follow the same steps as before, in Step 9. Then check with your Operator vendors or the Operator pages to see whether any other operators need to be updated.