I want to share a quick HA software update on a pair of Cisco 9800-80 controllers in HA. In this scenario we go from version 17.3.4EWS13 to version 17.3.4cEWS1.
It's always hard not to type "upgrade", but it isn't one: this is only software, and no hardware is replaced here. I don't know why people always talk about upgrades when only software is involved.
Maybe you are wondering about the "EWS" releases? Yes, they are only available through TAC and are not publicly available. This EWS13 release came out just before the public 17.3.4c release. EWS releases carry minor changes until they end up in a publicly available release.
Right after that, a new bug was discovered that is still present in release 17.3.4EWS13: CSCvz18383, "SGT Bindings for Fabric Enabled SSIDs are not seen on Fabric Edge Switch". This bug basically makes SGT segmentation unusable. Since we run a fabric-enabled network with SGT segmentation, we need this EWS1 release on top of the 17.3.4c release.
Start the update
First download the image, then go to Administration > Software Management and upload it through HTTP.
As soon as the transfer is complete, the image is copied to both nodes. At that point the log also becomes visible on the right side of the screen.
During this process the image is downloaded to the primary and secondary controller and to all APs. After the image is installed on the standby, the standby reloads and joins back into the HA cluster.
Starting ISSU Install Activate Operation...
install_activate: START Mon Jan 3 18:00:45 CET 2022
install_activate: Activating ISSU
NOTE: Going to start Activate ISSU install process
STAGE 0: System Level Sanity Check
===================================================
--- Verifying install_issu supported ---
--- Verifying standby is in Standby Hot state ---
--- Verifying booted from the valid media ---
--- Verifying AutoBoot mode is enabled ---
--- Verifying Platform specific ISSU admission criteria ---
Finished Initial System Level Sanity Check
STAGE 1: Installing software on Standby
===================================================
--- Starting install_remote ---
[2] install_remote package(s) on chassis 2/R0
[2] Finished install_remote on chassis 2/R0
install_remote: Passed on [2/R0]
Finished install_remote
STAGE 2: Restarting Standby
===================================================
--- Starting standby reload ---
Finished standby reload
--- Starting wait for Standby to reach terminal redundancy state ---
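If you save this log output and want to see at a glance how far the ISSU process got, the stage markers can be pulled out with a few lines of Python. This is only a minimal sketch: the function name parse_stages and the shortened sample text are my own, and the pattern assumes the "STAGE n: title" plus "====" layout shown in the log above.

```python
import re

# Matches "STAGE <n>: <title>" followed by the "=====" rule, whether the rule
# sits on the same line as the title or on the next one.
STAGE_RE = re.compile(r"STAGE (\d+):\s*(.+?)\s*={3,}", re.S)


def parse_stages(log: str) -> list[tuple[int, str]]:
    """Return (stage number, stage title) pairs in the order they appear."""
    return [(int(num), title) for num, title in STAGE_RE.findall(log)]


# Shortened, hypothetical excerpt in the same layout as the log above.
sample = (
    "STAGE 0: System Level Sanity Check ===== "
    "--- Verifying install_issu supported --- "
    "STAGE 1: Installing software on Standby ===== "
    "STAGE 2: Restarting Standby ===== "
    "--- Starting standby reload ---"
)

for number, title in parse_stages(sample):
    print(f"stage {number}: {title}")
```

The last stage printed is the one the controller was working on when the log was captured.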
This takes a while, approximately 20-30 minutes. During that time the pair runs in HA with two different WLC versions.
Standby update done
After the standby controller comes back, it takes over as the active controller. You can see the new version number in the upper left corner.
If you go to Monitor > System (under the General header) and open the Redundancy tab, you can see it is still syncing.
Once it is done syncing, it starts updating the APs in batches of 25%. That sounds like 4 rounds, but because maintaining RF coverage takes precedence over filling each batch, the expected number of iterations for P = 25% is about 6.
The AP update statistics show the progress of the AP update.
The WLC keeps track of which APs have been upgraded and whether they have joined back. If some APs struggle to join, this process can take longer.
Hitless Rolling AP Upgrade Algorithm
Status: Upgraded and not impacted.
The algorithm works in three stages.
1. Candidate AP Set Selection
First, a set of candidates is selected based on nearby AP information. The Rolling AP Upgrade algorithm selects the configured percentage of APs to be upgraded in each iteration while maintaining RF coverage for serving wireless clients. Maintaining coverage is important and hence takes precedence over selecting the required number of APs. Therefore:
For P = 25%, expected number of iterations for all APs to upgrade ~ 6
For P = 15%, expected number of iterations for all APs to upgrade ~ 12
For P = 5%, expected number of iterations for all APs to upgrade ~ 22
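To make the gap between the configured percentage and the real number of rounds visible, here is a small sketch that compares the nominal count (every round upgrades exactly P% of the APs) with the expected counts quoted above. The function name is my own, and the worst-case duration is only a rough figure that assumes the "at most 10 minutes per iteration" limit described in stage 3.

```python
import math

# Expected iteration counts as quoted in the list above, keyed by percentage P.
EXPECTED_ITERATIONS = {25: 6, 15: 12, 5: 22}

# Upper bound per iteration in minutes, taken from the stage 3 description.
MAX_MINUTES_PER_ITERATION = 10


def nominal_iterations(percent: int) -> int:
    """Iterations needed if every round upgraded exactly `percent` of all APs."""
    return math.ceil(100 / percent)


for percent, expected in EXPECTED_ITERATIONS.items():
    nominal = nominal_iterations(percent)
    worst_case = expected * MAX_MINUTES_PER_ITERATION
    print(
        f"P = {percent}%: nominal {nominal} rounds, expected ~{expected}, "
        f"up to ~{worst_case} minutes"
    )
```

For P = 25% the nominal count is 4 rounds, while the expected count is about 6, so plan the maintenance window on the expected numbers rather than on the percentage alone.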
2. Client Steering
Clients on the candidate APs are steered to APs that are not in the candidate list before the candidate APs are rebooted. If clients still remain on the candidate APs, they are simply sent a de-authentication frame and the AP reloads with the new image.
3. AP Re-load and Re-join
After the client steering stage, the AP is reloaded with the new image.
At this point, a 3-minute timer is started for the APs to join back. When this timer expires, all candidate APs are checked and marked for the WLC they have connected to (self or the peer).
If at least 90% of the candidate APs have joined back, the iteration is concluded. If not, the 3-minute window is extended and the check is repeated up to two more times until the count reaches at least 90%.
At the end of the third try, the iteration is concluded anyway and the next iteration is initiated. Hence, each iteration may last at most 10 minutes.
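The join-back check described above can be sketched as a small loop. This is my own illustration of the description, not the controller's actual code: the function name and the per-check join counts are hypothetical, and only the 3-minute window, the three tries and the 90% threshold come from the text.

```python
WINDOW_MINUTES = 3            # join-back timer per check, from the text
MAX_CHECKS = 3                # first check plus two extensions
JOIN_THRESHOLD_PERCENT = 90   # share of candidate APs that must have rejoined


def conclude_iteration(joined_per_check: list[int], candidates: int) -> tuple[int, bool]:
    """Return (minutes waited, threshold met) for one upgrade iteration.

    joined_per_check[i] is the hypothetical number of candidate APs that have
    rejoined when the (i + 1)-th timer expires.
    """
    for check, joined in enumerate(joined_per_check[:MAX_CHECKS], start=1):
        # Integer arithmetic avoids floating point surprises at exactly 90%.
        if joined * 100 >= JOIN_THRESHOLD_PERCENT * candidates:
            return check * WINDOW_MINUTES, True
    # After the third try the iteration is concluded regardless of the count.
    return MAX_CHECKS * WINDOW_MINUTES, False


# 20 candidate APs: 15 are back after the first timer, 18 after the second.
print(conclude_iteration([15, 18, 20], 20))
# Slow joiners: the threshold is never reached, so the iteration ends anyway.
print(conclude_iteration([10, 12, 14], 20))
```

In this sketch the waiting time alone tops out at 9 minutes; the rest of the quoted 10-minute limit is presumably taken up by client steering and the reload itself.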
After the update is done, commit it in the GUI so it doesn't roll back when the rollback timer expires. Then use the redundancy force-switchover command on the WLC CLI to make the primary controller active again.