Howto upgrade your stackable 3750 with limited service disruption

When you have two or more 3750s in a stack configuration and need to upgrade the image, a normal reload causes a long service disruption. It can take about 7 to 10 minutes because, depending on the image, the bootloader is upgraded too and the POST tests also take their time. However, if you have a redundant configuration, you can limit the downtime to a few seconds.

Requirements:

  • Redundant uplinks with fast reconvergence (e.g. RSTP)
  • Redundant downlinks on the edge (e.g. Port-Channel)

The Cisco way to do it

I dug through cisco.com, looking for a way to upgrade the boxes one after the other. The only thing I found was a step-by-step instruction for dummies proposing a clean reload (both boxes reload simultaneously).

The theory

The good news is that you can reload each switch separately (reload slot x). In a perfect world, one would assume that you can simply reload the switches one after the other, so that each one boots up with the new image. As a matter of fact, this may be possible; it depends, like so many things, on the IOS version.

What you need to check is the ‘stack protocol version’. You can check the stack version with the following command:

deadbeef-01#show platform stack-manager all
[...]
Stack State Machine View
=======================================================
Switch   Master/   Mac Address      Version     Current
Number   Member                     (maj.min)   State
-------------------------------------------------------
1        Master    dead.beef.0001   1.41        Ready
2        Member    dead.beef.0002   1.41        Ready

If the new IOS uses the same stack protocol version (e.g. 1.41 when going from 12.2(50)SE to 12.2(50)SE3), then you can just reload one switch after the other.
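As a rough illustration of that check, here is a minimal Python sketch that reads the State Machine View table and tells you whether every member reports a given stack protocol version. The function names and the sample text are hypothetical. The version used by the new image is an input you have to find out yourself (for example from a switch that already runs it); the sketch only compares numbers.

```python
import re

# Matches table rows such as "1 Master dead.beef.0001 1.41 Ready".
ROW = re.compile(
    r"^\s*(\d+)\s+(Master|Member)\s+"
    r"([0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4})\s+"
    r"(\d+)\.(\d+)\s+(.+?)\s*$"
)

# Sample text copied from the output shown above.
SAMPLE = """\
Switch Master/ Mac Address Version Current
Number Member (maj.min) State
---------------------------------------------
1 Master dead.beef.0001 1.41 Ready
2 Member dead.beef.0002 1.41 Ready
"""


def parse_stack_versions(output):
    """Return {switch_number: (major, minor)} for every row in the table."""
    versions = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            versions[int(match.group(1))] = (int(match.group(4)), int(match.group(5)))
    return versions


def rolling_reload_ok(output, new_version):
    """True if at least one member was found and all report new_version."""
    versions = parse_stack_versions(output)
    return bool(versions) and all(v == new_version for v in versions.values())


if __name__ == "__main__":
    print(parse_stack_versions(SAMPLE))
    print(rolling_reload_ok(SAMPLE, (1, 41)))
```

The version is kept as a pair of integers rather than a float, so the major and minor numbers are compared exactly as the switch reports them.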

If the versions don’t match, however, you have a problem. The upgraded switch will join the stack, but will not participate in stack switching (because of the stack protocol version mismatch). You might assume that reloading the master will do the job; that’s not correct. If you reload the master while another switch is in the ‘version mismatch’ state, the latter will reload too (don’t ask me why).

The solution

The solution is a bit tricky.

  • deploy the image on all switches (flash[1 – x])
  • set the boot variable with ‘boot system switch all flash:c3750-[…]’
  • shut down all the ports on switch 1 (use the interface range command) and save the configuration to startup
  • power off switch 1
  • disconnect both stack cables on switch 1
  • power on switch 1 (and wait until the switch is up)
  • shut down the ports on switch 2, then immediately enable the ports on switch 1. Make sure the ports are never enabled on both switches at the same time; otherwise you will probably create a Layer 2 loop
  • save the configuration to startup on both switches
  • power off switch 2
  • connect all stack cables
  • power on switch 2 and check stack connectivity and state
  • enable all ports on switch 2
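The port handling in the steps above can be sketched as a small Python helper that prints the IOS lines for one stack member. This is only a sketch under assumptions: the helper name is made up, and it assumes 24 ports named GigabitEthernet<switch>/0/<port>, so adjust both to your hardware. Keep in mind that a blanket no shutdown also enables ports that were administratively down before the upgrade.

```python
def port_commands(switch, enable, ports=24, if_type="GigabitEthernet"):
    """Return the IOS lines that shut or no-shut all ports of one stack member.

    Assumptions (not from the article): 24 ports per switch, named
    <if_type><switch>/0/<port>.
    """
    if switch < 1 or ports < 1:
        raise ValueError("switch and ports must be positive")
    return [
        "configure terminal",
        f"interface range {if_type}{switch}/0/1 - {ports}",
        "no shutdown" if enable else "shutdown",
        "end",
        "copy running-config startup-config",
    ]


if __name__ == "__main__":
    # Swap step: first shut switch 2 (still the old stack), then enable
    # switch 1 (the standalone, already upgraded box) to avoid a Layer 2 loop.
    # The two command sets are entered on two different devices.
    for line in port_commands(2, enable=False) + port_commands(1, enable=True):
        print(line)
```

Generating the lines up front lets you paste them into both console sessions in quick succession, which is what keeps the gap between the shutdown on switch 2 and the no shutdown on switch 1 short.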

With this solution, I upgraded a stack of two 3750s with a service disruption of less than 15 seconds (of course only on servers with redundant ports). With a simple reload, I would have needed approximately 8 minutes (from 12.2(44)SE6 to 12.2(50)SE3, bootloader upgrade included).

7 Responses to “Howto upgrade your stackable 3750 with limited service disruption”

  1. Thanks a lot for this solution. I’m going to try it.

  2. Albert Bruggeman on May 24th, 2012 at 11:27

    Hi,

    I was looking for a method like this to downgrade some 3750 stacks from IOS 15.0 to the latest 12.2.x (IOS 15.0 gives us problems with spontaneous reloads; at least I suspect the IOS version).
    I was thinking along the lines you described and am glad to read it worked for you. I will try it!

  3. Hi

    I was looking for a painless way to upgrade a stack for weeks, and this is finally the best solution with minimal downtime.

    This really works.

    Great, and many, many thanks. Keep it up.

    You won’t find this on the Cisco site.

    Cheers.

  4. Just want to be clear on your information.

    If the versions are the same, then just use the reload command.

    If the versions are different, then follow your “Solution” steps, correct?

  5. Yes, if the stack protocol version is the same you can simply reload one switch after another.

    You can do that with this command:
    reload slot "switch number"

  6. Awesome! I was hoping to find a way to do this… I will probably still plan on off hours but at least the traffic will still flow for the most part…

  7. Of course, you should always do this within a maintenance window. Also remember that servers and switches connected to a single stack member are out-of-service until you “no-shut” the ports; you only have the reduced downtime on devices dual-homed to more than one stack member.
