Tuesday, March 18, 2014

Exadata Upgrade Cisco Switch Failure

I just worked on a production Exadata X2 Full stack upgrade this past weekend this is something Viscosity specializes in and are experts in this arena. We can actually do a Full Complete Exadata Stack and even ZFS Storage upgrade of ALL components and ensure they are up to date with the latest firmware and patches and actually suggest to do this at least every 6 months or once a year if you can get the downtime.
Everything went very well in our upgrade and I thought we were done and ready to home after upgrading the firmware on the Cisco switch which was one of the last items to upgrade. When we applied the firmware update from a remote location we realized there was a problem when the switch did not come online after a few minutes, also there was no response from a ping or telnet either. We had no choice at this point but to physically get to the Datacenter ASAP and check the status of the switch since there is no ILOM or KVM to control the management of the switch.
All Exadata systems contain a management network running on at least 1000BASE-T (also known as IEEE 802.3ab) is a standard for gigabit Ethernet or 1Gb Ethernet. All servers (database and storage) have a connection to this network, along with each server’s Integrated Lights Out Management (ILOM) card. The Infiniband switches also have connections to the management network, all of which is then routed through the Cisco Catalyst 4948 switch which has 48 ethernet ports. The KVM switch console on the X2-2 and PDUs also have optional connectivity to the management network and are also connected to the switch.

When I got to the Datacenter and looked at the Cisco Switch from the back I noticed an Amber colored light for the Status LED light which indicates there is a system fault. I also turned off and unplugged both power cords from the front of the switch and waited about a minute as per Oracle Support’s instructions to power cycle the switch and that also did not bring the switch online, a telnet and ping did not return anything.
Cisco-4948_status_lights



Then we thought of connecting to the switch via the Console port via a serial cable to see if we could manage the switch and get a command prompt to get the status. You cannot simply use an RJ45 ethernet cable to connect from your laptop to the switch’s Console port. You actually need to connect using a USB to Serial to ethernet cable as shown below.
usb_to_serial


Once you connect to the console port you will see a Serial device on your computer and check the properties to see its COM port number, please note this since it will be used to connect directly to the switch.
Steps to connect to the Cisco Switch from your computer.
  1. You need to find out which com port your prolific usb to serial cable is connected to on your laptop.
  2. Connect to the Console port of the Cisco switch not the Management Port, image shown below and I circled the correct port, the one below which is labeled MGT did not work.
  3. 9600 is usually your connect speed. Use a terminal program such as putty and note your com port and set the baud rate.
  4. Enter return and you should see the prompt.
  5. Type in enable to get a prompt and enter the switch password.
  6. Now issue the boot command to startup the switch.

cisco_4948_circle_port

Once the Cisco Switch came online we were able to proceed and complete the finally complete the upgrade with the new firmware which allows ssh connectivity and you can also optionally disable telnet if you have strict security guidelines which does not allow it.

No comments:

Post a Comment