Cisco Switching/Routing :: 7609-S Random High Cpu Spike With LMS?
Sep 2, 2012
I have a Cisco 7609-S router running the image sup-bootdisk:c7600rsp72043-ADVIPSERVICESK9-mz.122-33.SRD7.bin. Since it was integrated with CiscoWorks LMS version 3.2.1 and added to HUM, I am seeing CPU spikes above 95% that last about 10 minutes and recur every 2 hours. In LMS, HUM polling is configured to run every hour to monitor CPU and memory utilization.
I have gone through the Cisco document below and configured all of the snmp-server view commands as illustrated there. [URL] After configuring them, we now observe the CPU spike every 4 hours, or perhaps every 5 hours. I have checked all the CiscoWorks LMS settings and there are no other jobs running during the daytime, yet the spike still comes every 4 hours.
We have a Catalyst 3560G 24-port PoE switch. It has been running fine for over a year. A few weeks ago we enabled SPAN on it to capture packets. Today we had a random spike in CPU on the switch. Hardware switching seems to have continued to work fine, but software-based processes choked and effectively took down EIGRP, HSRP, etc. We collect syslogs from the router and we saw 2 crashes/reboots. Both showed exactly the same error, with the same hex values. I **believe** the CPU usage dropped when a tech disconnected the SPAN port and its state changed to down, but I'm not 100% sure. Could this indicate an IOS bug (I'm hoping it's not a hardware failure)? And how can I track this down to see whether it is related to SPAN? I've disabled SPAN for now.
We have a pair of 7609 routers working in active/standby mode. The routers are running: Cisco IOS Software, c7600rsp72043_rp Software (c7600rsp72043_rp-ADVIPSERVICESK9-M), Version 12.2(33)SRD4, RELEASE SOFTWARE (fc2)
Quite frequently we get high CPU load on our router, and the load then drops back below 10% on its own.
I am attaching the show tech-support output for your consideration.
A similar issue was reported yesterday as well, for which I have also attached the show tech-support output.
I have a high CPU utilization problem on my Cisco 7609-S routers. CPU utilization can rise to 99%, and this happens regularly. At the moment of high CPU, the process CPU output gives the following:
I am facing an issue with LAN switching on a 7609, related to the VRRP/HSRP feature. We have ES+ cards on the 7609 and are running multiple groups (about 350 VRRP groups) on the router. The routers are connected as router 1 >>> mux (which is working as a switch) >>> router 2.
My questions are:
1. In the stable state (i.e. when master and backup have already been chosen), will there be multicast packets for the VRRP/HSRP group from the backup router to the master router, or should packets from backup to master be unicast? I know for sure that packets from master to backup are multicast, destined to a multicast IP address and a multicast MAC address. I am not sure, but I think packets from backup to master should also be multicast.
2. What is the frequency of these packets (from backup to master)?
3. As I have multiple groups on a single interface (we are using Q-in-Q), when connectivity between the routers is broken, will all the groups multicast their active role on the LAN segment at once, or in batches, say 100 groups at once, then a few hundred more after a few ms, and so on (as with OSPF or RIP)?
We are in the middle of troubleshooting and I hope we get the answer. (The actual problem we are seeing is that we have 2 ports on the active router and 2 ports on the standby router, but we are not seeing multicast on 1 port of the standby router, whereas the other 3 ports are all seeing multicast packets.) [code]
On a pair of my Cisco 7609-S routers (engine: Sup720-3B, IOS version: 12.2(33)SRD4), some interfaces are configured as routed interfaces but still take part in the MSTP calculation, and I have actually captured BPDU packets going out of these ports. [code]
I am facing a multicast routing problem: there is no multicast UDP stream toward the host, although all IGMP Joins/Reports/Leaves from the host are correct.
This is the Cisco 7609 ip mroute debug showing the situation:
369144: Nov 15 15:03:07.370 MSK: IGMP(0): Received v2 Report on Vlan176 from 10.XXX.XX.184 for 239.XXX.XX.46
369145: Nov 15 15:03:07.370 MSK: IGMP(0): Received Group record for group 239.XXX.XX.46, mode 2 from 10.XXX.XX.184 for 0 sources
[code]....
I have a Cisco 7609-S with IOS 12.2(33)SRC2 and have hit an issue where the next hop shown by "show ip route x.x.x.x" and "show ip cef x.x.x.x" is not the next hop actually used for switching.
For example, "show ip route 192.168.1.1" and "show ip cef 192.168.1.1" show the correct next hop, 10.1.1.1, but traffic destined to 192.168.1.1 does not actually go through 10.1.1.1; it always goes through the default route's next hop. Everything works normally after the router is rebooted. Is this likely to be caused by a bug? BTW, my Cisco 7609 is running BGP with an ISP and receives about 10K routes.
I have been having a high ping issue both over cable and over wireless. When I connect via wireless, it is generally 2 ms, sometimes goes up to 4 or 5 ms, and occasionally I get packet loss. Over cable, I usually get 1 ms, and sometimes 2 ms. This causes problems while gaming online. BTW, I tried changing channels and it didn't help. I even get the same pings when I am 3 meters away from the router, compared to pings of less than 1 ms.
The top device of my network is a Cisco 7609 router. My network has two subnet parts, and each part uses the same device types, the same running-configs and the same network topology: sw6506 (to campus) ---> sw3560 (to buildings) <---> linksys sr324 (to offices). The IP range for the management VLAN is 192.168.1.0/24. Suppose we call the two parts A and B. The problem is that from the 7609 I can telnet to every device in part A quickly, but when I telnet to any sw3560 in part B, it responds very slowly. Only the sw3560s in part B respond slowly; the other devices in part B are OK. If I telnet to a linksys sr324 first and then telnet from the linksys sr324 to the same sw3560, it's OK. I tried capturing packets on the management VLAN, but nothing looks unusual. No users in part B report problems, so the network seems to be running well. Comparing the two sw6506s, the only difference is that there are "overrun" counts on each in-use interface of part B's sw6506. Traffic on each interface is far below its capacity, but the "overrun" count still increases during working hours every day.
I have a 7609 with two LAG groups (Etherchannel not LACP) going to two separate devices that DO NOT participate in spanning-tree (Occam gear if you must know). I'm running 802.1w across the LAG groups but the convergence time is terrible! In essence, the 7609 is running spanning tree against itself (between the two blades). What can I do to fix my configuration?
If I disable the ports on the equipment connected to g1/17 & g1/18, it takes ~30 seconds for spanning tree to start forwarding on g2/17 & g2/18. When I bring the ports back up on the other side of g1/17 & g1/18, which are in a LAG group, g2/17 & g2/18 immediately go into blocking mode while g1/17 & g1/18 start learning for another ~30 seconds!! [code]
This switch randomly reboots throughout the day. I checked the stacks info and it reported it was using crashinfo_12 (report below). I have access to the switch throughout the day if more config info needs to be exported.
Cisco IOS Software, C3560 Software (C3560-IPBASEK9-M), Version 12.2(50)SE1, RELEASE SOFTWARE (fc2) Copyright (c) 1986-2009 by Cisco Systems, Inc.
We have computers that are connected to a switch stack of 3 - 3750 switches. Randomly, we experience pcs that fail to communicate on the network. At first thought I figured the port went into err-disabled state, however the port shows up fine on the switch and moving the pc to another port on the same switch in the stack fails to fix the problem. To add to the confusion, if I immediately connect a different machine into the problematic port the newly connected machine has no issue and operates normally. Connecting back the first machine still results in no connectivity.
The only way to regain network connectivity is to move the PC to a different switch in the stack. shut/no shut doesn't work. The stack is running IOS 12.2 and the switch ports are configured using Cisco port macros.
Got a long-lingering, year-long issue that has spanned about 8 supervisor cards and a complete chassis swap. The 6509 acts as an ITN in our facility. The active sup card boots into ROMMON mode at random points of pipe usage, seriously inhibiting our company. I'm able to swap the 2 fiber pairs that we had going into the active supervisor card over to the secondary, and usually this works for another random amount of time; however, today it occurred within minutes of hooking up the fiber links. After sitting there for about 5 minutes it booted into ROMMON. When this happens, I'm able to boot the sup card back to good status. Previous remedial actions, other than replacing sup cards/chassis, were checking the config register and making sure it was 0x2102. Previously it was not, so we corrected it and reloaded, and the setting took. We thought this would fix the problem, until today.
All of our VoIP phones bounced (re-registered) in the middle of the work day and we do not know why. There is no maintenance going on at this time. The phones came back up right away. Upon further investigation, I only found one error message on our core switch (WS-6509 running IOS ver. 12.2(18)SXF13) and I'm wondering if that was the cause:
Jan 3 13:51:04.030: %SIBYTE-SP-5-SB_OUT_OF_RX_DSCR_CH1: Out of RX descriptors on mac 0 - channel 1 (count: 299)
I cannot find any other instances of this error message from the internet.
We currently have around 150 2975 switches and have had problems with them not delivering PoE power to the Cisco phones and access points at random times. There is plenty of power left for the switch to use. We have at least 15 that will run fine for about a week, and then all of the devices that use PoE power shut off and do not come back on until we reload the switch. If you console in, no messages pop up, and if you look at the port it just shows as connected, or shows IeeePD in the power inline output. We have contacted Cisco TAC and they just RMA them.
I am seeing a strange situation on my 6500 switch. By doing an SNMP walk on '1.3.6.1.4.1.9.9.109.1.1.1.1.3' (== cpmCPUTotal5sec), I found that there are two processors and that the CPU utilization of the switching processor has gone to 88% and sometimes creeps up to 99%.
snmpwalk -v2c -c "removes" sw6500 '1.3.6.1.4.1.9.9.109.1.1.1.1.3'
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.3.1 = Gauge32: 12 (--- this is for CPU of Router Processor )
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.3.3 = Gauge32: 99 (--- this is for CPU of Switching Processor )
But when I do sh process cpu on the console, everything looks normal, because it shows the CPU utilization of the RP. Why is the value so high on the switching processor?
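For anyone who wants to watch both CPUs from a script rather than by eye, here is a minimal Python sketch that parses snmpwalk output like the lines quoted above. The sample text is copied from the post with the annotations removed; the function names and the 80% threshold are my own illustrative choices, not anything defined by Cisco or net-snmp.

```python
import re

# Sample lines copied from the snmpwalk output quoted in the post
# (the hand-written annotations are left out).
SAMPLE = """\
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.3.1 = Gauge32: 12
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.3.3 = Gauge32: 99
"""

# cpmCPUTotal5sec column; the last number in the OID is the row index per CPU.
LINE_RE = re.compile(
    r"enterprises\.9\.9\.109\.1\.1\.1\.1\.3\.(\d+) = Gauge32: (\d+)"
)


def parse_cpu_walk(text):
    """Return {row_index: percent} for every cpmCPUTotal5sec row in text."""
    return {int(m.group(1)): int(m.group(2)) for m in LINE_RE.finditer(text)}


def busy_cpus(readings, threshold=80):
    """Return sorted row indexes at or above the threshold (80 is an assumption)."""
    return sorted(idx for idx, pct in readings.items() if pct >= threshold)


readings = parse_cpu_walk(SAMPLE)
print(readings)
print(busy_cpus(readings))
```

Which row index belongs to the RP and which to the SP is taken from the poster's own annotations; the script only reports the index, it does not try to name the processor.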
I have 2 6509-E chassis with SUP-720-VSS and classic line cards :-(. In October 2011 the switch reached 100% CPU on both devices and the entire network went down. The customer restarted the core, so we lost all the log files and could not find any root cause. The TAC engineer suggested configuring a script on the system so that, whenever the CPU shoots above 70%, it creates a file in flash and keeps appending the logs to it. Last week I got a call from the customer saying that the CPU had again gone high for around a minute on both cores. After the last incident I had also added CoPP on the switch to prevent the CPU from reaching 100%. It still went high, and from the captured logs I saw that the processes causing the high CPU were Port Manager Per and the SSH process. Attached is the file created by the netdr capture command.
The fans 1 & 2 in Module 1 on the Nexus5K are still experiencing the very high RPM and speed issue.
I have replaced the fan from another operational Nexus5K, and the fans are fine in the other Nexus. The replacement fans also have the same issues, so it is not a fan hardware issue.
There are no threshold alarms. The only log entry that is related to this is as follows:
%NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 1: fan speed is out of range on fan 1. 7950 to 12500 rpm expected.
I have provided the output for both the fan detail and the temperature.
N5K-01# sh environment fan detail Fan: --------------------------------------------------- Module Fan Airflow Speed(%) Speed(RPM) Direction --------------------------------------------------- 1 1
Currently, my Cisco 3750X (2 switches stacked) has very high CPU. Below is some of the output:
3750-ANA#sh processes cpu sorted | ex 0.00
CPU utilization for five seconds: 55%/30%; one minute: 55%; five minutes: 55%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
[Code]....
I am not sure what is causing the CPU to go up like this. I have tried some troubleshooting guides on the Cisco website, such as "Troubleshooting high CPU Utilization".
My 4500 core is always at 60% CPU utilization, and when I run #sh proc cpu sorted I find:
55 29725041543795572214 0 39.43% 41.40% 41.39% 0 Cat4k Mgmt LoPri
This means that this process is the top one, and when I run #sh platform health I find:
Stub-JobEventSchedul 10.00 15.98 10 64 100 500 20 17 12 29269:55
K2 CpuMan Review 30.00 35.60 30 48 100 500 49 46 32 52390:52
Those two processes are at the top and already exceed their maximum range, and when I run #sh platform cpu packet statistics I find: Packets Received by Packet Queue
We have a Cisco 3845 router configured as a voice gateway with multiple SIP trunks. When traffic reaches 200 calls, the CPU increases to 60-70%, caused by the CCSIP_SPI_CONTROL process.
CPU utilization for five seconds: 46%/30%; one minute: 54%; five minutes: 58%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
377 400729448 171017979 2343 6.31% 10.71% 12.44% 0 CCSIP_SPI_CONTRO
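When reading the header of this output, it helps to split the five-second figure into its two parts. As far as I know, IOS prints it as total/interrupt, so the share spent in scheduled processes is the difference between the two. The Python sketch below does that arithmetic on the header line from the post; the function name and dictionary keys are just illustrative.

```python
import re

# Matches the first line of "show processes cpu" as quoted in the post.
HEADER_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_header(line):
    """Split the header into total, interrupt and process-level percentages."""
    m = HEADER_RE.search(line)
    if m is None:
        raise ValueError("no CPU utilization header found")
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "total_5s": total,
        "interrupt_5s": interrupt,
        # Assumption: the second figure is interrupt-level time, so the
        # remainder is time spent in scheduled processes.
        "process_5s": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


header = ("CPU utilization for five seconds: 46%/30%; "
          "one minute: 54%; five minutes: 58%")
print(parse_cpu_header(header))
```

For the 46%/30% reading this leaves 16% at process level, which suggests that a single process at about 6% is only part of the picture and the interrupt share is worth a look too.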
My 3750-E core stack is connected to the provider router and is the default gateway for the internal LAN. I saw that the CPU is very high even at night, but I have not found the cause. I use an SVI to connect to the provider for HA reasons. I sniffed the network but saw no excessive broadcast storms. There was a PBR configured, but I deleted it without any success.
We are facing high CPU utilization on a Cisco 3750X-48P-L without any traffic on it. Please find attached the log files for 2 separate 3750 stacks. We have upgraded the IOS of SW2 from "c3750e-universalk9-mz.122-55.SE3.bin" to "c3750e-universalk9-mz.122-55.SE4.bin", but we still see the same CPU utilization issue.
I've been looking at reported problems with our video conferencing kit attached to a stack of 3750s (which I think is down to QoS), but this got me looking at the logs. We get a lot of high CPU utilization warnings, mainly for SNMP (315), Hulc running con (95) and Virtual exec (289). I understand the last two are normal, and the SNMP one is probably CiscoWorks polling, as it happens every 4 hours.
However I've got an odd one: Apr 25 07:34:58: %SYS-1-CPURISINGTHRESHOLD: Threshold: Total CPU Utilization(Total/Intr): 93%/0%, Top 3 processes(Pid/Util): 296/85%, 144/0%, 154/0%
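If you collect a lot of these warnings, a small script can pull the numbers out so the PIDs can be tallied. Below is a rough Python sketch that parses the message quoted above; it assumes the message always follows this exact layout, and the function name is made up for illustration.

```python
import re

# The syslog line quoted in the post.
MSG = ("Apr 25 07:34:58: %SYS-1-CPURISINGTHRESHOLD: Threshold: "
       "Total CPU Utilization(Total/Intr): 93%/0%, "
       "Top 3 processes(Pid/Util): 296/85%, 144/0%, 154/0%")


def parse_threshold(msg):
    """Extract total/interrupt CPU and the (pid, util) pairs from the message."""
    util = re.search(r"\(Total/Intr\): (\d+)%/(\d+)%", msg)
    marker = "(Pid/Util):"
    if util is None or marker not in msg:
        raise ValueError("not a CPURISINGTHRESHOLD message in the expected layout")
    # Only look at the text after the marker so the Total/Intr pair
    # is not mistaken for a process entry.
    tail = msg.split(marker, 1)[1]
    top = [(int(pid), int(pct)) for pid, pct in re.findall(r"(\d+)/(\d+)%", tail)]
    return {"total": int(util.group(1)), "intr": int(util.group(2)), "top": top}


info = parse_threshold(MSG)
print(info)
```

In this message PID 296 accounts for 85% of the 93% total with 0% at interrupt level, so matching that PID against the process list on the switch is the obvious next step.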