What the… : Weirdness/Oddities Encountered in the Past Week(ish) #1
I have decided to start writing down some of the weird networking things I encounter in my day job. I am hoping it helps others fix issues in their own environments, or at least gives you a chuckle. 🙂
/31 Gotchas on Cisco/Viptela Equipment
- Weirdness: I ran into this last fall & then promptly forgot about it. /31 subnet masks are supported on the Cisco (formerly Viptela) SD-WAN gear and have worked great for me for the past 18 months. Except, that is, on vEdge 5000s: vManage will let you configure the interfaces with a /31, and the devices will accept the config, but they will not pass any traffic.
- Investigation: I haven’t seen this issue on vEdge 100s, 1000s, or any converted ISRs; just on vEdge 5000s.
- Fix: Changed the subnets to /30s and everything works. I can ping it now, so it has to be rock solid… (a quick look at the /31 vs /30 address math follows below).
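For anyone curious what backing off to /30s actually costs, Python's ipaddress module makes the RFC 3021 math easy to eyeball. A minimal sketch; the addresses are documentation examples, not our real links:

```python
#!/usr/bin/env python3
"""Eyeball the /31 vs /30 point-to-point math (RFC 3021).
Addresses are documentation examples, not real links."""
import ipaddress

p2p_31 = ipaddress.ip_network("192.0.2.0/31")
p2p_30 = ipaddress.ip_network("192.0.2.0/30")

# On a /31, both addresses are usable hosts; nothing is burned on
# network/broadcast (ipaddress special-cases /31 in hosts()).
print(list(p2p_31.hosts()))  # [192.0.2.0, 192.0.2.1]

# On a /30, only two of the four addresses are usable hosts.
print(list(p2p_30.hosts()))  # [192.0.2.1, 192.0.2.2]
```

Half the address burn per point-to-point link, which is exactly why losing /31 support on the 5000s stings.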
Routing Issues for Specific PI IPv4 Address Space
- Weirdness: Thanks to a predecessor in the 90s at $DayJob, we have a good amount of provider-independent (PI) IPv4 space (especially for a mid-sized company). We ran into an issue where some users trying to access a SaaS application from one location (and one PI block) would get constant timeouts. If those same users switched to their cell phones, the application responded immediately. Adding to the confusion, users at a second location (and a second PI block) could access the application without issue.
- Investigation: Using WinMTR, we traced the path from our location (in the USA) to the SaaS application (hosted in Sweden). At a hop somewhere in the UK, we started seeing 50-80% packet loss. Performing the same test from the second location (and the adjoining /24 block), we did not see the issue (a scripted version of this per-hop loss check appears after this item).
- Fix (sorta): Using a centralized traffic data policy on our Cisco SD-WAN equipment, we matched the SaaS provider’s /17 network and steered it out a different Internet circuit at the first location, one that does not source traffic from our PI address space. As soon as this policy was pushed to our vSmart(s) and vEdge(s), the webpage started responding immediately (the second sketch below mimics that match-and-steer decision).
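If you don't have WinMTR handy, you can approximate the per-hop loss check with a few lines of Python. This is a rough sketch, not a WinMTR replacement: the hop IPs are placeholders from the documentation ranges, the ping flags shown are the Linux ones, and some routers legitimately deprioritize ICMP aimed at themselves, so treat the numbers as a hint rather than proof:

```python
#!/usr/bin/env python3
"""Rough per-hop loss check: ping each hop on the path repeatedly
and report the percentage of lost probes. Hop IPs are placeholders."""
import subprocess

HOPS = ["192.0.2.1", "198.51.100.7", "203.0.113.42"]  # stand-ins for real hop IPs
PROBES = 20

def loss_pct(host: str, count: int = PROBES) -> float:
    lost = 0
    for _ in range(count):
        # One echo request with a 1-second timeout (Linux ping syntax;
        # on Windows, use ["ping", "-n", "1", "-w", "1000", host]).
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "1", host],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            lost += 1
    return 100.0 * lost / count

if __name__ == "__main__":
    for hop in HOPS:
        print(f"{hop:>15}  {loss_pct(hop):5.1f}% loss")
```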
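And for what the data policy is conceptually doing, here is a toy model of the match-and-steer decision. The /17 below is a made-up stand-in (I'm not publishing the provider's real prefix), and the "circuit" names are hypothetical labels, not Cisco SD-WAN syntax:

```python
#!/usr/bin/env python3
"""Toy model of the centralized data policy: destinations inside the
SaaS provider's /17 get steered out the alternate Internet circuit;
everything else follows normal routing."""
import ipaddress

# Made-up stand-in for the SaaS provider's real /17.
SAAS_PREFIX = ipaddress.ip_network("10.80.0.0/17")

def pick_exit(dst: str) -> str:
    """Match on destination prefix, pick the egress circuit."""
    if ipaddress.ip_address(dst) in SAAS_PREFIX:
        return "alt-circuit"      # circuit that doesn't source our PI space
    return "default-routing"

print(pick_exit("10.80.12.34"))   # -> alt-circuit
print(pick_exit("10.200.1.1"))    # -> default-routing
```

The real steering happens in the vManage-built data policy, of course; the sketch just shows why a single /17 match cleanly carves out only the SaaS traffic.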
IP Phones Dropping Out
- Weirdness: Two call center employees reported that the screens on their Cisco IP Phones went dark momentarily and that they were kicked out of Cisco Finesse multiple times over a 2-hour period.
- Investigation: Looking at the switch logs, the two associated switch ports did not log any up/down events or any PoE removal/granted messages, and the two ports were on separate switches in the stack. Talking to the users, the phones were not going through a full reboot process; the screens just went dark and came back. For one user we replaced the patch cable and then the entire phone, but the issue persisted. While combing through the switch logs, though, we came across another port, on yet another switch in the stack, that was logging frequent PoE events (a short script for spotting ports like this follows this item):
```
14:37:12.163 UTC: %ILPOWER-5-POWER_GRANTED: Interface Gi3/0/27: Power granted (3750SWITCH-3)
14:37:12.691 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
14:37:28.778 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
14:37:45.254 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
14:38:01.553 UTC: %ILPOWER-5-POWER_GRANTED: Interface Gi3/0/27: Power granted (3750SWITCH-3)
14:38:01.905 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
14:38:18.462 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
14:38:34.702 UTC: %ILPOWER-5-POWER_GRANTED: Interface Gi3/0/27: Power granted (3750SWITCH-3)
14:38:34.828 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
14:38:50.791 UTC: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi3/0/27: PD removed (3750SWITCH-3)
```
- Fix: We went to the desk wired to the port logging the PoE events & found that the user had NO IP PHONE (DUN, DUN, DUUUUNNN). They were running a softphone on their PC, and the PC was plugged into the flapping port using a 25-foot patch cable to cover the 5 feet to the wall port. We replaced it with a 7-foot patch cable; the switch stopped logging the PoE events & the other users stopped reporting the issues with their phones.
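Since grepping a busy stack log for the noisy port was the unlock here, a few lines of Python can do the tallying for you. A minimal sketch, assuming syslog lines shaped like the excerpt above and a hypothetical switch.log export:

```python
#!/usr/bin/env python3
"""Tally ILPOWER PoE events per interface so a flapping port stands out.
Assumes syslog lines shaped like the excerpt above."""
import re
from collections import Counter

LOG_RE = re.compile(
    r"%ILPOWER-5-(?P<event>POWER_GRANTED|IEEE_DISCONNECT): "
    r"Interface (?P<intf>\S+):"
)

def flapping_ports(lines, threshold=5):
    """Return (interface, event, count) tuples for noisy ports."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match:
            counts[(match["intf"], match["event"])] += 1
    return [(intf, event, n) for (intf, event), n in counts.most_common()
            if n >= threshold]

if __name__ == "__main__":
    with open("switch.log") as log:  # hypothetical export of the stack's log
        for intf, event, n in flapping_ports(log):
            print(f"{intf:12} {event:15} x{n}")
```

Fed the excerpt above, Gi3/0/27 lights up immediately with seven PD-removed events in under two minutes.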