Trouble in Internet Paradise

Internet connections at my house are considered critical infrastructure. I have to schedule change windows with my wife before making disruptive network changes. When I have made disruptive changes without telling her, she has noticed very quickly. So it was particularly troubling when I was out of town, on only day 2 of a 4-day trip, and got a text from her - “The Internet isn’t working”.

I run Ubiquiti equipment at home. The access points and switches are managed by the UniFi controller. UniFi controllers can optionally be accessed remotely through Ubiquiti’s cloud, which is basically a proxy. I loaded the controller and immediately saw the wireless performance rating was poor. Further investigation showed both of my access points were on channel 11. Being remote, I couldn’t run a spectrum analysis, so I moved one of them to channel 1 and saved the configuration. That change didn’t improve the situation. I told her to move to the 5GHz SSID, assuming that if this was an airtime contention problem, 5GHz would resolve it. It didn’t. Grasping at straws while occupied with work, I changed a setting or two that required a reset of the access point. I waited. And waited. And waited. Sensing something was wrong, I refreshed the page and got an error. I had been kicked out of the controller, and attempts to log in again weren’t working.

I instructed my wife to reboot the server the controller runs on. That didn’t immediately help, but 30 minutes later I was able to get into the controller. Instead of providing normal information, though, it was saying there were no devices on the network. After a few minutes it recognized the devices and showed they were online. Just before I left work for the day, I was kicked out of the controller again.

Fast forward two days and the problem hadn’t resolved itself. Everyone at the house had been without Internet the whole time. Cellular tethering provided an oasis in the desert but isn’t a long-term solution. I returned from the airport and could confirm the Internet wasn’t working - or was very slow at best. I loaded the UniFi controller on my laptop and did find the experience to be better on the LAN. Network performance on my server, which is wired, was poor as well. apt upgrade downloads were running in the tens of Kbps. Speed tests at speedtest.net gave horrific results. To me, this pointed to a problem with the router or Comcast. Rebooting the router offered no performance gain. This called for a trek downstairs.

Almost immediately I saw my router’s Internet link light wasn’t on. That puzzled me, since the router had been assigned an IP address. The RJ45 clip on the modem side had broken and was no longer holding the connector in place. Replacing the cable brought the link light back on. However, again, no performance improvement. Comcast’s modem problem detector was reporting a problem, so I rebooted the modem. I got a text: “The Internets working!” I felt like a hero, deserving of being hoisted on shoulders after a championship win.

Confident in the resolution, I ran an errand, came home, and started working on my laptop. For no apparent reason, performance slowed. At the same time my wife said “Uhhh” and I said “Internet not working?”. No. Internet access had stopped working again. I paid another visit to the rack downstairs to start troubleshooting.

The first step was to reboot the modem and see if it improved performance. It did. The next step was to connect my laptop directly to the modem so there was no router in between. A speed test showed I was downloading at 150Mbps, which is especially impressive considering I only pay for 75Mbps. I moved the modem connection back to the router, rebooted the modem so it would give the router an IP address, and connected my laptop directly to the router. Speed tests showed good performance, this time under 100Mbps but still far better than Kbps. For the next test I connected my laptop to the switch instead of the router. Performance dropped to 30Mbps. Curious.

I conducted another half dozen similar tests, connecting to the router, to the switch, or over wireless, and running a speed test after each one. Results ranged from 30Mbps to 150Mbps with no clear indication of what caused the changes.
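One way to make sense of a pile of speed tests like these is to group the results by where the laptop was plugged in and compare the ranges. The sketch below is only an illustration of that idea: the readings are made-up placeholders rather than my actual measurements, and the function names are my own. If the ranges for two connection points overlap, the tests can't tell those points apart, which hints that the variation is coming from somewhere upstream of both.

```python
from statistics import median


def summarize(samples):
    """Group (attachment_point, mbps) pairs and report min/median/max per point."""
    by_point = {}
    for point, mbps in samples:
        by_point.setdefault(point, []).append(mbps)
    return {
        point: {"min": min(v), "median": median(v), "max": max(v), "runs": len(v)}
        for point, v in by_point.items()
    }


def ranges_overlap(summary, a, b):
    """True when the min-max ranges of two points overlap, meaning the
    runs cannot distinguish one attachment point from the other."""
    return (summary[a]["min"] <= summary[b]["max"]
            and summary[b]["min"] <= summary[a]["max"])


# Hypothetical readings in Mbps, NOT the real numbers from my tests.
samples = [
    ("router", 95.0), ("switch", 30.0), ("wireless", 60.0),
    ("router", 40.0), ("switch", 110.0), ("wireless", 35.0),
]

summary = summarize(samples)
for point, stats in summary.items():
    print(point, stats)
print("router vs switch indistinguishable:",
      ranges_overlap(summary, "router", "switch"))
```

With only a handful of runs per connection point this is a rough guide at best, but it is a quick way to see whether any single device stands out.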

As of now, this is where I stand. My wife is streaming a TV show, and I’m writing this post and have downloaded some new music on iTunes. All seems to be well. But I have no assurance it will continue to perform at this speed. I’ll post an update when I learn more.

Update

I had a conversation with Comcast and they said it was down because of a car accident and they were repairing the line. The support person wasn’t able to explain why it was down for days before. He suggested I call back after it was fully repaired. I did. The next agent told me my modem was bad and I should replace it. At the same time, she scheduled a technician to come over and look at the lines.

A day later I received a call from someone at Comcast. He said they had been having problems at their head-end location, which were causing slow performance and disruptions. When I asked when the problems started, he said Monday or Tuesday, which aligned with my experience. He performed a line test and said everything looked good.

My only wish is that the technical support people had been given this information. Lessons learned?

  • Start at the Internet Provider unless something else changed.
  • Review the physical layer first, if possible.
  • No amount of remote access helps if the Internet provider is down.
  • A rigorous troubleshooting methodology is important.
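
The lessons above can be sketched as an ordered checklist that stops at the first failure. This is a minimal illustration, not a real diagnostic tool: the check names and the True/False results are hypothetical stand-ins, and in practice each one would be replaced by an actual probe or a manual observation.

```python
def first_failure(checks):
    """Run checks in order and return the name of the first one that fails,
    or None if every check passes."""
    for name, check in checks:
        if not check():
            return name
    return None


# Stubbed results standing in for real probes (hypothetical values).
# Order follows the lessons: provider first, then the physical layer,
# then everything built on top of it.
observed = {
    "provider status page reports no outage": True,
    "router WAN link light is on": False,
    "modem answers ping": True,
    "wired speed test is near the paid rate": False,
}

# Bind each stored result as a default argument so every lambda
# returns its own value rather than the last one in the loop.
checks = [(name, lambda ok=ok: ok) for name, ok in observed.items()]

print(first_failure(checks))
```

Stopping at the first failure keeps the focus on the lowest broken layer before anything above it gets changed.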