It's quite the challenge being the IT guy for a local church
At our local church, technology is used extensively alongside the audio setup. Both sit on the AV desk, which is typically managed by two people: one for audio and one for the computer. The computer makes sure the lyrics for the worship songs, church notices and sermon material can be shown on the large projector screen at the front, behind the band or pastor.
How hard could it be to join the Rota?
I decided to help out and join the rota after a significant drop in the number of people who could perform the role. As a techy I wasn't at all fazed by this, but after doing a training session and shadowing a few people to learn the ropes of how things are set up, it became clear that the role was highly stressful, often reducing people to tears when things went wrong.
Typically, it's a seamless and almost invisible part of the church services, but when it goes wrong it's very noticeable: many heads turn to look at the desk if the slightest issue crops up. The lyrics need to be switched with almost millisecond-like precision. No one ever wants a tech problem at a funeral service.
After soloing the service for a number of months, I found the system sluggish and crashing all the time; it needed quick reactions and fault finding to keep on top of things. On the whole it was very unreliable. On the days I wasn't on the rota, I would often get a flurry of desperate messages asking for help to fix issues, and I felt powerless to help. It didn't take long for me to get to the point where I wanted to quit myself.
I thought about it for a while, then made the opposite decision: to use my God-given talents to try and tackle these issues, even though I wasn't the IT guy for the church. That role didn't really exist, and the entire setup is the result of many enthusiastic helpers over many years, many of whom have since left.
Fixing the hardware
The computer is pretty old: an Intel i5-7500 CPU with an old-school HDD and 8GB of DDR4 RAM. Boot times were shocking, taking around 4 minutes, and loading software such as Song Show Plus took minutes too. It was awful to use. When things crashed, which they often did, it would take an uncomfortable amount of time to reload the software and hope things would be working again.
I took the computer home and, after cleaning all the dust out of it and taking stock of the hardware, I looked in my box of discarded parts to see what I could use. I had two sticks of 16GB DDR4 RAM, so I popped those in, which made a noticeable difference.
I also found an SSD to pop in, but its capacity wasn't big enough for a disk-to-disk image. I really didn't want to do a full reinstall, as I wasn't familiar with all the different things that had accumulated on the computer over time. I decided to order a big enough SSD at my own cost (a whopping £25) to arrive the next day. After a quick image from one drive to the other using disk duplication software, partition table setup included, the machine booted first time. I left the old drive in the machine but disconnected, in case I needed to revert to it at any point. I jumped into the BIOS, checked all the settings, made the new SSD the first boot device and enabled fast boot. New boot time: 20 seconds from cold to fully up and running. Bingo!
Fixing the software
When logging into Windows I was greeted by many things auto-starting: Spotify, Power DVD software, Song Show Plus, Edge, Skype, OneDrive and Google Drive, to name a few. Closing or minimising them added overhead, stole valuable resources from other applications and was just annoying. I disabled them all. Logging in felt much snappier.
I cleaned up the desktop, moving unnecessary files accumulated over the years into an archive folder. A trip to Windows Update, turning on all the security settings that had been turned off and turning on auto update resulted in a lot of updates and reboot cycles, but at this point things started to feel very snappy indeed, especially for an old machine. I installed the awesome Tailscale so I could create a VPN connection to the machine from my laptop, and the equally awesome UltraVNC so I could remote control it. This would allow me to support the machine if people were having issues and I wasn't onsite or nearby.
My work here was done. I was delighted by how quick the machine was and fully anticipated an end to all the stress and anxiety of using the PC to support the church services. I was wrong, very wrong. The problems continued, and an investigation into the network was required.
Fixing the network
The machine seemed fine for extended periods of time, but every now and then Song Show Plus would not start and OBS would crash while streaming the church service to YouTube, which requires a stable connection to continuously upload the video stream. It didn't take long to see that the network connection would drop out and then pop back into life. Super strange. Naturally, I replaced the network cable, which was looking the worse for wear. Nada. I tested the socket on the wall with a network cable tester. All good. I then decided to map the current network infrastructure to see if there was anything obvious, but it looked fairly standard.
The network-over-powerline extenders needed replacing with a Cat5e cable, but there was nothing out of the ordinary there. I also found it impossible to stay connected to the machine over UltraVNC from home for longer than a few minutes, although an immediate reconnection worked fine for another 5 minutes or so.
I then patched the network cable from the PC directly into the Virgin Media router. Things seemed stable, but then more issues started happening. Strange. At this point I was scratching my head. I decided to ping the router for an extended period of time, and the results indicated the router was losing connectivity to the internet or experiencing intermittent connection issues. The DNS settings of the router were standard and supplied by Virgin Media. More head scratching.
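For anyone wanting to repeat the long-running ping test, here is a minimal Python sketch of how a saved ping log could be summarised. It is not what I ran at the time. It assumes the log was captured on Windows with something like `ping -t 192.168.0.1 > ping.log`, and the function names and the file name are made up for illustration.

```python
import re
from collections import Counter

# Patterns follow the wording used by the Windows `ping` command.
# Other operating systems word their output differently (assumption: Windows).
_REPLY_OK = re.compile(r"Reply from (\S+): bytes=\d+ time[=<]\d+ms TTL=\d+")
_UNREACHABLE = re.compile(r"Reply from (\S+): Destination (host|net) unreachable")


def classify_ping_line(line: str) -> str:
    """Label one line of ping output as ok, unreachable, timeout or other."""
    line = line.strip()
    if _REPLY_OK.search(line):
        return "ok"
    if _UNREACHABLE.search(line):
        return "unreachable"
    if line.startswith("Request timed out"):
        return "timeout"
    return "other"  # headers, blank lines, statistics and so on


def summarise(lines) -> Counter:
    """Count how many lines fall into each category."""
    return Counter(classify_ping_line(line) for line in lines)


def summarise_file(path: str) -> Counter:
    """Summarise a saved ping log, e.g. the hypothetical ping.log above."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarise(handle)
```

A high count of "timeout" or "unreachable" lines next to the "ok" ones points to the kind of intermittent drop-out described here, rather than a connection that is simply dead.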
An exploratory call to Virgin Media resulted in them saying they could see an issue, or rather that they couldn't see the router on their network at all. Hurray! Almost 10 days later, after various engineer visits and wrong router installations (a saga in itself), I had a new router in place with everything checked by Virgin Media and working.
The next Sunday service came along. I wasn't onsite and fully expected everything to work out fine. Then the messages of panic started. The issue was still there: the PC kept losing its network connection and the live stream was dropping, despite being on a direct cable to the router. I felt awful for the team.
I ran a second, brand new cable and tested for a long duration, but the issue kept coming back. I was puzzled. I was starting to think there was a rogue device on the network kicking the machine off, although Windows would typically warn us about a conflicting IP address. Then I had the realisation that the router was replying to say it couldn't reach the destination, so it had to be a problem with the router.
At this point I started to become suspicious about what was on the network. It's a large building, and even though the diagram looks simple enough, I had no idea where all the Cat5e cables were going or what other devices could be interfering with our connection.
I wrote a quick PowerShell script to capture the MAC address of the router by repeatedly looking at the ARP table and appending the result to a file every 5 seconds. The ARP table caches the hardware address of a network card and maps it to an IP address. It lets a machine know which physical network device to send the packets for a given IP address to. I left the script to do its thing. Boom: another device was claiming the Virgin Media router's IP address, which the router quickly reclaimed. Whatever the device was, it wasn't as quick as the Virgin Media router at reclaiming the address, which gave us long periods of workable internet service, but not stable enough for the church service.
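The idea can be sketched roughly as follows. This is a Python illustration rather than the original PowerShell script, and it assumes the Windows-style `arp -a` output, where each entry is an IP address followed by a dash-separated MAC address. The function names and the log file name are hypothetical.

```python
import re
import subprocess
import time
from datetime import datetime

ROUTER_IP = "192.168.0.1"  # the router address mentioned in this post
_MAC = r"(?:[0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2}"


def mac_for_ip(arp_output: str, ip: str):
    """Return the MAC listed against `ip` in `arp -a` output, or None."""
    for line in arp_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == ip and re.fullmatch(_MAC, fields[1]):
            # Normalise to lower case with colons so samples compare cleanly.
            return fields[1].lower().replace("-", ":")
    return None


def find_changes(samples):
    """Given [(timestamp, mac), ...], list each point where the MAC changed.

    Samples with no MAC (entry missing from the table) are skipped.
    """
    changes = []
    previous = None
    for stamp, mac in samples:
        if mac is None:
            continue
        if previous is not None and mac != previous:
            changes.append((stamp, previous, mac))
        previous = mac
    return changes


def watch(ip=ROUTER_IP, interval=5, log_path="router_mac.log"):
    """Append the router's current MAC to a file every `interval` seconds."""
    while True:
        output = subprocess.run(
            ["arp", "-a"], capture_output=True, text=True
        ).stdout
        mac = mac_for_ip(output, ip)
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(f"{datetime.now().isoformat()} {ip} {mac}\n")
        time.sleep(interval)
```

Calling `watch()` and leaving it running builds up the log. If more than one MAC address ever appears against the router's IP, two devices are fighting over the same address.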
Finding the device wasn't an easy task, but an online MAC address lookup service told me I was looking for a Netgear device. Was it one of those switches? Nope. I removed them and the problem persisted. Eventually I went into every room and traced cables from the power sockets to see what devices were on the network, and boom, I found it. Someone along the way had plugged in an old internet router, thinking it would behave like a switch. Being a router, it too was trying to claim the commonly used default IP address of 192.168.0.1, causing a conflict with the Virgin Media router. It was well hidden under the desk and behind the PC in the office, where it gave the printer connectivity. It worked when whoever installed it tried it, and because the issue it introduced wasn't immediate or permanent, it went unnoticed for months, if not over a year, causing all kinds of chaos.
I took the opportunity to tidy things up and did not reconnect the switches I had removed while diagnosing the issue. I'll be running a new cable to replace the powerline devices.
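MAC address lookup services work from the first three bytes of the address, the OUI, which is assigned to the manufacturer. Here is a small Python sketch of that step. The vendor table is something you would have to supply yourself, for example from a downloaded copy of the public IEEE registry, and both function names are made up for illustration.

```python
def oui_prefix(mac: str) -> str:
    """Return the manufacturer prefix (first three bytes) of a MAC address.

    Accepts dash, colon or dot separated forms and returns "AA-BB-CC".
    """
    hex_digits = "".join(ch for ch in mac if ch not in "-:.").upper()
    if len(hex_digits) != 12 or any(c not in "0123456789ABCDEF" for c in hex_digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "-".join(hex_digits[i:i + 2] for i in (0, 2, 4))


def lookup_vendor(mac: str, vendor_table: dict) -> str:
    """Look the prefix up in a caller-supplied {prefix: vendor name} table."""
    return vendor_table.get(oui_prefix(mac), "unknown")
```

The prefix only identifies the manufacturer of the network hardware, not the device itself, which is why the room-by-room hunt was still needed.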
Conclusions
My conclusions are that well-intentioned enthusiasts don't always appreciate the impact of the changes they are making. The network architecture was far more complicated than it needed to be. Keeping things as simple as possible is typically best.
It also made me appreciate how difficult this was to find. I have decades of experience in these areas, and it still took a lot of first-principles thinking to get to the bottom of it, with lots of driving back and forth between home and the church. I am sure that if they had gone to a professional they would have ended up buying lots of new network equipment they didn't need and new computers, with no lessons learned. I feel privileged to be able to help get things back up and running and more stable. I'm praying that the many church services planned for Easter go without too much trouble.