Do you remember the scene from Apollo 13 where the NASA technicians were given the task of fixing the CO2 issue for the astronauts using only the equipment available aboard the spacecraft (a box, air filter, plastic bag, and duct tape)?
The ability to think quickly on your feet is crucial, especially when you are under pressure to deliver a fix by a deadline. Let me fill you in on how my Monday went.
I came in this morning and was immediately told that a Domain Controller/File Server at our office in Northern Virginia was offline. Sounds like a simple enough fix, right? Get someone to power it on, or remote into the Management Port (iLO2) and send the power-on signal… tried that, but it didn’t work. Instead, a red health indicator LED flashed whenever the power button was pressed. Not cool! To top it off, this server also handles DHCP, and the leases just happened to expire for more than half of the users at this location… great…
After bouncing a few ideas off of my teammates, I came up with the idea of enabling DHCP on the switch stack at that location. Success! Users were now able to obtain an IP address and access the company network and the internet. As for the file server issue, once I arrived on-site with a replacement server, I noticed that the hard drives in the failed server were physically bigger (not in GB capacity, but in actual width and height) than the drive bays in the replacement.
How could that be, when the replacement server I brought with me was the EXACT same model as the on-site server that was down? This question may never be answered. Before I gave up all hope of repairing the issue on the same day, a light bulb turned on and the solution presented itself. I took the entire HDD enclosure bay out of the bad server and placed it in the new server. I prayed that driver compatibility and the on-board RAID management would hold up when I powered on this server, and they did. After logging into the server, verifying access to all drives and files, and confirming that DHCP was working on the server, I was able to stand down the temporary fix by disabling DHCP on the switch. I sent a quick email to the office asking users to reboot their PCs so that access to their files would be restored and they would pick up a proper IP address issued through DHCP on the server. Once that was done, I verified that everything was back to normal.
Thinking quickly on my feet and applying some common sense saved me. The resolution took about 3 hours, and users were only partially impacted: they lost access to their files for that time. These victories give me a sense of accomplishment and further fuel my passion for the I.T. field!