Tuesday, May 25, 2010

Summer Goings-On

So, some of our users have been concerned about the planned downtime this summer. Here's the nitty-gritty:

We're rebuilding everything from the ground up. Partly this is because the current system has things scattered over multiple servers, a relic of the days when we didn't have powerful enough systems to consolidate mail, for example, onto one machine. Partly this is because Oracle (formerly Sun) can no longer be relied upon to provide free security updates for Solaris, which forms the majority of our backend. Finally, this is partly because the current setup is simply untenable. It's the computer equivalent of a Land Rover held together with spit, baling wire, and chewing gum - it runs, more or less, but God help anyone who needs to poke around under the hood when things break.

So, we decided to move on. Rather than spending time and energy trying to keep fixing a 20-year-old heap, we're starting fresh. We'll be changing some things on the backend - most notably a migration from Solaris to Debian and FreeBSD - and we'll be changing some things on the frontend - like bringing in Windows 7 and getting some new hardware for the Linux clients. For most of our users, the change will be largely transparent. For some of our users, things will change a little bit. For a tiny minority, things will break. To those people, we apologize in advance, and we are, as always, happy to help you get things working again.

In theory, while we're making this transition, we'll be building replacements side-by-side with the current production servers, and swapping them out once we're fairly certain that everything's working correctly. So, for the majority of our services, there won't be much more than a blip in service. Moreover, we won't be swapping out more than one server at a time, so no more than one service should go offline at any given time. However, there are some services (notably web and MySQL) which will take longer to swap out. The fact of the matter is that we really only have one server powerful enough to be a web server, so we can't build its replacement until we've shut it down. Even so, our daring team of sysadmins should have the server back up and running in no time flat (I believe the previous record for a ground-up rebuild of the web server was less than a day).

Finally, I'd like to bring some attention to the "in theory" that started off that last paragraph. As anyone who's ever worked on any sort of project before knows, something will always go wrong. So we ask you to bear with us while we work out the kinks and the bugs. This is going to take at least a few weeks, and things will be a little hectic during that time. We may swap in a new system only to find some bug that escaped our testing, and we'll have to switch back until we get it sorted out. We'll do our best to post here when we're getting ready to swap something out, and we'll make sure that there are always avenues open to get in touch with us to let us know about problems.

Oh, and remember that we're all volunteers. We do our best, but sometimes other commitments (school, work, family, life) can get in the way for a little while. But like all true geeks, we can't stay away for long, so rest assured that things will get fixed, emails will get answered, and the agents of truth and light will win the day.

Thanks for reading.