It’s finally here! The MacBU has just released Service Pack 1 for Mac Office 2008. You can download the update directly from the Mactopia website (it’s large — about 180 MB), or launch your favorite Office app and select Help/Check for Updates. There are well over 1000 fixes and improvements in this release, including the return of custom error bars and axis tick manipulation in Excel charts. The full release notes are available online as well, so go check them out to see if we fixed your personal pet peeve.
I’m particularly excited to see this update go live, because it’s been the project consuming most of my time for the last several months. Shortly after we shipped released Office 2008 to manufacturing in December 2007, my manager asked me if I’d be willing to act as the development lead for the SP1 project. I agreed, and have spent many hours working on the project since we kicked it off in early January. Since our fine UA department has already written up the list of highlighted changes, I thought I’d spend a little time today describing how we approached SP1 and how we did the work.
First off, while I was the development lead for SP1, I most certainly cannot take any majority of the credit. SP1 was a MacBU-wide effort, with significant contributions from every single developer, tester, program manager, content writer, lab engineer, builder, and all the other disciplines I’m forgetting at the moment. I am incredibly thankful for all the hard work that everyone around me did.
So, why SP1, and why now? Well, it’s no great secret that we released Office 2008 later than our original plans intended. Even after that delay, there was a decent list of issues in the product that were postponed during the original release cycle that we knew we wanted to fix, so we knew that we had to plan for an update sometime in 2008. Mac Office 2004 was first released in May 2004, and its first service pack came out in October 2004, about 5 months later, and that’s pretty much the same schedule we came up with for Office 2008. In addition to fixing the issues we know about internally, we like to have a few months of real user data in hand (crash reports, newsgroup/forum postings, etc) that help us to identify problems that we didn’t know about. I’ve been spending a bunch of time in Ars Technica‘s Macintoshian Achaia forum over the last year or so, and have received a number of good bug reports from particpants there. (In fact, we have a ‘SWAT’ team of folks in the MacBU who participate in a whole variety of forums across the net.)
Anyway, the entire MacBU began development and testing work on 2008 SP1 in January. Initially, teams pretty much just took their list of issues postponed from the regular release and started fixing those bugs. Our test team once again dove into the product and found more issues that needed fixing, and we started to collect MERP logs (Microsoft Error Reporting Program — aka the dialog you see when an Office app crashes) from the wild. Devs took all those bug reports and began working on classifying them by severity and importance, and then started fixing them.
Like any release, however, there needs to be a modicum of control applied to the whole process. If everybody just keeps on checking in fixes, the product can actually end up being less stable than before. The Office codebase is large, and code changes in one part can have unintended consequences somewhere else, so we don’t let code churn run on forever. Many teams at Microsoft (and, I’d imagine, at other large software companies) have instituted ‘triage’ for bugs.
When we triage bugs, it basically means that we take a look at all the aspects of a bug before deciding whether to accept a fix it:
- How bad is the bug? (Is it a crash? corrupted file? data loss? simple redraw glitch?)
- How hard is it to encounter the bug? (every time on boot? Only when saving? On Feb 29? Only when dragging a shape across a page boundary when the page is in PubLayoutView and the shape has a semitransparent shadow and the document has never been saved?)
- Can the user work around the bug? (use the mouse instead of the keyboard?)
- How complicated is the fix? (simple typo (= instead of ==) or re-architecture of entire function?)
- Where is the fix? (in the middle of Excel’s recalc engine? Or for populating a single popup menu in one dialog?)
- How close are we to shipping? (Months, weeks, days?)
All of these factors come into play, and some combination of the magnitude of each of these leads to a decision of whether to fix a bug or not. It can get pretty complicated, and much of it comes down to a gut decision (although we try to collect hard data when we can, like ‘exactly how many MERP reports do we have of this one bug?’) In January, when we were 4+ months away from shipping SP1, pretty much any bug was allowed to be fixed. In April, when we were trying to lock down the code and minimize code churn in order to keep the product stable, we punted a known crashing bug because although the impact of the bug was large, the fix was a little complicated and the testing team didn’t think they could verify that the code change had no unintended side effects with the amount of time left before shipping.
So, what was my role in all this as development lead for SP1? I worked closely with one of our Program Managers and one of our Test Leads as the three-disciplined Sustained Engineering Lead Committee (I have no idea if we had some formal name, so that’s what I will call it here.) Scott, Pat, and I set the product schedule (kickoff, first day of triage, code-complete, zero bugs, release candidate target, etc) and acted as the final arbitrators in daily triage meetings, putting our roughly 33 (ok, now I feel old) combined years of software development experience to use. Mostly that meant that as teams brought their bugs to the triage meetings, we asked them about the above list of questions and tried to get a sense of how each bug fit in importance. We were responsible for using that information to ‘raise the bar’ each day/week — making it harder for bugs to qualify for fixing. That sounds somewhat anti-quality or antithetical to the purpose of a service pack, but actually it is a very important part of ensuring that we ship a high-quality, stable product for our users. It can be very hard to say no to a bug, particularly when people really are passionate about the specific problem or user scenario, but someone has to do it. The three of us all made a final call on each bug, and I think I tended to be the most hard-core about locking things down. Toward the end, a few people jokingly (at least, I hope they were joking) called me “Dr. No.”
As I said at the top, there are over 1000 fixes in SP1, including the re-addition of some features that were glaringly absent when compared to Office 2004. That doesn’t mean Office 2008 is now perfect — I know we haven’t fixed every bug in the product, and some folks are bound to note that we haven’t fixed some of the bugs they are most frustrated by. However, I hope you’ll agree that this update shows that I and the rest of the MacBU are committed to this product and to its future.
Update: I added a new post the other day about some installation problems people have been running into with SP1. If you are having difficulties with SP1 apps, please take a look and see if it helps you.