I just got off the phone with Phoenix mission manager Barry Goldstein, who filled me in on what's been happening with Phoenix over the last few sols. In a nutshell:
- The sol 22 anomaly is now understood; it was a two-part bug, the first part of which was fixed with a software update uplinked yesterday, so they may allow the scientists to start using flash again today (in their plans for sol 30).
- They now think they understand the problem with the TEGA doors: it is a mechanical problem, in which an assembly "was not fabricated to flight specifications." However, they still think they can get samples in, and plan to try on sol 30 or 31.
- The spacecraft went into safe mode and had no operations on sol 27, but they recovered fast and were back to normal on sol 28.
- They will probably deliver the first sample to the wet chemistry lab nextersol (sol 30).
- Analyzing a second TEGA sample is the last item on the list for minimum mission success, so that should be done by the end of next week, and they're on track for full mission success by the end of July.
When they opened the first set of TEGA doors on sol 8, one door (the one on the end) opened fully, but the other only opened partially. Then, on sol 25, they tried to open the adjacent set of doors, but the attempt was "mildly unsuccessful," Barry told me today. "The doors exhibited a very similar problem to the first set of doors. Except in the first set we had a problem with one side and not the other. On this one we identified a problem on both sides. The team spent the weekend looking at that. Yesterday we had a meeting, and we believe we found the root cause. The problem is actually related to a mechanical assembly that was not fabricated to the flight specifications, and it's causing an interference with the doors." He said that he suspected that the doors on the ends of the instrument will open fully, but the middle doors will all only open partway. "The good news is we've done tests in the lab in the configuration of the doors that opened on sol 25, and we've proved we can deliver soil to that, so we're going to take that step on sol 30 or 31." Which is, indeed, very good news, I told him, and he said, "It was problematic, and it could've been better if we didn't have the problem at all, but we're working around it."
There was a hiccup over the weekend: "Phoenix went to safe mode. We were uplinking sequences for [sol 27] on Saturday night and we basically had what we call a 'sequence collision,' where a sequence that was running in the background -- our background master sequence -- had spawned a sequence number with the identical sequence number that was being uplinked for packet deletion. Packet deletion is where we confirm to the vehicle that we've received packets with specific packet ID numbers, and then the vehicle knows it can delete them permanently. So what happened is the system basically terminated the background master sequence and that caused us to enter safe mode. We diagnosed it very quickly, we replicated the problem on the ground, we knew what the fix was, we knew how to egress out of safe mode, so within 12 hours of detection we'd recovered and were out of safe mode, but the bottom line is we lost one day's worth of operations. We never executed anything that sol." So sol 27 was the second sol of the mission during which Phoenix was entirely inactive. The first was sol 23, when they stood down because of the file system anomaly that happened on sol 22.
Here's an update on the sol 22 anomaly, which Barry told me was a result of two bugs colliding, one a known bug and one a previously undiscovered bug. I can't say I exactly understand the details here, so rather than attempt to interpret what Barry was telling me as he was evading "really crazy drivers" in Tucson, I'll just post what he said here and let you all come to your own conclusions.
"It was a problem we'd identified a while ago and we were starting to work a fix for it. It was associated with when we saved when we go to sleep at night, the way we save the packet sequence numbers in the file system and what's supposed to happen is we're supposed to mask off the lower 12 bits, and what happened was we had identified that and had started working a patch to fix this, we knew the symptom, when it happened it would generate duplicate packet sequence numbers. We knew the system could operate that way but we were worried about what would happen, all the permutations. So what happened on sol 22, we actually had one of those issues occur where we basically generated duplicate sequence numbers. It just so happened that morning when we uploaded the sequence for that morning we included those same packet deletes, we do that every morning. And we deleted just enough packets such that because of the other problem we ended up having the file system configured where there were two consecutive packets with the same ID. If we hadn't sent up that exact number of packet deletes this wouldn't have happened. When we did that, we had an unintended consequence. It normally shouldn't happen, if we had corrected the masking issue it would not have happened, but when we ended up with two packets with the same sequence number, our team went to work looking at it, we found a bug in the code that generates packets that if that happens, you end up getting into an infinite loop generating the same packet ID. So as you recall we generated over 45,000 packets with the same sequence number, so because of the first bug we generated a condition where the second bug was exposed."
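For readers who, like me, find the bit-twiddling part hard to picture, here is a minimal Python sketch of the two ideas Barry mentions: masking off the lower 12 bits of a number, and spotting two consecutive packets with the same ID. This is only an illustration and has nothing to do with the actual Phoenix flight software. I am assuming "mask off" means clearing those bits to zero, and the function names and sample values are made up. It does not try to reproduce how the masking bug led to duplicates, or the infinite loop that followed.

```python
# Illustration only: hypothetical names and values, not Phoenix flight code.

LOW_12_BITS = 0xFFF  # binary 1111 1111 1111, i.e. the lowest 12 bits


def mask_off_low_bits(value: int) -> int:
    """Clear the lowest 12 bits of value, keeping all higher bits.

    Assumption: "mask off" is read here as "set to zero".
    """
    return value & ~LOW_12_BITS


def consecutive_duplicates(packet_ids):
    """Return every ID that appears twice in a row in packet_ids."""
    return [a for a, b in zip(packet_ids, packet_ids[1:]) if a == b]


saved_counter = 0x12345                   # made-up saved sequence number
masked = mask_off_low_bits(saved_counter)
print(hex(saved_counter), "->", hex(masked))

stored_ids = [7, 8, 9, 9, 10]             # made-up packet IDs, one repeated
print("repeated IDs:", consecutive_duplicates(stored_ids))
```

The second function is the kind of check that would flag the situation Barry describes on sol 22, where two neighboring packets in the file system ended up with the same ID.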
I asked Barry to put the various issues into perspective -- how did he feel about how Phoenix was proceeding so far? He said, "If you look at surface missions in the past, they've always been fraught with operational difficulties. Because of that, when we planned this mission, we had a heuristic, a ground rule that I set in place, that said we'll have one day of margin for every two days of operations. So we have a 90-sol mission, and we believe within 60 sols we could meet full mission success. We are now getting ready to plan sol 30, and we've lost only two sols. So our average is pretty darn good. Even on the sols that we don't get a sequence uploaded, or things don't go right, we're able to quickly recover and do something. Whatever that something else is, it's all pushing toward our full mission success criteria.
"So I think we're doing phenomenally well, I really do. I have no doubt in my mind that we'll have minimum mission success by, say, early next week at the latest. The reason we haven't had that done is because we've held up on the TEGA, that's the only thing we need left to do is one more TEGA sample. And I'm fully confident that by the end of July we'll have full mission success."
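Barry's margin rule is easy to check with a little arithmetic. Here is a short Python sketch of it, using only the numbers he quoted (a 90-sol mission, one sol of margin for every two sols of operations, two sols lost so far). The function and variable names are my own, and this is just a back-of-the-envelope check, not anything the project uses.

```python
# Back-of-the-envelope check of the margin rule; names are hypothetical.

def operational_sols(mission_sols: int, ops_per_margin_sol: int = 2) -> int:
    """Sols available for operations when one sol of margin is set aside
    for every ops_per_margin_sol sols of operations."""
    return mission_sols * ops_per_margin_sol // (ops_per_margin_sol + 1)


mission_sols = 90
ops_sols = operational_sols(mission_sols)   # sols needed for full success
margin_sols = mission_sols - ops_sols       # sols held in reserve
lost_sols = 2                               # sols 23 and 27

print("operational sols:", ops_sols)
print("margin sols:", margin_sols)
print("margin remaining:", margin_sols - lost_sols)
```

With two sols lost out of a 30-sol reserve, most of the margin is still untouched about a third of the way into the primary mission.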
In fact, Phoenix is doing so well that NASA has already funded a one-month extension to the mission, taking it through the end of September. Barry hopes to do more; he said they're working on a proposal for funding that would take them through the end of November. That's when Mars enters solar conjunction, and communications with Mars spacecraft become very difficult. (The Mars Exploration Program Analysis Group calendar has the period of poor communications due to conjunction lasting from November 18 to December 24.) Barry said, "What I've told Peter [Smith, Phoenix principal investigator] is that we should do a proposal with an option, where we propose through November because we're confident we can survive until then, and the option will depend on what the energy state looks like as we get close to November."
Finally, I'll note that as of sol 28 the mission seems to have completed and downlinked all of the images for the first complete color panorama of the landing site, taken at reduced resolution, which the team is calling the "Peter Pan." None of the amateur image magicians has assembled this color pan yet, or I'd post it here; I'll keep my eyes peeled and post it when I find it. I've updated my Phoenix sol-by-sol summary to today.