Mars Exploration Rovers Update: Opportunity Suffers Unwanted Computer Reboots, Hunkers Down for Reformat
Sols 3740 - 3769
Posted by A.J.S. Rayl
03-09-2014 12:19 CDT
Editor's note: As this Update was being prepared for posting, NASA announced that Opportunity had won extended mission funding to continue to explore the ancient clays on the rim of Endeavour crater. The Planetary Society congratulates the Mars Exploration Rover team for winning the opportunity for two more years of great science! --ESL
After setting the new off-Earth rover distance record in July, Opportunity roved on in August, driving south along the eastern edge of Endeavour Crater's western rim to Wdowiak Ridge on its journey to the next big destination, Marathon Valley. But as the Mars Exploration Rover (MER) began trying to conduct scientific investigations around Wdowiak Ridge, it suffered so many unplanned computer reboots that the team ordered the rover to hunker down and prepare for the process of reformatting its flash memory.
"We can't drive," said John Callas, MER project manager, of the Jet Propulsion Laboratory (JPL), the Earth home to all the American Mars rovers. "Because we're resetting so frequently, one can't conduct a meaningful science mission. That's why we're moving ahead with taking action and reformatting."
On August 29th, just as the long four-day Labor Day holiday weekend began, NASA/JPL announced the planned reformatting in a press release.
The spontaneous and undesired reboots, also known as resets, have been an issue involving the rover's flash memory drive that the MER team has been dealing with for a few years, but only very intermittently. The resets would occur three times and then operations would return to normal for a year or so. Still, a tiger team had been assembled some time ago to explore all possible causes and the most efficient way of handling the reformatting on the veteran rover.
It was a forward-thinking move because the reboots increased substantially – to at least a dozen – in August. Worse, they began to occur one right after another. "We started seeing warm resets happening on almost every wake-up we used sequence control, and we began seeing, for the first time, back-to-back resets," said Bill Nelson, chief of MER engineering at JPL. "The CPU would crash, automatically reboot itself, crash again, and reboot itself once more, and then come up successfully the second time."
That's when the MER team had to begin drawing the proverbial line in the Martian sand. After commanding the robot field geologist on August 22nd to put all science activities requiring instruments on the instrument deployment device (IDD) or robotic arm on hold, four sols later, the team commanded their charge to park in place, effectively initiating a science stand down. The robot was to record only atmospheric opacity, or "Tau" measurements as the MER team calls them, to flash. With no master sequence running, the rover was basically to await preparation commands for reformatting.
"We did not abandon the rover to her own devices," said Nelson. "We have instead uplinked sequences for every Sol – just very tiny ones on/after Sol 3764/ August 26th – hoping to keep Opportunity under sequence control. We haven't always been lucky and after each warm reset the rover reboots in automode."
Since flash memory retains science and engineering data even when Opportunity powers down, the team could do without it but doesn't want to. It is the same type of memory used for storing data on a thumb drive, or photos and songs on smart phones or digital cameras. Individual cells within a flash memory sector can become corrupted or wear out from repeated use. Reformatting clears all memory from the drive, identifies bad cells, and flags them to be avoided.
Just like reformatting your computer, it is something of a dreaded and ominous task, because everything in the flash memory is lost and what is needed must be reinstalled during the process. However, once the process is done, everything, theoretically, works just fine again. This, however, will be the first time Opportunity will undergo a flash-memory reformat.
The MER software engineers at JPL have been down this road before, sort of. About five years ago, they had to reformat Spirit's flash memory. The process went exceedingly well. But, as Callas pointed out: "Opportunity's symptoms are a little bit different than what we saw with Spirit, and we have to be cautious that we don't get our expectations wrong."
Spirit basically suffered from bouts of "amnesia events," where it failed to mount the flash memory when it woke up. Thus, all data that should have been stored was lost. Although Opportunity has had a small number of amnesia events, this robot field geologist has been primarily suffering sudden reboots when it tries to write to or save something in flash.
"The amnesia events that we have seen on Opportunity have all taken place in a wake-up, at roughly 23:18 or 11:18 pm local solar time, when we wake up to turn off a late APXS integration and prepare for DeepSleep," said Nelson. "During those wake-ups, the rover created a file system in RAM after it failed to create one in flash. Since there is no downlink during that late evening event, when the rover shuts down to go into DeepSleep, everything that was filed in RAM during that time period is lost because RAM is volatile memory. All the log files and other information that we would normally expect to see just goes – poof. Hence the term amnesia event."
But in Opportunity's case, since those events have all occurred during that wake-up timeframe following an APXS integration or chemical analysis, and because the APXS also conveniently records the data, the robot was able to get the data stored on the APXS the next sol. Rarely was any data lost. The constant rebooting, well – that's another story.
In spite of the double trouble Opportunity has experienced with its flash, the thinking for a long time now has been that reformatting flash will "cure" both the rover's amnesia and reboot events. "We suspect that it's the same root cause for both kinds of events – that the flash memory is wearing out – and that a flash reformat will correct the problem," confirmed Callas. "But we won't know for sure until we do it."
Since being stopped in her tracks near Wdowiak Ridge, not far from the crater called Ulysses that carves out a good chunk of the ridge's southern end, Opportunity hasn't been completely idle. The robot field geologist has been taking care of necessary operations, like keeping her eyes on the skies by taking routine Taus and working with her ground crew to ready herself for the flash reformatting.
The rover was slated to spend the American Labor Day holiday weekend downloading as much as possible, if not all, of the science and engineering data still stored in her flash memory. Meanwhile, she has followed commands to switch to an operating mode that does not use flash memory, and to switch to a slower data rate during communication sessions, to add a little "resilience" in case of a reset during these preparations.
Other than the flash issue, Opportunity appears to be in fine health and really hasn't lost much time. Although the winds of Martian spring are beginning to blow, sending more of the planet's ubiquitous rusty red dust into the skies, and the Tau has been slowly fluctuating upward to hazier levels, the robot field geologist was still producing upwards of 675 robust watt-hours of power as August began to fade.
The MER team wants to get to its first prime Marathon target, The Spirit of St. Louis Crater, located just before or north of the valley, by early 2015. That schedule will give the scientists 100 sols or more to check the area out before likely having to park on a north-facing slope for the next Martian winter, which should begin to put the freeze on in October 2015. The winter solstice in Mars' southern hemisphere, where the veteran rover has been exploring, will fall on January 3, 2016.
Opportunity is currently taking the "preferred route" to the Spirit of St. Louis Crater, as Arvidson described it. "Using all the tools that we’ve developed for planning Curiosity’s routes, Paolo Bellutta [rover planner of JPL] and I did a very careful calculation of how to get most expeditiously to Marathon Valley and right now the rover is way ahead of schedule," he said.
As September sets in, the MER team is fully focused on reformatting. "The activity under way right now is to try to keep the rover under Earth control and decrease the amount of data in flash, because the big activity is going to be reformatting flash as quickly as possible," said Arvidson.
The team will present its plan to a review board, scheduled for Wednesday, September 3rd, at JPL. If the plan passes the board's review – general consensus is it will – the reformatting could begin as early as the next day, September 4th, confirmed Callas.
The process is expected to only take one sol. The team should know right away if reformatting has resolved Opportunity's problems, said Callas. Not really surprising considering the rapid and "curious" increase of reboots and sudden back-to-back events in August that motivated the team to take prompt action even before the month was over.
While the flash reformatting is considered "a low-risk process," because critical sequences and flight software are stored elsewhere in other non-volatile memory on the rover, and although "no one is panicking," said Arvidson, there is some concern about the robot.
For starters, Opportunity has been roving Mars, should anyone need reminding, for 10 and a half Earth years. She's an old girl by all accounts. Add to that the reality that it is currently about 200 million kilometers (about 124 million miles) from JPL, and the fact that finding anyone at SanDisk, the company that NASA-JPL licensed to provide the MERs' flash memory drives, who knows exactly how this old software works has been challenging to say the least.
"The procedure we're doing with Opportunity is straightforward and well defined and we've done it before with Spirit," said Callas. "At the same time, Opportunity is an old rover and there could be some complexities or complications. There is naturally a concern for that."
One complication could be that the reformatting doesn't correct Opportunity's flash-induced reboots and amnesia events, and that is a possibility. "If the problems are intermittent, it’s possible that when you do the reformat, you test a particular sector, and on that particular test, it’s good. So you leave it in the memory pool, but it’s an intermittent sector and later on it fails," explained Nelson. "That leaves you with this failure in your memory even though you have reformatted."
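Nelson's point about intermittent sectors can be illustrated with a toy model. The Python sketch below is purely hypothetical: the `Sector` class, the `fail_every` knob, and the single-pass `reformat` function are invented for illustration and say nothing about how the rover's flight software actually tests flash.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sector:
    """Toy sector. fail_every is a hypothetical knob; None means healthy."""
    name: str
    fail_every: Optional[int] = None
    accesses: int = 0

    def access_ok(self) -> bool:
        # Count the access, then fail on every Nth one if the sector is flaky.
        self.accesses += 1
        if self.fail_every is None:
            return True
        return self.accesses % self.fail_every != 0


def reformat(sectors):
    """Test each sector once and keep the ones that pass in the memory pool."""
    return [s for s in sectors if s.access_ok()]


sectors = [
    Sector("healthy"),
    Sector("dead", fail_every=1),          # fails on every access
    Sector("intermittent", fail_every=3),  # fails on every third access
]
pool = reformat(sectors)
print([s.name for s in pool])  # ['healthy', 'intermittent']

# Later use: the intermittent sector passes once more, then fails.
flaky = pool[1]
second = flaky.access_ok()
third = flaky.access_ok()
print(second, third)  # True False
```

In this toy run the dead sector is caught and removed, but the intermittent one passes its single test, stays in the pool, and fails afterward, which is the scenario Nelson describes.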
Of course, the MER flash tiger team and the ops team have considered that and know enough to have determined the easiest way to find out if the reformat will work is to do the reformat. If it doesn't work, they already have back-up plans, other diagnostics in the queue that the engineers will use "to help illuminate what's going on," Callas said at month's end.
Despite the science stand down, Opportunity did complete all her science assignments during the first half of August. "We did some touch 'n go's on targets of opportunity on the Bench, then approached the northern side of Wdowiak Ridge, where we stopped and made some measurements, then tried to do a mosaic, but the reboots kinda put the kibosh on that," said Arvidson. "So we drove on, to the eastern side of Wdowiak Ridge to get nearer to a crater we call Ulysses, the next waypoint. Right now we’re sitting on the eastern flank of Wdowiak Ridge," he added, summing up.
The MER scientists are anxious to get to Marathon Valley, because data from the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM), onboard the Mars Reconnaissance Orbiter (MRO), which looks for mineralogical evidence for water, indicate an abundant clayground with various kinds of clay minerals, thus more evidence of past water, just waiting for Opportunity's arrival. The buzz now among the MER scientists, Arvidson said, is about finding more Matijevic Formation, the most ancient Noachian rocks and outcrop, which the rover first set her eyes on in a flat, light-toned rock called Whitewater Lake in August 2012, on Matijevic Hill at Cape York, and hasn't seen since.
The engineers also have their eyes on another "prize." By the time Opportunity reaches the Spirit of St. Louis Crater, the rover that loves to rove will have effectively completed the first marathon on another planet.
The rover made "good progress" toward that marathon achievement in August, logging 410 meters (about 0.25 mile) on its journey to Marathon Valley, said Nelson. All told, from her departure point at the southern end of Murray Ridge, Opportunity had, by mid-August, put half of the 2-kilometer (1.24 mile) trip in her rear view mirror.
"We've been hustling," said Arvidson. "But the terrain is going to get rougher as soon as we get back into operations."
For the last nine months, Opportunity has been driving southward, following the western rim of Endeavour Crater along Murray Ridge, stopping for winter in Cook Haven, and continuing along the ridge to the clayground nearer to the ridge's southern end, which lit up in CRISM data with montmorillonites, a soft phyllosilicate group of minerals that typically emerge in microscopic crystals and form into clay. Once the robot field geologist had finished characterizing the montmorillonite area, the team directed the rover to her next big destination, Marathon Valley, a 2-kilometer (1.24 mile) drive from that site.
As August dawned at Endeavour Crater, Opportunity was at an outcrop of opportunity called Cape Fairweather, located on or near the Bench. On Sol 3741 (August 2, 2014), the robot field geologist began two sols of in-situ science using the instruments on her arm or instrument deployment device (IDD). During the first sol, she collected the necessary pictures with her Microscopic Imager (MI) for a mosaic of the target Fairweather, and then, as is routine, placed her chemical sniffing Alpha Particle X-ray Spectrometer (APXS) for a multi-hour integration. The next sol, the rover repeated the observations on a second, offset target on Cape Fairweather.
"The [Cape Fairweather] measurements are consistent with Burns Formation materials," said Arvidson.
Burns Formation is the terrain that is, for the most part, the bedrock "ground" of Meridiani Planum, lying just beneath the "modern" windblown ripples and sand that blanket the region. It's the terrain that Opportunity spent the first seven and a half years of her mission roving over and was named after Burns Cliff, which was named for MIT mineralogist Roger Burns.
The rover studied the stratigraphy of Burns Cliff up-close while at Endurance Crater during the latter half of 2004, and found that its layers of sediment point to depositional environments that include a combination of aeolian and shallow water conditions, such as those commonly found in aeolian dune, sand sheet, and interdune environments. The MER team later followed the layers in the cliff south of Endurance Crater and subsequently identified it as Burns Formation.
On Sol 3744 (August 5, 2014) Opportunity continued south for 86 meters (282.15 feet). The rover snapped the usual Navigation Camera (Navcam) and Panoramic Camera (Pancam) images after the drive for the MER team members to plan and/or confirm the rover's next drive on the journey to Marathon Valley.
The robot field geologist began her Sol 3746 (August 7, 2014) by imaging a transit of the moon Phobos. Then, she took a 72-meter (236.22-foot) jaunt south, driving toward a really intriguing rocky ridge that the team named in honor of Tom Wdowiak, a much beloved member of the original MER science team, who passed away April 27, 2013.
A professor of astronomy and astrophysics at the University of Alabama at Birmingham, Wdowiak was a payload uplink lead and payload downlink lead of the Mössbauer spectrometers (supplied by Germany) for the MER Science Operations Working Group (SOWG). He brought decades of experience to the table, having studied meteorites, thermal springs on land, thermal vents in the deep sea, and global mass extinction events caused by asteroids or comets striking the Earth, all with Mössbauer spectroscopy.
"Tom was a really passionate, really enthusiastic, and a much-loved member of our team," said Squyres then. "When you lose someone like him, it hits pretty hard."
Although it had taken many months, the team finally found the right geologic formation to be his namesake. Interestingly, that same evening, after the drive toward Wdowiak Ridge began, Opportunity experienced a flash-induced reset that stopped all sequences. Was Wdowiak out there somewhere, as the musing went, and so taken with the honor that he just wanted a little more time to look on the ridge from afar and take in the view—or was it just a coincidence and a hint of reboots to come?
The very next sol, Opportunity received a real-time activate command to restore sequence control and execute the next plan, a 2-sol touch 'n go. After getting a good night's DeepSleep, she woke up and collected an MI mosaic of a soil target of opportunity dubbed Icy Straight, and then put her APXS on the same spot for a multi-hour integration for the "touch" on Sol 3748 (August 9, 2014). "The soil target, a disturbed area in a wheel track, looks compositionally like windblown basaltic sands," said Arvidson.
On Sol 3749 (August 10, 2014), Opportunity took off, driving 100 meters (about 328 feet) for the "go," stopping mid-drive, as has become routine, to take some images. The rover continued driving for the next two sols, heading south toward the northern edge of Wdowiak Ridge, putting 56 meters and then 33 meters (183 feet and 108 feet), respectively, on her rocker bogie on Sols 3750 and 3751 (August 11 and 12, 2014).
As Opportunity was "bustling south in August along the Bench" – the flat, apron-like area that surrounds Endeavour's rim segments that rise up from the terrain – it stopped mostly at targets of opportunity, said Arvidson, and found no Martian surprises.
Since the rover's ascent onto Murray Ridge in November 2013, the MER scientists have pretty much "totally" found and confirmed what they expected to find along the way based on orbital images and the ground data, he said. "We did measurements on what we thought was Grasberg outcrop and measurements on what we thought was Burns Formation outcrop," he said. "Grasberg is Grasberg. Burns is Burns. We found nothing out of the ordinary."
Opportunity's only truly surprising discoveries since arriving at Murray Ridge remain the jelly doughnut rock called Pinnacle Island, which the robot found serendipitously just in time for her 10th anniversary in January of this year, and the subsequent finding of Pinnacle's companion rock, Stuart Island.
That serendipity is leading to another science paper. "There’s a nice story developing on vein-filling fractures and changes in the fluid conditions over time from those discoveries," said Arvidson. But he wouldn't say much more. "It's still developing," he said.
Remember, however, that Pinnacle Island and Stuart Island were both found to have a very high manganese content. Interestingly, Curiosity has also seen manganese-rich veins on the other side of the planet. Even more interesting is that on Earth manganese oxides are places where microbes evolve and thrive. "We’re not ready to talk about it," said Arvidson. Stay tuned.
On Sol 3752 (August 13, 2014), Opportunity bumped a meter (about 3 feet) to a rock target nicknamed Mt. Edgecumbe. The next sol, she collected an MI mosaic of the rock, and then placed the APXS on the same target for a multi-sol integration. That chemical analysis, however, was cut short to a single sol when the robot suffered another reset on Sol 3754 (August 15, 2014).
During the third week of August, Opportunity's flash-induced reboots began to increase dramatically. As always, the reboots stopped the rover's onboard master sequence and put her in a kind of safe mode. She was and remains otherwise healthy, noted Nelson.
The MER team's vigilance and timely actions effectively reduced the impact of the reboots on the rover's science and exploration. When another reboot happened on Sol 3757 (August 18, 2014), for example, real-time action from mission controllers reactivated the rover's sequence and Opportunity was able to complete the 48-meter (about 157-foot) drive planned for that sol.
Even so, "it got worse and worse," said Arvidson. The undesired reboots kept on happening and in back-to-back sequences, compelling the MER software engineers to expedite corrective action. So when yet another reset happened on Sol 3758 (August 19, 2014), the team began having Opportunity suspend her science assignments until the flash-induced reboot problem was corrected.
Meanwhile, members of the flash tiger team that has been investigating the reboots for more than a year began putting their finishing touches on a plan for reformatting. Opportunity's reboots had long been thought to be the result of corrupted cells or sectors in the memory, something many of us have experienced at one time or another on our computer, camera, or smart phone.
As the investigations have continued, it appears they have winnowed down the cause. "Worn-out cells in the flash memory are the leading suspect in causing these resets," said Callas.
"That means these cells are no longer holding data like they're supposed to," said Nelson, at the risk of stating the obvious. "And because of the increased frequency and also the increased severity – the fact that the rover is experiencing resets back-to-back, we decided that it was time to do a flash-reformat," he added.
The MER team has good reason to believe reformatting will fix Opportunity's reset and amnesia events. "All the [events] for which we have information as to where the problem has occurred in flash, it has always occurred in one bank of flash memory," said Callas.
"We’ve seen all these errors only in Bank 7 of the flash memory," Nelson elaborated. "We have 8 banks in the flash memory, numbered 0 through 7. Since Bank 0 is used for flight software, it is not part of the reformat," he pointed out. "Obviously you wouldn’t want to reformat the place where you’re holding your operating system. But the other 7 banks, which are devoted to storing files and are used as a file system, will be reformatted. We’re going to try to reformat, doing nothing special, and hope that it takes out a few, maybe 4 or 5, sectors from Bank 7 and declares them bad. Hopefully not much more than that. But we’ll see," said Nelson.
Each bank, for the record, is composed of four 'packages,' with each package holding 32 sectors of 64 kilobytes, Nelson said. A bank, therefore, has 128 sectors. [In this context, a 'sector' is not the same thing as a sector on a disk, including a flash disk, he noted.]
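For readers who want to check the arithmetic, the layout Nelson describes can be tallied in a few lines of Python. This is only a back-of-the-envelope sketch: the constant names are made up for illustration, and it assumes a kilobyte here means 1,024 bytes, which the article does not specify.

```python
# Flash layout as described by Nelson; the names are illustrative only.
BANKS_TOTAL = 8            # banks numbered 0 through 7
BANKS_FLIGHT_SOFTWARE = 1  # Bank 0 holds flight software and is not reformatted
PACKAGES_PER_BANK = 4
SECTORS_PER_PACKAGE = 32
SECTOR_BYTES = 64 * 1024   # assumption: 1 kilobyte = 1,024 bytes

sectors_per_bank = PACKAGES_PER_BANK * SECTORS_PER_PACKAGE
bank_bytes = sectors_per_bank * SECTOR_BYTES
file_banks = BANKS_TOTAL - BANKS_FLIGHT_SOFTWARE
file_system_sectors = file_banks * sectors_per_bank
file_system_bytes = file_banks * bank_bytes

print(sectors_per_bank)                     # 128
print(bank_bytes // (1024 * 1024))          # 8  (megabytes per bank)
print(file_system_sectors)                  # 896
print(file_system_bytes // (1024 * 1024))   # 56 (megabytes in the file system)
```

Under that assumption, the seven banks devoted to the file system would add up to 896 sectors, or roughly 56 megabytes.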
The size of the rover's flash file memory capacity will be reduced by the number of bad sectors that the reformatting finds. "We have a few cases where more than one location in a sector is flagged as bad, but most errors are in unique sectors," Nelson pointed out. "During a reformat, an entire sector will be marked bad even if only a single location within it is bad. That's simply the granularity of the system."
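The granularity Nelson mentions can be sketched as follows. The function name and the byte offsets are hypothetical, chosen only to show that two bad locations in the same sector cost one sector, not two; the 64-kilobyte sector size comes from the article, with a kilobyte assumed to be 1,024 bytes.

```python
SECTOR_BYTES = 64 * 1024  # assumption: 1 kilobyte = 1,024 bytes


def sectors_to_retire(bad_offsets, sector_bytes=SECTOR_BYTES):
    """Return the sorted sector indices containing at least one bad location."""
    return sorted({offset // sector_bytes for offset in bad_offsets})


# Hypothetical byte offsets of bad locations within a single bank.
bad_offsets = [70_000, 100_000, 300_000, 8_000_000]
retired = sectors_to_retire(bad_offsets)
lost_bytes = len(retired) * SECTOR_BYTES

print(retired)     # [1, 4, 122]
print(lost_bytes)  # 196608
```

Four bad locations retire only three sectors here, because the first two fall in the same sector, yet each retired sector removes a full 64 kilobytes from the pool.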
During the reformatting of Spirit's flash, the team lost about 7% memory, because it was declared bad, remembered Nelson. "We don’t know what’s going to happen this time. We strongly suspect, but we do not know, that this reformat will fix our problems," he said. "We just don't know yet how many sectors are bad."
If this reformat effort fails to fix Opportunity's resets and amnesia events, then the team's software engineers could get rid of Bank 7 altogether, but that would mean the rover would lose a greater amount of memory than the team would like. "There is a way we can program the flight software to believe it only has 6 banks of memory, effectively disabling that 7th bank," said Nelson. "The bank will still exist, but the flight software will ignore it and will only use the remaining 6 banks. But we lose about 14% or 15% of our memory that way," he explained.
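Nelson's 14% to 15% figure is consistent with losing one of seven equally sized file-system banks. The short Python sketch below compares that with retiring a handful of sectors; it assumes all seven banks are the same size (128 sectors each, per the layout described above), and the function names are invented for illustration.

```python
FILE_BANKS = 7          # Banks 1 through 7 hold the file system
SECTORS_PER_BANK = 128  # four packages of 32 sectors each


def fraction_lost_by_dropping_banks(banks_dropped, file_banks=FILE_BANKS):
    """Share of file-system memory lost if whole banks are ignored."""
    return banks_dropped / file_banks


def fraction_lost_by_retiring_sectors(
    sectors_retired, file_banks=FILE_BANKS, sectors_per_bank=SECTORS_PER_BANK
):
    """Share of file-system memory lost if individual sectors are marked bad."""
    return sectors_retired / (file_banks * sectors_per_bank)


drop_bank_7 = fraction_lost_by_dropping_banks(1)
retire_five = fraction_lost_by_retiring_sectors(5)

print(f"{drop_bank_7:.1%}")  # 14.3%
print(f"{retire_five:.2%}")  # 0.56%
```

By this rough measure, retiring five sectors would cost well under 1% of the file system, while ignoring Bank 7 entirely would cost about 14%, which is why the team would prefer the ordinary reformat to work.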
The MER engineers, as of post time, hadn't fully tested taking out Bank 7, and they are hoping they won't have to lose the entire bank, because that’s a relatively large hit. "But if the format that we’re doing now doesn’t work," Nelson said, "that is the next step."
During the last week of August, Opportunity hung tight, though probably was bored out of her mind. Most of the action was taking place on Earth.
On August 24th, Opportunity's Sol 3762, MER engineers activated a new communication table on the rover to ensure predictable communication for the next several weeks. Because of the complications caused by frequent resets hitting during high-gain antenna passes and triggering subsequent X-band faults, the team then sent a special sequence by real-time command on August 26, 2014, which converted the next several X-band passes to the low-gain antenna.
In addition, the MER ops team has sequenced a checksum test of the lower portion of flash to get some data on the physical health of the flash memory chips in general.
The next step in the plan is to boot Opportunity into a mode that does not use the flash file system. "That is simply forcing the rover to use RAM instead of flash memory," explained Nelson. "This is similar to the amnesia events, except the switch to RAM is commanded rather than autonomous, and the rover stays awake until it can downlink the RAM contents." The point, of course, is to enable the team to confirm the health of the rover independent of flash.
During the Labor Day weekend, Opportunity is slated to be working on the return of all the remaining science data in her flash drive that she can. "When you reformat, you lose everything stored in flash, and we're hoping over the holiday weekend to clean out the rest of the science data products that we want to get down," Callas confirmed.
As August wound down, Opportunity, with 40.69 kilometers (25.28 miles) on her odometer, was thermally stable, power positive and producing upwards of 675 watt-hours, and communicating well both over X-band with the Deep Space Network and via UHF relay with Mars Odyssey and MRO.
In a perfect world and if all goes as planned, the MER engineers will reformat the robot field geologist's flash memory on September 4th and Opportunity will be back on the road shortly thereafter.
The good news is that they believe the reformatting will solve Oppy's issues and that they will learn "pretty much right away," Callas said, if it worked. "You may have experienced this with the flash in your camera: the flash memory chip gets corrupted and you have to reformat, and once you do everything's fine after that. It could be as simple as that," he said. "That's essentially what we saw with Spirit."
While the MER team is optimistically confident, they are also cautious, keenly aware that "anything could happen and go wrong," as Callas put it.
Still, optimism is what "drove" the MER team and their rover to this point. If ever there was a team that believed in its spacecraft, it's this one. Besides that, there's just too much to see and do and discover at Marathon Valley.
"It’s a pretty shallow valley that doesn’t cut very deeply into the rim, but begins on the western side and extends into the crater proper, and what it exposes is different material," said Arvidson. "We can see from the HiRISE imaging data that there are rocks on the walls on either side of the valley, striking east/west. Then, the bottom of the valley is cracked and crinkled and bright, presenting a different stratigraphic horizon. Once we get there, we’ll look at the floor, which expose the oldest rocks, and then check out the walls and look at the stratigraphy," he said. "The big question on everybody’s mind is whether or not the valley floor is the equivalent of the Matijevic Formation."
This Marathon clayground, actually, was a major selling point for the MER's application for another mission extension. Although there was no word in August on whether the mission will be approved so Opportunity can rove on, word behind the scenes is that an announcement should come in September.
Undaunted, the MER team looks to a bright future. The first hurdle is getting through the reformatting. "We have to do a full-up test, an Operational Readiness Test or ORT, on the test bed rover, where we will run all the sequences and do all the other things that we expect to do as part of the reformat on Opportunity's flash on Mars," said Nelson. "We show ourselves in this test that everything will work as expected before we do anything on the spacecraft."
"There’s not a lot of anxiety," said Arvidson. "We know what’s going on and we’re going to try this. We have a pretty good indication based on what happened with Spirit that we can fix it. If not, the engineers already have several other things that we can pursue."
Once the reformatting is complete, and provided it is successful, the plan roving forward is for Opportunity to continue on a "beeline route" to Marathon Valley, said Arvidson. At this point, Opportunity is "way ahead of schedule," he said.
"To maintain schedule, we should be pulling up to the Spirit of St. Louis Crater sometime early in 2015," Arvidson continued. "We want to have a buffer of at least 100 sols or more so we can explore the valley before we have to settle in for winter on a north-facing slope, if needed, based on energy considerations."
If the reformat works as anticipated and Opportunity keeps up the pace – unless she finds something stunningly Martian and unusual – the rover will probably arrive there well before the winter holidays grip consumers on Earth.