Pulses, Triplets, and Gaussians: Rescoring the Reobservations
It has been more than a year since the SETI@home crew spent a hectic week at Arecibo, pointing the giant radio telescope at some of SETI's most promising targets. Much of the data collected during the reobservations has since been repackaged as work units, and sent out to users around the world for analysis. By late summer 2003 the processed work units had been returned to SETI@home headquarters in Berkeley, where Chief Scientist Dan Werthimer’s team has been working hard to sort it all out and figure out what it all means. Most of the results are now in, but project scientist Eric Korpela is still refining the algorithm used to score the candidates and determine how likely they are to be extraterrestrial signals.
Gaussians are the power curves produced when the Arecibo beam scans a steady celestial radio source. The signal is weak at first, strong when it is at the center of the beam, and then fades again. This produces a bell-shaped power curve known as a gaussian (see below).
Pulses represent any celestial radio signal of a fixed frequency that is distinguishable above the background noise.
Triplets are sets of three equally spaced pulses. Whereas gaussians represent a constant signal from space, triplets may represent a series of pulses transmitted at fixed time intervals.
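The three definitions above can be made concrete with a small sketch. This is an illustration only, not the SETI@home client code: the function names, the sample numbers, and the simple equal-spacing test are all assumptions chosen for the example.

```python
import math

def gaussian_power(t, peak_power, t_center, width):
    """Power at time t as a steady source drifts through the beam.

    Weak at first, strongest at t_center, then fading again.
    All parameters are hypothetical illustration values.
    """
    return peak_power * math.exp(-((t - t_center) ** 2) / (2.0 * width ** 2))

def is_triplet(pulse_times, tolerance=1e-6):
    """True if exactly three pulse times are equally spaced (within tolerance)."""
    if len(pulse_times) != 3:
        return False
    t0, t1, t2 = sorted(pulse_times)
    return abs((t1 - t0) - (t2 - t1)) <= tolerance

# A bell-shaped profile sampled at 13 evenly spaced times, peaking at t = 6.
profile = [gaussian_power(t, peak_power=5.0, t_center=6.0, width=2.0)
           for t in range(13)]
```

The profile rises to its maximum at the center sample and falls off symmetrically on either side, which is the shape the article calls a gaussian.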
Why Rescoring is not Easy
During the reobservations Werthimer’s team pointed the Arecibo radio telescope at 226 different locations in the sky. Most of these were SETI@home candidates, selected from among the billions of points in the sky where the SETI@home program detected something unusual. Only points where a detection had been made more than once qualified as candidates to be revisited during the reobservations. A single detection, however clear and strong it was, could not be included. This criterion, added to the three types of signals detected by the SETI@home program, created four different types of candidates for the reobservation sessions: gaussian candidates, composed of two or more detections of gaussians at the same location in the sky; triplet candidates, composed of two or more detections of triplets at the same location; pulse candidates, composed of two or more detections of pulses at the same location; and “metacandidates,” which are composed of two or more different types of detections (e.g., a gaussian and a triplet) at the same location.
Korpela’s task was further complicated by the fact that these four types were not the only distinctions made among the candidates. SETI scientists have always known that a signal emanating from the stars will most likely not be received at the same frequency at which it was transmitted. This is because of the Doppler effect, which would cause the frequency to vary depending on whether the planet (or other body) from which the transmission originates is moving towards or away from the Earth, and at what speed. Since the speed at which the aliens’ home planet is traveling relative to the Earth is likely to vary constantly as both planets orbit their stars, it is almost certain that the frequency of the received transmission will drift either upwards or downwards. The SETI@home program installed on users’ computers is therefore specifically designed to look for signals with drifting frequencies.
Some SETI scientists, however, have argued that the highly advanced aliens who are likely to build an interstellar radio beacon would most likely compensate for the motion of their planet, and vary their signal’s frequency accordingly. If SETI scientists on Earth simultaneously compensated for the Earth’s motions, then the signal would be received at a fixed and steady frequency, the same one at which it was transmitted. This correction involves calculating the reception frequency as if the signal were received at the center of gravity (or “barycenter”) of our solar system, and is therefore called the “barycentric frequency.”
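To get a feel for the size of the effect, here is a rough sketch of the textbook non-relativistic Doppler formulas. It is not the correction SETI@home actually applies, which must account for the full motion of the Earth relative to the barycenter. The 1.42 GHz frequency and the 0.034 meters per second squared acceleration (roughly the centripetal acceleration at the Earth's equator due to rotation) are illustrative values, and the function names are made up for this example.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def observed_frequency(f_emit_hz, radial_velocity_ms):
    """Non-relativistic Doppler shift.

    A positive radial velocity means source and receiver are moving apart,
    which lowers the received frequency.
    """
    return f_emit_hz * (1.0 - radial_velocity_ms / SPEED_OF_LIGHT)

def drift_rate(f_emit_hz, radial_accel_ms2):
    """Rate of change of the received frequency (Hz per second) when the
    line-of-sight velocity changes at radial_accel_ms2."""
    return -f_emit_hz * radial_accel_ms2 / SPEED_OF_LIGHT

# Illustrative numbers only: a 1.42 GHz signal and an acceleration
# comparable to that of a point on the Earth's equator.
example_drift = drift_rate(1.42e9, 0.034)  # roughly -0.16 Hz per second
```

Even this small acceleration moves the signal by a noticeable fraction of a hertz every second, which is why an uncorrected signal is expected to wander across a range of frequencies rather than sit still.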
Unfortunately it is very hard to guess what aliens would do, and whether they would oblige us with a barycentric correction or not. The SETI@home team therefore decided to proceed on two simultaneous tracks. Accordingly, each candidate signal is evaluated twice, once with a barycentric correction and once without a correction. In the first instance candidates are only considered if they are steady, i.e. they remain within a very narrow frequency band with hardly any drift. In the second instance candidates are allowed to drift over a much wider frequency band, as would be expected of a non-barycentrically corrected alien transmission.
The end result is eight different types of candidate signals that were targeted during the reobservations. There are four established categories (multiple gaussians, multiple triplets, multiple pulses, and metacandidates), and each of the four is scored twice: once with a barycentric correction and once without. For each of these eight types Korpela and his colleagues must determine whether the reobservations confirmed or failed to confirm the presence of a signal from that location. In the first case the candidate’s score would rise; in the second case the score would fall.
Predictions vs. Reality
When analyzing the data, the SETI@home team fully expected that the score of the vast majority of the candidate signals would drop as a result of the reobservations. After all, even the most optimistic SETI enthusiast would admit that only a very few of the 226 observed targets would prove to be a true alien transmission. The rest of the signals, and quite possibly all of them, would prove to be the result of random noise or radio frequency interference. If such noise had been detected at a particular point in the sky in the past, even more than once, there is no reason to suppose a signal would be found there when the radio telescope points in that direction once again. It is far more likely that this point in the sky would prove completely unremarkable, and the candidate’s SETI score would therefore drop. Based on a statistical analysis, Korpela estimated that there was only a 10% chance that even one of the 226 signals would have its score raised by the reobservations. Unless, of course, one of the candidates proves to be the “real thing”.
When the results of the analysis started coming in, it turned out that the initial estimate was almost entirely correct: The scores of all the triplet and pulse candidates, both barycentric and non-barycentric, had indeed gone down. The same was true for the barycentrically corrected gaussian candidates, where only one suspected signal saw its score rise. The startling exception was the group of non-barycentrically corrected gaussians, where 38 of the 42 candidates saw their score rise rather than fall. What was going on?
The problem of answering this question fell to Eric Korpela, and after some hard work and close analysis he had the answer. It turns out that the main criterion for evaluating all non-gaussian signals is their strength – how much they rise above the background noise. This is especially true for pulse candidates, for how would one evaluate whether a pulse is a signal or nothing at all if not by its strength? It is also true for triplets, where the strength of the spikes is a crucial factor in evaluating whether they are potential signals.
Gaussians, however, are the exception, because they are evaluated according to their shape rather than according to their strength. The closer a candidate fits the mold of a perfect gaussian – i.e., the shape of a steady signal emanating from outer space – the higher it will score. SETI@home uses the “chi-square” method to determine how closely a candidate’s pattern fits the ideal gaussian mold, and scores it accordingly.
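A minimal sketch of a chi-square goodness-of-fit measure shows what scoring by shape means. It is a generic reduced chi-square, not SETI@home's actual scoring formula, and it does not perform the fit itself: the gaussian parameters are simply handed in. The function names, the noise level, and the sample data are all assumptions made for the illustration. A value near zero means the data hug the gaussian mold closely.

```python
import math

def gaussian_model(t, peak, center, width):
    """Ideal bell-shaped power curve."""
    return peak * math.exp(-((t - center) ** 2) / (2.0 * width ** 2))

def reduced_chi_square(times, powers, peak, center, width, noise_sigma):
    """Sum of squared residuals in units of the noise, divided by the
    degrees of freedom (data points minus the three gaussian parameters)."""
    dof = len(times) - 3
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    chi2 = sum(((p - gaussian_model(t, peak, center, width)) / noise_sigma) ** 2
               for t, p in zip(times, powers))
    return chi2 / dof

times = list(range(13))
strong = [gaussian_model(t, 50.0, 6.0, 2.0) for t in times]  # strong, clean shape
weak = [gaussian_model(t, 0.5, 6.0, 2.0) for t in times]     # faint, clean shape
# The faint curve with an alternating +/-0.3 wobble added to spoil its shape.
ragged = [p + (0.3 if i % 2 else -0.3) for i, p in enumerate(weak)]

strong_fit = reduced_chi_square(times, strong, 50.0, 6.0, 2.0, noise_sigma=1.0)
weak_fit = reduced_chi_square(times, weak, 0.5, 6.0, 2.0, noise_sigma=1.0)
ragged_fit = reduced_chi_square(times, ragged, 0.5, 6.0, 2.0, noise_sigma=1.0)
```

In this toy setup the strong and the faint curves fit equally well, because only the shape enters the measure; the ragged curve fits worse. The peak power plays no part in the comparison.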
Focusing on the signal’s shape rather than its strength is, on the face of it, a very reasonable choice: a good gaussian shape in itself practically guarantees that the signal is being received from the stars, while there is no guarantee that a true alien signal would be particularly strong. Unfortunately when it came to the reobservations, Korpela found, the reliance on shape over strength proved to be a problem.
On the Scoring of Gaussians
Here’s why: when the giant and highly sensitive Arecibo radio telescope is pointed at a particular point in the sky, scanning it closely, it is almost certain to hear "something." Mostly this is merely inconsequential noise that can and should be ignored. During the reobservations Werthimer and his team detected many such ghostly "signals" while scanning the areas around promising candidates. When these candidates were pulses or triplets, the SETI@home crew easily dismissed the false detections as noise, which barely rises above the general level of the background radiation. The pulse and triplet candidates’ scores consequently fell.
But when Werthimer and his team pointed the telescope towards locations in the sky where gaussian candidates had been detected, the results were different. That same “something,” which was easily dismissed in the other cases, was now evaluated according to the criterion used to score gaussians. Since the strength of the signal did not now enter into account, the fact that the supposed “signal” was hardly more than a whisper above the background noise did not matter. The shape of the signal did matter, and since it originated in space it was likely to be identified by the SETI@home program as a gaussian of some sort.
The result of this misidentification is that gaussians appear to have been detected in the same spot three times in a row: twice in establishing that particular location as a possible candidate signal, and once more during the reobservations. In other words, as a result of the reobservations, the score of the gaussian candidates was very likely to rise.
While this happened to the vast majority of the non-corrected candidates, the barycentrically corrected candidates were relatively immune to the false detection of ghostly signals. This is because the corrected candidates are confined to a very narrow band of frequency, where even a faint “signal” may not occur at any given time. For the non-corrected candidates, however, SETI@home scientists were looking at a much wider band of frequencies, where the detection of “something” was highly likely. The result is that while the score of only one “corrected” candidate rose after the reobservations, the scores of the vast majority of the non-corrected variety rose.
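The effect of bandwidth can be illustrated with a simple probability sketch. It assumes, purely for the sake of the example, that the band is divided into independent frequency channels, each with the same small chance of producing a faint gaussian-like "something." The 1% per-channel chance and the channel counts are invented numbers, not SETI@home figures.

```python
def chance_of_any_detection(per_channel_chance, n_channels):
    """Probability that at least one of n independent channels
    produces a chance detection."""
    return 1.0 - (1.0 - per_channel_chance) ** n_channels

# Hypothetical numbers: a narrow band of 1 channel versus a wide band of 500.
narrow_band = chance_of_any_detection(0.01, 1)
wide_band = chance_of_any_detection(0.01, 500)
```

With these made-up values the narrow band almost never yields a spurious match, while the wide band almost always does, which mirrors the split between the corrected and non-corrected gaussians.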
What to Do
Now that Korpela has identified the problem, he is working hard on a plan to correct it. What is needed, he explains, is a way to reinsert the strength of the signal as a parameter for evaluating gaussian signals. While he is confident he will succeed, he also concedes that this is no easy nut to crack.
Part of the problem is that even among the gaussians, not all are evaluated in the same way. While preparing the data for the reobservations, SETI@home scientists noticed a relationship between a gaussian’s score and the speed at which the telescope was scanning the skies (the “slew rate”) at the time. The greater the telescope’s slew rate, the higher the gaussian’s score is likely to be. This, they quickly realized, was because when the telescope was moving swiftly, it made a smaller number of point measurements in the area of the supposed signal than when it was moving slower and spent more time in that region. Naturally, it is easier to fit an elegant gaussian shape on a small number of measured points than on a large number of points, and therefore the candidates measured at high slew rates scored higher than those measured at slow rates. In scoring the candidates for the reobservation sessions, Korpela and his colleagues compensated for this effect by artificially lowering the score of high slew rate candidates.
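The link between slew rate and the number of measured points is simple arithmetic, sketched below. The beam width, slew rates, and sampling interval are hypothetical values picked for the illustration, and the function name is made up; the sketch shows only the point count, not the score adjustment Korpela's team applied.

```python
import math

def samples_across_beam(beam_width_deg, slew_rate_deg_per_s, sample_interval_s):
    """Number of power measurements taken while a point in the sky
    crosses the beam, given how fast the telescope is sweeping."""
    crossing_time_s = beam_width_deg / slew_rate_deg_per_s
    return math.floor(crossing_time_s / sample_interval_s)

# Hypothetical values: a 0.1 degree beam sampled once per second.
fast_scan = samples_across_beam(0.1, 0.02, 1.0)   # telescope sweeping quickly
slow_scan = samples_across_beam(0.1, 0.005, 1.0)  # telescope sweeping slowly
```

Quadrupling the slew rate cuts the number of points on the curve to about a quarter, leaving far fewer measurements that could disagree with a gaussian shape.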
While this correction resolved the slew rate issue, it makes it harder to solve the newly discovered signal strength problem. But difficult or not, Korpela knows that a solution will be found.
So where does the analysis of the reobservations stand at this time? Out of the eight types of candidates, five have now been completely analyzed and scored. These include the pulse candidates, both barycentrically corrected and not; the triplet candidates, both corrected and not; and the barycentrically corrected gaussians. The non-corrected gaussians still await the resolution of the signal-strength problem, and the metacandidates await the resolution of all the other types of candidates before they too can be scored. So overall the SETI@home crew has made good progress, but is not yet done.
That Lonely Signal
And what of that lone barycentrically corrected gaussian whose score had increased following the reobservations? Among the five categories successfully analyzed so far it is the only candidate whose score has gone up, and it therefore deserves special attention. Nevertheless, cautions Werthimer, it does not appear likely to be a true signal from extraterrestrials. This is because even though the signal was detected in the narrow frequency bands required of barycentrically-corrected candidates, it was, nevertheless, not stable but quickly drifting in frequency. This would take it out of the narrow observation band in a few seconds, “so that if we had looked in that part of the sky even a few seconds later, we wouldn’t have found a match,” said Werthimer. Nevertheless, he added, it is an interesting signal and SETI@home will keep an eye on it.
While completing the analysis of last year’s reobservation sessions, SETI@home is also planning for the future. The candidates targeted in that first round of reobservations came from the first three years of SETI@home. The rest of the data is now being prepared for another round of reobservations. It may be that the true signal is right there, waiting to be discovered.