Improved Indexer State Database (ISDB) Design and reliability metric for "track one" distributed searching improvements
(draft: do not copy or redistribute)
Target date: new release ready by September 30, 1997
Goals: predict whether each indexer will respond before the timeout value, and use that prediction as a reliability metric when choosing indexers for a search.
Background questions about the ISDB and related topics
The indexer state database (ISDB) needs to hold data that lets us predict whether an indexer will respond before the timeout value.
We need:
- Response time data. Recall that with track one changes, we will have response time data for any indexer responding before the end of phase two.
- A weighting or decay mechanism so the most recent response time data affects our prediction more than older response time data.
- A way to incorporate non-responding indexers ("failures") into the prediction model, as well as response time data.
Thanks (and in fact, much of the design) are owed to Robbert Van Renesse, David Fielding, Jim Davis and Jim French.
Clearly, we will be storing response time data. The timed low pass filter provides the weighting or decay mechanism, and we will be "fudging" the incorporation of non-responding indexers into the prediction model. Read on.
Why a timed low pass filter?
A timed low pass filter gives a reasonable indication of the current situation, based on the data we have most recently gathered. It meets our requirements: we can have the most recent data affect the output more than older data -- the weighting and decay mechanisms are built in. It is also easy to implement and requires little storage.
What about incorporating non-responding indexers into the prediction model?
This will be fudged, due to time constraints. When an indexer doesn't respond, we will pretend that it would have responded had we waited long enough -- we will use a very high response time to update the filter in these cases. Thus non-responses will greatly increase the predicted response time for an indexer, which will in turn make it less likely that we use the indexer.
It is *not* valid to assume that non-responding indexers would respond eventually; we are taking this approach merely to expedite the installation of this batch of performance improvements.
Suggestions for further updates and research, including other ways to handle non-responses from remote indexers, are hinted at in the track two improvements and are also in Naomi's head.
How does a timed low pass filter work?
As the time since the last measurement increases, more weight is placed on the new measurement and less on the old value in the filter when computing the new value. Here is the equation:
Vc' = ( f^D x Vc ) + ( (1 - f^D) x Vn )
Vc' = the new value to be entered in the filter
Vc = the current (old) value in the filter
Vn = the new measurement
f = a constant fraction
0 < f < 1
An f of 0 implies the filter has no memory; an f of 1 means the value in the filter never changes. Perhaps we'll set it to 1/2 to start, or perform some empirical experiments and adjust it.
D = time since last measurement / average time between measurements
The average time between measurements will either be a constant or will be computed on the fly. Initial values will be derived from data on cs-tr.cs.cornell.edu and www.ncstrl.org.
Note that the decay of the data is exponential.
We will use this formula both to predict the response time for an indexer and to update the value in the filter.
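As a concrete sketch of the filter (Dienst itself is Perl; Python is used here for brevity, and all names and default values are illustrative, not part of this design):

```python
import time

class TimedLowPassFilter:
    """Vc' = (f^D x Vc) + ((1 - f^D) x Vn), where
    D = (time since last measurement) / (average time between measurements)."""

    def __init__(self, initial_value, f=0.5, avg_interval=60.0, now=None):
        self.value = initial_value        # Vc: seed with the overall average response time
        self.f = f                        # constant fraction, 0 < f < 1
        self.avg_interval = avg_interval  # seconds; a constant or computed on the fly
        self.timestamp = time.time() if now is None else now

    def update(self, vn, now=None):
        """Fold a new measurement Vn into the filter and advance the timestamp."""
        now = time.time() if now is None else now
        w = self.f ** ((now - self.timestamp) / self.avg_interval)  # f^D
        self.value = w * self.value + (1.0 - w) * vn
        self.timestamp = now
        return self.value
```

With f = 1/2 and a gap equal to the average interval, the old value and the new measurement are weighted equally; longer gaps shift the weight toward the new measurement, giving the exponential decay noted above.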
Updating ISDB if indexer responded:
If the indexer responded without error to the search, we can easily update the timed low pass filter for response time by plugging in the appropriate values. The timestamp will be changed to reflect the filter update time.
Updating ISDB if indexer didn't respond:
We are pretending that the indexer responded, but after we stopped listening. Since we know how long we were listening, we can add a "penalty" amount of time and use the total as our faux response time to plug into the equation. The "penalty" will most likely be a constant. At present it is not clear what a good "penalty" value would be -- presumably we will perform some experiments after the code is written and then pick a value. As noted above, this whole approach is a fudge, so we should pick a "reasonable" value and devote effort to further research.
As with indexer responses, the timestamp will be updated as well.
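In code, the two update paths differ only in the value plugged in as Vn. A self-contained sketch (the listen time and penalty constants below are hypothetical; the real penalty awaits experiments):

```python
LISTEN_TIME = 30.0  # seconds we listened before giving up (hypothetical)
PENALTY = 60.0      # constant "penalty" for a non-response (hypothetical)

def update_response_filter(value, timestamp, now, response_time=None,
                           f=0.5, avg_interval=60.0):
    """Return (new_value, new_timestamp). response_time=None means the
    indexer did not respond: use the faux time LISTEN_TIME + PENALTY as Vn."""
    vn = response_time if response_time is not None else LISTEN_TIME + PENALTY
    w = f ** ((now - timestamp) / avg_interval)  # f^D
    return w * value + (1.0 - w) * vn, now
```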
Initializing ISDB entry (after startup):
The response time value will be initialized to the overall average response time for indexers (a constant). We will not be tracking average response times for each indexer at this point, though that is feasible and reasonable for future work. The timestamp will be set to the ISDB initialization time.
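A fresh entry might look like this (the overall average below is a placeholder constant, not a measured value):

```python
OVERALL_AVG_RESPONSE_TIME = 5.0  # seconds; placeholder constant

def init_isdb_entry(now):
    """Seed the response-time filter with the overall average response
    time and stamp the entry with the initialization time."""
    return {"value": OVERALL_AVG_RESPONSE_TIME,
            "timestamp": now,
            "reinit_timestamp": 0.0}
```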
Re-initializing ISDB entry (when the reliability retry indicates it should be brought on-line again):
The beauty of the timed low pass filter is that we simply use the formula with the data from the retry search and the old values in the filters to get the new values.
The reliability metric: one that answers the question "Is it reasonable to expect this indexer to respond before the timeout value?"
We can predict the response time for an indexer with the timed low pass filter equation: we plug in the current (old) value in the filter as Vc, and the overall average response time for indexers as Vn. This gives us a prediction of the response time, which we then compare to the timeout value.
If the predicted response time for this indexer is less than the timeout for the phase in question, then we should use this indexer in our search.
If the predicted response time for this indexer is greater than the timeout for the phase in question, then the indexer has failed the reliability metric: we don't believe it will respond before the timeout value and we do not wish to use it in our search. Instead, as in the current system, dienst will look for another indexer to use for the authority in question.
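The predict-and-compare step can be sketched as follows (standalone; the constants and defaults are illustrative):

```python
OVERALL_AVG = 5.0  # overall average response time, seconds (placeholder)

def predict_response_time(value, timestamp, now, f=0.5, avg_interval=60.0):
    """The filter equation with Vn = overall average response time.
    The stored value is not modified -- this is a read-only prediction."""
    w = f ** ((now - timestamp) / avg_interval)  # f^D
    return w * value + (1.0 - w) * OVERALL_AVG

def passes_reliability(value, timestamp, now, timeout, **kw):
    """True if we believe the indexer will respond before the timeout."""
    return predict_response_time(value, timestamp, now, **kw) < timeout
```

Note that the prediction is read-only: demotion decisions are made here, while the filter itself changes only when new response data arrives.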
If the answer to the question "is it reasonable to expect this indexer to respond before the timeout value?" is no, then we don't want to use this indexer -- we want to demote it. We do this by setting the reinitialization timestamp to the current time, just as in the current system. Unlike the current system, we demote the indexer as we apply the reliability metric, not as we incorporate new data into the ISDB. This is due to the nature of the timed low pass filter: as the time since the last measurement increases, the probability that the indexer will fail the reliability test decreases.
Note that with different timeout values for phase one and phase two, an indexer used in both phases may be demoted for one phase while it is not demoted for the other. There is no problem here: the reliability metric will be applied regardless of whether the reinitialization timestamp is zero or not. The reinitialization timestamp is really used to inform dienst of when to perform a reliability retry, not to indicate the status of an indexer.
After an indexer has been demoted, we want to see if we should "promote" it back into service after some time interval has passed. As in the current system, the time interval is the global variable $fail_retry_time in Config/config_constants.pl, and it is checked against the reinitialization timestamp when indexers are being chosen for a search. Note that in the track one distributed search improvements, this retry will no longer take place as part of a user search. Instead, the retry will be a separate spawned dienst process sending first a "version" request to the remote indexer, and then a search.
First, we need to make sure this indexer is still demoted. If the timestamp on the response time filter is newer than the reinitialization timestamp, then we apply the reliability metric for the relevant timeout (phase one or phase two). If the indexer passes reliability, then we simply reset the reinitialization timestamp to zero, and we're done.
If the filter timestamp has not been altered since the demotion, or if the indexer fails the reliability metric, then we go ahead with the reliability retry. It is at this point that we spawn a dienst process, using the spawned process to send a version request to the remote indexer. If the remote indexer doesn't respond to the version request before the timeout value, then we reset the reinitialization timestamp to the current time -- we renew the demotion. Note that failing the retry affects the ISDB entry the same way as failing the reliability metric: we simply renew the demotion by updating the reinitialization timestamp.
If the version request is successful, then we perform a search on the indexer and see if it responds before the timeout value. If the search at the remote indexer responds before the timeout value, then we reinitialize the ISDB entry: we set the reinitialization timestamp to zero and add our search response data to the filter.
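The whole retry can be sketched end to end. Here send_version and send_search are stand-ins for the spawned dienst process (assumed to return a response time in seconds, or None on timeout); the constants are placeholders:

```python
F = 0.5              # filter constant f (placeholder)
AVG_INTERVAL = 60.0  # average seconds between measurements (placeholder)
OVERALL_AVG = 5.0    # overall average response time, seconds (placeholder)

def _filtered(entry, vn, now):
    """The filter equation: weight f^D on the old value, the rest on Vn."""
    w = F ** ((now - entry["timestamp"]) / AVG_INTERVAL)
    return w * entry["value"] + (1.0 - w) * vn

def reliability_retry(entry, now, timeout, send_version, send_search):
    """Return True if the indexer is promoted back into service."""
    # 1. The filter may have moved since the demotion; re-check the metric.
    if entry["timestamp"] > entry["reinit_timestamp"]:
        if _filtered(entry, OVERALL_AVG, now) < timeout:
            entry["reinit_timestamp"] = 0.0   # promote without a retry
            return True
    # 2. Retry proper: a version request, then a search.
    if send_version(timeout) is None:
        entry["reinit_timestamp"] = now       # renew the demotion
        return False
    rt = send_search(timeout)
    if rt is None or rt >= timeout:
        entry["reinit_timestamp"] = now       # renew the demotion
        return False
    # 3. Promote: clear the timestamp and fold the search data into the filter.
    entry["value"] = _filtered(entry, rt, now)
    entry["timestamp"] = now
    entry["reinit_timestamp"] = 0.0
    return True
```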