CS5414 (Fall 2012) Homework 4
Replication Architectures
Due: 11pm, 10/15/2012
Relative weight: 10 units
General Instructions.
You are expected to work alone on this assignment.
Submit your solution using CMS.
No late assignments will be accepted.
To facilitate grading, format your solutions as follows.
-
Put your name and net id on each page of your solution.
-
Typeset your solution, using 10 point or larger font and use 8.5 x 11 inch paper.
-
Use at most one page (both sides, if necessary) for each problem.
-
Start each problem's solution on a separate side.
Solutions that do not satisfy the formatting guidelines will be
returned, ungraded.
Problem 1
A stage is a process that, repeatedly,
receives a single input value,
applies a function to that value,
and then sends the result.
DO FOREVER
RECEIVE v FROM input
val := F(v)
SEND val TO output
END
Here, F(v) denotes a function evaluation
in the mathematical sense---the value of F(v) is unique,
depending only on v and not on other state or history.
So each output sent by a stage is completely independent of previous inputs
it has read and processed.
Given is a collection of stages S_1, S_2, ..., S_N,
where each stage potentially employs a different function for F.
We can construct a
stage pipeline
S_1 | S_2 | ... | S_n
by routing the output of each stage to the input of another,
in a linear fashion.
In particular. we connect the outputs stream from each stage S_i
(where i < N holds)
to the inputs stream to another stage S_{i+1}.
The inputs stream for the stage pipeline is defined to be the inputs stream to the first
stage S_1;
the outputs stream from the stage pipeline is defined to be the outputs stream from the
last stage S_N.
-
A causal message logging approach is being considered for
implementing fault-tolerance of a stage pipeline comprising N stages.
Assume Lamport's "happens before" relation is used to define causality for this system.
Describe the set of messages whose determinants must be logged by S_N
when it reads the kth input from its input channel.
-
Assume that a complete semantic characterization for causality in this system
is available and being employed by causal logging protocols.
(By thinking about the logic of the system you should be able to
derive such a characterization.)
Describe the set of messages whose determinants must be logged by S_N
when it reads the kth input from its input channel.
Problem 2
Replication and masking can be employed to create an
ensemble that has the same input/output
properties as a single stage but can tolerate as many as t stage replicas
that exhibit Byzantine behavior, as follows.
-
Create 2t+1 replicas of the stage, running each replica on a separate processor.
-
Use a suitable "consensus protocol" to ensure that all stage
replicas receive identical copies of their inputs stream.
-
Employ a 2t+1 input voter to generate the outputs stream from the
2t+1 replica ensemble of the stages.
Specifically, the outputs stream of each replica is connected to
a different one of the voter's inputs;
the ensemble's outputs stream is the stream of outputs that the voter then produces.
You may assume that the "consensus protocol"
never fails given and that no voter ever fails.
Here are two possible approaches for using replication to create a reliable
N-stage pipeline.
-
For each stage S_i, create an ensemble by
using the construction outlined above.
Then, connect these ensembles in sequence.
Thus, the voter output from each ensemble (except the last)
is read by the consensus protocol and
becomes the inputs sequence
to the replicas comprising the ensemble for successor stage S_{i+1}.
-
A stage pipeline SP viewed in toto has a
single inputs sequence and a single outputs sequence, so it can
itself be considered to be a single (large) stage.
Construct 2t+1 replicas SP_1, SP_2, ..., SP_N of stage pipeline SP, by
using the construction outlined above.
Thus, this new system has (only) a single voter,
which generates the outputs stream based on the outputs stream produced
by each of the 2t+1 stage pipeline replicas.
For both architectures, assume that processors executing pipeline stages can exhibit
Byzantine failures but assume that voters, communication lines, etc are fault-free
-
Characterize the sets of failures each architecture tolerates, as follows.
What is the smallest number of faulty processors that could cause erroneous output?
What is the largest number of faulty processors
that could be tolerated without necessarily causing erroneous output?
-
The delay in a stage pipeline is the time that elapses
from when an input is read by
the first stage until the corresponding output is produced by the last stage.
Give a characterization of the
delay exhibited by each of
the two architectures when there are no failures but
assuming that each different processor might run at some fixed but different speed.
Is one architecture likely to be faster than the other?
Justify this claim.