1. Connection Characteristics
According to the experimental data, it is conceptually convenient to divide connections into three classes:
- Class 1: connections with small delay variation and little loss. Typical bursts of loss have width 1.
- Class 2: connections with small delay variation and many small bursts of loss. Typical bursts of loss have width 1~3.
- Class 3: connections with large delay variation (with or without bursts of loss). If we define "loss" to mean either truly lost or too late to be played, the bursts of loss typically have width more than 6.
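The three classes can be told apart from a trace by measuring the widths of the loss bursts and the delay variation. The following is a minimal sketch, not the code used for the experiments. The function names, the 100ms cut-off for "large delay variation", and the use of the widest burst as a stand-in for the "typical" burst are all assumptions made for illustration.

```python
from statistics import pstdev


def loss_burst_widths(missed):
    """Return the widths of consecutive runs of True in `missed`."""
    widths, run = [], 0
    for m in missed:
        if m:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def classify_connection(delays_ms, missed, large_std_ms=100.0):
    """Rough class (1, 2 or 3) of a connection for one time period.

    delays_ms: delays of the packets that arrived.
    missed: one bool per packet sent; True if lost or too late to play.
    large_std_ms: hypothetical cut-off for "large delay variation".
    """
    widest = max(loss_burst_widths(missed), default=0)
    std = pstdev(delays_ms) if len(delays_ms) > 1 else 0.0
    if std >= large_std_ms or widest > 6:
        return 3
    if widest > 1:
        return 2
    return 1
```

Bursts of width 4 to 6, which the class descriptions leave unassigned, fall into class 2 in this sketch. That is an arbitrary choice.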
A specific real connection may behave quite differently during different time periods, displaying characteristics of different classes. For instance, our ping trace for BBC's web server hiyapal.bbc.com shows that on Nov. 20, for the time periods starting at 00:03, 05:03, and 20:03, the connection behaved like a class 1 connection: the delay varied between 30ms and 108ms, the delay standard deviation was less than 12ms, and there was no packet loss. For the time period starting at 10:03, the connection behaved like a class 2 connection. The delay standard deviation was 16ms, and all bursts of loss were of width 1, 2, or 3 packets (Fig. 1.1).
Fig 1.1 To www.bbc.com time period 3: burstiness of loss
For the time period starting at 15:03, the connection behaved like a class 3 connection. The delay standard deviation was 590ms, and many bursts of loss included 10~40 packets (Fig. 1.2).
Another example is the connection to www.cornell.edu. Although most of the time the connection has small delay (1~20ms) and small delay variation (< 1ms) with almost no loss, in one time period (Nov. 19, 00:02) the delay varied between 1ms and 118ms with a standard deviation of 15ms, and bursts of loss of width 1~6 were observed.
However, some connections show consistent behavior in our ping trace. Local connections (e.g. connections inside the Cornell campus) are class 1 connections most of the time, and so are the connections to some remote sites, e.g. altavista.com and utexas.edu. Consistent characteristics may be due to stable traffic and a stable bottleneck along the route. More often, though, our trace files show that connection characteristics changing over time is the rule rather than the exception, especially for non-local connections.
Fig 1.2 To www.bbc.com time period 4: burstiness of loss
2. Performance of FEC for Different Connection Classes
For different connection classes, our adaptive FEC algorithm performs drastically differently. For class 1 connections, lost packets are rare, and when packets are lost, the bursts of loss are typically of width 1. When we apply the adaptive FEC algorithm, the number of consecutive in-time packets is large, the maximum number of consecutive missed packets is 0 or 1, and the number of packets tolerated by the buffering delay is more than 1, so FEC_DELTA is almost always 1 regardless of the parameters we set (safety-factor, Beta). It is not surprising that FEC can save almost all the lost packets. For example, the connection from our sender machine (wrw3.resnet.cornell.edu) to a machine in the CS department (mountaintop.cs.cornell.edu) exhibits class 1 behavior during time periods 2~5 (see Appendix 1). Fig. 2.1 shows the number of missed packets and FEC-saved packets in these periods. FEC saved all the lost packets, so no lost packets are counted as "missed" in the simulator.
Fig 2.1 FEC on class 1 connection
Class 2 connections are characterized by small delay variation and small bursts of loss; typical bursts of loss have width 1~3. In the adaptive FEC algorithm, the number of consecutive in-time packets is large, the maximum number of consecutive missed packets is in the range 1~5, and the number of packets tolerated by the buffering delay is in roughly the same range, so FEC_DELTA is in the range 1~5, determined by the minimum of the latter two parameters. Since packet loss is due to a lossy connection rather than large delay variation, FEC data mostly arrives in time to rescue the lost packets, so FEC saves many of them. Fig. 2.2 shows the power of FEC for the connection from the sender machine to www.ust.hk, which exhibits class 2 characteristics in all five periods (see Appendix 2). What the graph does not show is that all received packets are played normally, and that the FEC-saved packets and the missed packets together are exactly the truly lost packets. In other words, FEC saved most of the lost packets.
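The way the last two parameters bound FEC_DELTA can be sketched as follows. This is an illustration under assumptions, not the algorithm's actual code: the function name and the floor of 1 are inferred from the class 1 and class 2 descriptions, and the role of the number of consecutive in-time packets is left out because the text does not say how it enters the decision.

```python
def fec_delta(max_consecutive_missed, packets_tolerated):
    """FEC delta as the smaller of the two limits, never below 1.

    max_consecutive_missed: widest recent burst of missed packets.
    packets_tolerated: packets the current buffering delay can absorb.
    """
    return max(1, min(max_consecutive_missed, packets_tolerated))
```

With a widest burst of 0 or 1 the result stays at 1, and with bursts of 1~5 the result follows whichever of the two limits is smaller.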
Fig. 2.2 FEC on class 2 connection
Class 3 connections typically have large delay variation. Even if no packets are truly lost, many packets arrive too late and are considered "missed" by the playout algorithm, so typical bursts of "loss" have width more than 6. In the adaptive FEC algorithm, the number of consecutive in-time packets may not be large, the maximum number of consecutive missed packets is typically > 6 (the width of a burst can exceed 100), and the number of packets tolerated by the buffering delay depends on the estimate of the buffering delay (safety-factor + Beta * variation of delay). So FEC_DELTA is constrained by the estimate of the buffering delay in this case. For packets missed due to large delay variation, the FEC data, which typically arrives after the PCM data, will be too late to rescue them. So for class 3 connections, although FEC can save some of the truly lost packets (if there are any) and some late packets, FEC is generally not effective. Fig. 2.3 shows the missed and FEC-saved packets for the connection from the sender machine to www.cielle.com, which exhibits class 3 behavior in time periods 1, 3, and 5 (see Appendix 3). In time periods 1 and 5, no actual loss happened and the FEC data (too late itself) saved no packets at all. In time period 3, there were both lost packets and too-late-to-be-played packets, and FEC saved around half of them.
Fig 2.3 FEC on class 3 connection
3. Tuning Beta for Playout and FEC
In the playout algorithm, the parameter Alpha is well documented in [MJK98], whereas for Beta, [MJK98] only gives a range of 1~20. Using our simulator, we obtained experimental results for different values of Beta (Fig. 3.1). These results match our conjecture of how changing Beta affects playout and FEC performance, and can serve as criteria for choosing Beta to maximize the number of played packets subject to a delay restriction.
As Beta increases, if the variance of the delay (v) is not very small, the estimate of the future delay
p_hat = d + Beta * v
becomes larger for each talkspurt. So the scheduled playout time is later than before and more packets will be played normally.
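The text does not spell out how d and v are maintained. A minimal sketch, assuming the exponentially weighted running averages commonly used in adaptive playout algorithms, shows how Beta scales the estimate. The update rules, the function name, and the default values of Alpha and Beta are assumptions of this sketch, not the simulator's code; see [MJK98] for the documented choice of Alpha.

```python
def playout_delay_estimates(delays, alpha=0.998002, beta=4.0):
    """Return the running estimate p_hat = d + beta * v after each delay.

    delays: observed per-packet delays n, in any consistent time unit.
    alpha: smoothing weight (assumed default).
    beta: multiplier on the variation term (assumed default).
    """
    if not delays:
        return []
    d = delays[0]  # running average of the delay
    v = 0.0        # running mean absolute deviation, standing in for "variance"
    estimates = []
    for n in delays:
        d = alpha * d + (1 - alpha) * n
        v = alpha * v + (1 - alpha) * abs(d - n)
        estimates.append(d + beta * v)
    return estimates
```

When the delays are constant, v stays at 0 and Beta has no effect. Once the delays vary, a larger Beta gives a larger p_hat and hence a later scheduled playout time.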
FEC may save more packets, especially for class 3 connections, because
- FEC delta may be larger. Since the estimate of the buffering delay = safety-factor + Beta * v[0] is larger, the third determining factor for FEC delta, the number of packets tolerated by the buffering delay, is relaxed. This has little effect for class 1 and 2 connections, since this third factor is not the constraining factor there. But for class 3 connections, FEC delta is constrained by this factor and will be larger.
- More FEC data will be considered in time since the scheduled playout time is later.
With these two effects combined, FEC may save more packets. We skip the class 1 case because FEC with delta 1 already handles lost packets well. Fig. 3.1 illustrates the different effects of changing Beta for a class 2 connection (sender machine to www.ust.hk in time period 1) and for class 3 connections (sender machine to www.cielle.com in time periods 3 and 5). Only in cielle time period 3, where FEC can actually save some lost and too-late packets, does changing Beta benefit us, at the cost of increasing the buffering delay. So the choice of Beta is a tradeoff between buffering delay and the number of saved packets, and increasing Beta is only effective for some class 3 connections.
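The relaxation of the buffering constraint can be made concrete with a short sketch. It assumes that the number of packets tolerated is the estimated buffering delay divided by the time per packet, rounded down. That conversion, the function name, and the 20ms packet time in the usage note are assumptions for illustration only.

```python
import math


def packets_tolerated(safety_factor_ms, beta, v_ms, time_per_packet_ms):
    """Packets that fit in the estimated buffering delay.

    The buffering delay estimate follows the text:
    safety-factor + Beta * v. Dividing by the packet time and
    rounding down is an assumption of this sketch.
    """
    buffering_delay_ms = safety_factor_ms + beta * v_ms
    return math.floor(buffering_delay_ms / time_per_packet_ms)
```

With a 20ms safety-factor and 20ms packets, doubling Beta from 4 to 8 leaves the result at 1 packet when v is 1ms, but raises it from 21 to 41 packets when v is 100ms. This matches the observation that Beta matters mainly where delay variation is large.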
Fig 3.1 Tuning Beta for Playout and FEC
4. Tuning Safety-factor for Playout and FEC
As safety-factor increases,
n = transmission delay (RTT/2 in simulator) + SAFETY_FACTOR
increases, so d (average of n) and v (variance of n) increase and p_hat increases for each talkspurt. The buffering delay therefore increases, and more packets are played normally.
FEC may save more packets, for the same two reasons given in the analysis for Beta.
Fig. 4.1 illustrates the effect of safety-factor on the delay and FEC performance for the same connections as in Fig. 3.1. Note that safety-factor directly affects the delay of each packet we record, so an increase in safety-factor is directly reflected in d, v and p_hat, whereas the effect of changing Beta is amplified by the variance of delay. So if the variance of delay is large (as in cielle time period 5), the increase in normally played packets due to changing Beta is greater than that due to changing safety-factor. If the variance of delay is small (as in ust time period 1), reducing safety-factor to 1x time_per_packet affects FEC-saved packets more negatively than reducing Beta to 1.
Fig 4.1 Tuning safety-factor for Playout and FEC
5. Adaptive FEC vs. Fixed FEC
Since FEC delta set to 1 suffices for class 1 connections, we restrict our discussion of adaptive FEC and fixed FEC to class 2 and 3 connections. For class 2 connections (Fig. 5.1), FEC delta basically needs to be small to match the loss pattern. If we fix FEC delta to some number > 5, most of the FEC data is too late to rescue lost packets, and the performance of FEC is even worse (e.g. ust with FEC delta = 6).
Fig 5.1 Adaptive FEC vs. Fixed FEC for class 2 connection
For class 3 connections (Fig. 5.2), since the bursts of lost packets are wide, FEC delta basically needs to be large. But since the delay is worse, FEC data is prone to be late when FEC delta is large. So the choice of FEC delta is a subtle tradeoff between these two factors. For example, for cielle, FEC delta = 4 or 6 seems best for time period 3.
Fig 5.2 Adaptive FEC vs. Fixed FEC for class 3 connection
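The tension between burst width and lateness for a fixed FEC delta can be illustrated with a toy model. It assumes that the redundant copy of packet i travels in packet i + delta, and that a copy is useful only if delta does not exceed the number of packets tolerated by the buffering delay. Both the piggybacking scheme and the all-or-nothing lateness rule are simplifying assumptions of this sketch, and the function name is hypothetical.

```python
def count_fec_saved(missed, delta, packets_tolerated):
    """Count missed packets that a fixed FEC delta could recover.

    missed: one bool per packet; True if lost or too late to play.
    delta: packet i's redundant copy is assumed to ride in packet i + delta.
    packets_tolerated: packets the buffering delay can absorb.
    """
    if delta > packets_tolerated:
        return 0  # every redundant copy arrives after its playout time
    saved = 0
    for i, was_missed in enumerate(missed):
        carrier = i + delta
        # A missed packet is saved only if its carrier packet got through.
        if was_missed and carrier < len(missed) and not missed[carrier]:
            saved += 1
    return saved
```

For a single burst of width 3, a delta of 1 recovers only the last packet of the burst, while a delta of 3 recovers all three, provided the buffering delay tolerates at least 3 packets. A delta larger than the tolerated number recovers nothing in this model.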
In both cases, adaptive FEC allows us to adjust the FEC delta according to the current situation, including the number of consecutive lost packets, the number of packets arriving in time in a row, and the number of packets tolerated by the current buffering delay. Moreover, although a connection can be in different classes over time (e.g. cielle is class 3 in time period 3 but class 2 in time period 4), adaptive FEC allows us to adjust to the connection characteristics, choosing a smaller or larger FEC delta over time. So it almost always achieves performance as good as that of the "optimal" fixed FEC delta for a given connection during a specific time period.
Appendix:
1. Results for mountaintop.cs.cornell.edu
2. Results for www.ust.hk
3. Results for www.cielle.com