Stagewise processing in error-correcting 
codes and image restoration 
K. Y. Michael Wong 
Department of Physics, Hong Kong University of Science and Technology, 
Clear Water Bay, Kowloon, Hong Kong 
phkywongust. hk 
Hidetoshi Nishimori 
Department of Physics, Tokyo Institute of Technology, 
Oh-Okayama, Meguro-ku, Tokyo 152-8551, Japan 
nishi stat. phys. titech. ac.jp 
Abstract 
We introduce stagewise processing in error-correcting codes and 
image restoration, by extracting information from the former stage 
and using it selectively to improve the performance of the latter 
one. Both mean-field analysis using the cavity method and sim- 
ulations show that it has the advantage of being robust against 
uncertainties in hyperparameter estimation. 
1 Introduction 
In error-correcting codes [1] and image restoration [2], the choice of the so-called 
hyperparameters is an important factor in determining their performance. Hyper- 
parameters refer to the coefficients weighting the biases and variances of the tasks. 
In error correction, they determine the statistical significance given to the parity- 
checking terms and the received bits. Similarly in image restoration, they determine 
the statistical weights given to the prior knowledge and the received data. It was 
shown, by the use of inequalities, that the choice of the hyperparameters is opti- 
mal when there is a match between the source and model priors [3]. Furthermore, 
from the analytic solution of the infinite-range model and the Monte Carlo simula- 
tion of finite-dimensional models, it was shown that an inappropriate choice of the 
hyperparameters can lead to a rapid degradation of the tasks. 
Hyperparameter estimation is the subject of many studies such as the "evidence 
framework" [4]. However, if the prior models the source poorly, no hyperparameters 
can be reliable [5]. Even if they can be estimated accurately through steady-state 
statistical measurements, they may fluctuate under interference from bursty noise 
sources in communication channels. Hence it is equally important to devise decoding or 
restoration procedures which are robust against the uncertainties in hyperparameter 
estimation. 
Here we introduce selective freezing to increase the tolerance to uncertainties in hy- 
perparameter estimation. The technique has been studied for pattern reconstruc- 
tion in neural networks, where it led to an improvement in the retrieval precision, 
a widening of the basin of attraction, and a boost in the storage capacity [6]. The 
idea is best illustrated for bits or pixels with binary states ±1, though it can be 
easily generalized to other cases. In a finite temperature thermodynamic process, the 
binary variables keep moving under thermal agitation. Some of them have smaller 
thermal fluctuations than the others, implying that they are more certain to stay 
in one state than the other. This stability implies that they have a higher probabil- 
ity to stay in the correct state for error-correction or image restoration tasks, even 
when the hyperparameters are not optimally tuned. It may thus be interesting to 
separate the thermodynamic process into two stages. In the first stage we select 
those relatively stable bits or pixels whose time-averaged states have a magnitude 
exceeding a certain threshold. In the second stage we subsequently fix (or freeze) 
them in the most probable thermodynamic states. Thus these selectively frozen 
bits or pixels are able to provide a more robust assistance to the less stable bits or 
pixels in their search for the most probable states. 
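The two-stage procedure described above can be sketched in a few lines of code. The following is a minimal illustration on a toy pairwise system; the coupling statistics, field strengths, temperature, threshold and sweep counts are all arbitrary choices of ours, not the models analyzed later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_averages(J, h, beta, spins, frozen, n_sweeps=200):
    """Metropolis dynamics on +/-1 spins; spins marked frozen are never
    flipped. Returns the time-averaged state of every spin."""
    N = len(spins)
    acc = np.zeros(N)
    for _ in range(n_sweeps):
        for i in range(N):
            if not frozen[i]:
                # energy cost of flipping spin i (diagonal of J is zero)
                dE = 2.0 * spins[i] * (J[i] @ spins + h[i])
                if dE <= 0 or rng.random() < np.exp(-beta * dE):
                    spins[i] = -spins[i]
        acc += spins
    return acc / n_sweeps

# toy system: weak random couplings with a ferromagnetic bias, random fields
N, theta = 40, 0.8
J = rng.normal(1.0 / N, 1.0 / np.sqrt(N), size=(N, N))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
h = rng.normal(0.5, 1.0, size=N)
spins = rng.choice([-1.0, 1.0], size=N)

# stage 1: measure time-averaged states, then freeze the stable spins
m1 = metropolis_averages(J, h, beta=1.0, spins=spins, frozen=np.zeros(N, bool))
frozen = np.abs(m1) > theta
spins[frozen] = np.where(m1[frozen] >= 0, 1.0, -1.0)

# stage 2: only the less stable spins keep evolving
m2 = metropolis_averages(J, h, beta=1.0, spins=spins, frozen=frozen)
decoded = np.where(np.where(frozen, m1, m2) >= 0, 1, -1)
```

The decoded bit for a frozen spin is its stage-1 most probable state; for the remaining spins it is the sign of the stage-2 thermal average.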
The two-stage thermodynamic process can be studied analytically in the mean-field 
model using the cavity method. For the more realistic cases of finite dimensions 
in image restoration, simulation results illustrate the relevance of the infinite-range 
model in providing qualitative guidance. Detailed theory of selective freezing is 
presented in [7]. 
2 Formulation 
Consider an information source which generates data represented by a set of Ising 
spins {ξ_i}, where ξ_i = ±1 and i = 1, ..., N. The data is generated according to the 
source prior P_s({ξ_i}). For error-correcting codes transmitting unbiased messages, 
all sequences are equally probable and P_s({ξ}) = 2^{-N}. For images with smooth 
structures, the prior consists of ferromagnetic Boltzmann factors, which increase 
the tendencies of the neighboring spins to stay at the same spin states, that is, 

P_s({ξ}) = exp( (β_s/z) Σ_{⟨ij⟩} ξ_i ξ_j ) / Z.    (1) 

Here ⟨ij⟩ represents pairs of neighboring spins, z is the valency of each site, and 
Z normalizes the distribution. The data is coded by constructing the codewords, 
which are the products of p spins J⁰_{i1···ip} = ξ_{i1} ··· ξ_{ip} for appropriately 
chosen sets of indices {i1, ..., ip}. Each spin may appear in a number of p-spin 
codewords; the number of times of appearance is called the valency z_p. For 
conventional image restoration, codewords with only p = 1 are transmitted, 
corresponding to the pixels in the image. 
When the signal is transmitted through a noisy channel, the output consists of 
the sets {J_{i1···ip}} and {τ_i}, which are the corrupted versions of {J⁰_{i1···ip}} 
and {ξ_i} respectively, and described by the output probability 

P_out({J}, {τ}|{ξ}) ∝ exp( β_a Σ J_{i1···ip} ξ_{i1} ··· ξ_{ip} + β_r Σ_i τ_i ξ_i ).    (2) 

According to Bayesian statistics, the posterior probability that the source sequence 
is {σ}, given the outputs {J} and {τ}, takes the form 

P({σ}|{J}, {τ}) = P_out({J}, {τ}|{σ}) P_s({σ}) / Tr_σ P_out({J}, {τ}|{σ}) P_s({σ}).    (3) 
If the receiver at the end of the noisy channel does not have precise information on 
β_a, β_r or β_s, and estimates them as β, h and β_m respectively, then the ith bit of 
the decoded/restored information is given by sgn⟨σ_i⟩, where 

⟨σ_i⟩ = Tr σ_i e^{−H{σ}} / Tr e^{−H{σ}},    (4) 

and the Hamiltonian is given by 

H{σ} = −β Σ J_{i1···ip} σ_{i1} ··· σ_{ip} − h Σ_i τ_i σ_i − (β_m/z) Σ_{⟨ij⟩} σ_i σ_j.    (5) 
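In code, the Hamiltonian above can be evaluated directly. The sketch below is our own illustration for the p = 2 case with a dense coupling matrix; the function name and argument layout are assumptions, not the authors' implementation.

```python
import itertools

import numpy as np

def hamiltonian(sigma, J, tau, neighbors, beta, h, beta_m, z):
    """Energy of a configuration sigma for p = 2: a parity-check term
    weighted by beta, a received-bit term weighted by h, and a smoothness
    (model-prior) term weighted by beta_m over the neighbor pairs."""
    parity = -beta * sum(J[i, j] * sigma[i] * sigma[j]
                         for i, j in itertools.combinations(range(len(sigma)), 2))
    data = -h * float(np.dot(tau, sigma))
    prior = -(beta_m / z) * sum(sigma[i] * sigma[j] for i, j in neighbors)
    return parity + data + prior
```

Decoding then amounts to sampling sigma from the Boltzmann weight exp(−H) and taking the sign of each spin's thermal average.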
For the two-stage process of selective freezing, the spins evolve thermodynamically 
as prescribed in Eq. (4) during the first stage, and the thermal averages ⟨σ_i⟩ of 
the spins are monitored. Then we select those spins with |⟨σ_i⟩| exceeding a given 
threshold θ, and freeze them in the second stage of the thermodynamics. The 
average of the spin σ̃_i in the second stage is then given by 

⟨σ̃_i⟩ = Tr σ_i Π_j [Θ(⟨σ_j⟩² − θ²) δ(σ_j, sgn⟨σ_j⟩) + Θ(θ² − ⟨σ_j⟩²)] e^{−H̃{σ}} 
       / Tr Π_j [Θ(⟨σ_j⟩² − θ²) δ(σ_j, sgn⟨σ_j⟩) + Θ(θ² − ⟨σ_j⟩²)] e^{−H̃{σ}},    (6) 

where Θ is the step function, δ is the Kronecker delta, and H̃{σ}, the Hamiltonian 
for the second stage, has the same form as Eq. (5) in the first stage. One then 
regards sgn⟨σ̃_i⟩ as the ith spin of the decoding/restoration process. 
The most important quantity in selective freezing is the overlap of the decoded/restored 
bit sgn⟨σ̃_i⟩ and the original bit ξ_i averaged over the output probability 
and the spin distribution. This is given by 

M_sf = Σ_{ξ} Π∫dJ Π∫dτ P_s({ξ}) P_out({J}, {τ}|{ξ}) ξ_i sgn⟨σ̃_i⟩.    (7) 
Following [3], we can prove that selective freezing cannot outperform the single-stage 
process if the hyperparameters can be estimated precisely. However, the purpose 
of selective freezing is rather to provide a relatively stable performance when the 
hyperparameters cannot be estimated precisely. 
3 Modeling error-correcting codes 
Let us now suppose that the output of the transmission channel consists of only the 
set of p-spin interactions {J_{i1···ip}}. Then h = 0 in the Hamiltonian (5), and we set 
β_m = 0 for the case that all messages are equally probable. Analytical solutions are 
available for the infinite-range model in which the exchange interactions are present 
for all possible groups of p sites. Consider the noise model in which J_{i1···ip} is 
Gaussian with mean p! j_0 ξ_{i1} ··· ξ_{ip} / N^{p−1} and variance p! J² / (2N^{p−1}). 
We can apply a gauge transformation σ_i → σ_i ξ_i and J_{i1···ip} → J_{i1···ip} ξ_{i1} ··· ξ_{ip}, 
and arrive at an equivalent p-spin model with a ferromagnetic bias, where 

P(J_{i1···ip}) = ( N^{p−1} / (π J² p!) )^{1/2} exp[ −(N^{p−1}/(J² p!)) ( J_{i1···ip} − p! j_0/N^{p−1} )² ].    (8) 
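The noise model and the gauge transformation are easy to check numerically. The sketch below samples corrupted codewords around the source parity bits and verifies that, after the gauge transformation, their empirical mean matches the ferromagnetic bias of Eq. (8); the system size and parameter values are arbitrary.

```python
import math
from itertools import combinations

import numpy as np

rng = np.random.default_rng(1)
N, p, j0, J = 12, 3, 0.8, 1.0
xi = rng.choice([-1, 1], size=N)                         # source spins

mean0 = math.factorial(p) * j0 / N ** (p - 1)            # p! j0 / N^(p-1)
var = math.factorial(p) * J ** 2 / (2 * N ** (p - 1))    # p! J^2 / 2N^(p-1)

# corrupted codewords: Gaussian around each source parity bit xi_i1...xi_ip
Js = {idx: rng.normal(mean0 * np.prod(xi[list(idx)]), math.sqrt(var))
      for idx in combinations(range(N), p)}

# gauge transformation J -> J xi_i1...xi_ip: the xi-dependence drops out and
# the couplings become i.i.d. Gaussians with the ferromagnetic mean of Eq. (8)
Jg = np.array([Jval * np.prod(xi[list(idx)]) for idx, Jval in Js.items()])
```

After the transformation the decoding problem is a p-spin ferromagnet with Gaussian disorder, independent of the transmitted message.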
The infinite-range model is exactly solvable using the cavity method [8]. The 
method uses a self-consistency argument to consider what happens when a spin 
is added or removed from the system. The central quantity in this method is the 
cavity field, which is the local field of a spin when it is added to the system, assuming 
that the exchange couplings act only one-way from the system to the new spin (but 
not from the spin back to the system). Since the exchange couplings feeding the 
new spin have no correlations with the system, the cavity field becomes a Gaussian 
variable in the limit of large valency. 
The thermal average of a spin, say spin 1, is given by 

⟨σ_1⟩ = tanh βh,    (9) 

where h is the cavity field obeying a Gaussian distribution, whose mean and variance 
are p j_0 m^{p−1} and p J² q^{p−1}/2 respectively, where m and q are the magnetization 
and Edwards-Anderson order parameter respectively, given by 

m = (1/N) Σ_i ⟨σ_i⟩   and   q = (1/N) Σ_i ⟨σ_i⟩².    (10) 

Applying self-consistently the cavity argument to all terms in Eq. (10), we can 
obtain self-consistent equations for m and q. 
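These self-consistent equations can be iterated to a fixed point numerically, replacing the site average by a Gaussian average over the cavity field. The sketch below does this by Gauss-Hermite quadrature; the quadrature order, starting point and parameter values are our own choices.

```python
import numpy as np

def solve_mq(p=3, j0=0.8, J=1.0, T=0.4, n_iter=300):
    """Iterate m = <tanh(beta h)> and q = <tanh^2(beta h)>, with h Gaussian
    of mean p*j0*m^(p-1) and variance p*J^2*q^(p-1)/2, as in Eqs. (9)-(10)."""
    beta = 1.0 / T
    x, w = np.polynomial.hermite_e.hermegauss(80)  # probabilists' Hermite nodes
    w = w / w.sum()                                # normalized N(0,1) weights
    m = q = 0.9                                    # start in the ordered phase
    for _ in range(n_iter):
        mu = p * j0 * m ** (p - 1)
        sd = np.sqrt(p * J ** 2 * q ** (p - 1) / 2.0)
        t = np.tanh(beta * (mu + sd * x))
        m, q = w @ t, w @ t ** 2
    return m, q

m, q = solve_mq()
```

At low temperature the iteration settles on the ferromagnetic fixed point, with q ≥ m² as required of an Edwards-Anderson order parameter.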
Now we consider selective freezing. If we introduce a freezing threshold θ so that 
all spins with ⟨σ_i⟩² > θ² are frozen, then the freezing fraction f is given by 

f = (1/N) Σ_i Θ(⟨σ_i⟩² − θ²).    (11) 
The thermal average of a dynamic spin in the second stage is related to the cavity 
fields in both stages, say, for spin 1, 

⟨σ̃_1⟩ = tanh β[ h̃ + (p − 1) J² r^{p−2} χ_tr tanh βh ],    (12) 

where h̃ is the cavity field in the second stage, r is the order parameter describing 
the spin correlations of the two thermodynamic stages, 

r = (1/N) Σ_i ⟨σ_i⟩ { ⟨σ̃_i⟩ Θ(θ² − ⟨σ_i⟩²) + sgn⟨σ_i⟩ Θ(⟨σ_i⟩² − θ²) },    (13) 

and χ_tr is the trans-susceptibility which describes the response of a spin in the second 
stage to variations of the cavity field in the first stage, namely 

χ_tr = (1/N) Σ_i ∂⟨σ̃_i⟩/∂h_i.    (14) 
The cavity field h is a Gaussian variable. Its mean and variance are pjoh p- 
and pJ2P-/2 respectively, where h and  are the magnetization and Edwards- 
Anderson order parameter respectively, given by 
1 
rh ---- [O(02-(ai)2)(Si)+O((ai)2-O2)sgn{ai}], (15) 
i 
1 
 -- y. [0(02- {cri)2){si) 2 +O({ai) 2 -02)]. (16) 
Furthermore, the covariance between h and h is pJ2rP-/2, where r is given in 
Eq. (13). Applying self-consistently the same cavity argument to all terms in Eqs. 
(15), (16), (13) and (14), we arrive at the self-consistent equations for m, q, r and 
Xtr. The performance of selective freezing is measured by 
1 
Msf ---- N y" [0(02 - (cri)2)sgn(i) + O((ai)2 -02)sgn(cri)]' (17) 
i 
[Figure 1 here: both panels plot M_sf against T, with curves for f = 0, 0.7, 0.8 and 0.9.] 
Figure 1: The overlap M_sf as a function of the decoding temperature T for various 
given values of freezing fraction f. In this and the following figure, f = 0 corresponds 
to one-stage decoding/restoration. (a) Theoretical results for p = 3, j_0 = 0.8 
and J = 1; (b) results of Monte Carlo simulations for p = 2 and j_0 = J = 1. 
In the example in Fig. 1(a), the overlap of the single-stage dynamics reaches its 
maximum at the Nishimori point T_N = J²/2j_0 as expected. We observe that the 
tolerance against variations in T is enhanced by selective freezing both above and 
below the optimal temperature (see especially f = 0.8). This shows that the region 
of advantage for selective freezing is even broader than that discussed in [7], where 
improvement is only observed above the optimal temperature. 
The advantages of selective freezing are confirmed by Monte Carlo simulations 
shown in Fig. 1(b). For one-stage dynamics, the overlap is maximum at the 
Nishimori point (T_N = 0.5) as expected. However, it deteriorates rather rapidly 
when the decoding temperature increases. In contrast, selective freezing maintains 
a more steady performance, especially when f = 0.9. 
4 Modeling image restoration 
In conventional image restoration problems, a given degraded image consists of the 
set of pixels {τ_i}, but not the set of exchange interactions {J_{i1···ip}}. In this case, 
β = 0 in the Hamiltonian (5). The pixels τ_i are the degraded versions of the source 
pixels ξ_i, corrupted by noise which, for convenience, is assumed to be Gaussian with 
mean a ξ_i and variance τ². In turn, the source pixels satisfy the prior distribution 
in Eq. (1) for smooth images. 

Analysis of the mean-field model with extensive valency shows that selective freezing 
performs as well as one-stage dynamics, but cannot outperform it. Nevertheless, 
selective freezing provides a rather stable performance when the hyperparameters 
cannot be estimated precisely. Hence we model a situation common in modern 
communication channels carrying multimedia traffic, which are often bursty in nature. 
Since burstiness results in intermittent interference, we consider a distribution of 
the degraded pixels with two Gaussian components, each with its own characteristics, 

P(τ_i|ξ_i) = f_1/√(2πτ_1²) exp[ −(τ_i − a_1 ξ_i)²/(2τ_1²) ] 
           + f_2/√(2πτ_2²) exp[ −(τ_i − a_2 ξ_i)²/(2τ_2²) ],    (18) 

where f_1 + f_2 = 1. 
[Figure 2 here: panel (a) plots M_sf against h, with curves for f = 0, 0.1, 0.3, 0.5, 0.7 and 0.9; panel (b) plots M_sf against T_m, with curves for f = 0, 0.3, 0.5, 0.7, 0.9 and 0.95.] 
Figure 2: (a) The performance of selective freezing with 2 components of Gaussian 
noise at β_s = 1.05, f_1 = 4f_2 = 0.8, a_1 = 5a_2 = 1 and τ_1² = τ_2² = 0.3. The restoration 
agent operates at the optimal ratio β_m/h which assumes a single noise component 
with the overall mean 0.84 and variance 0.4024. (b) Results of Monte Carlo simulations 
for the overlaps of selective freezing compared with that of the one-stage 
dynamics for two-dimensional images generated at the source prior temperature 
T_s = 2.15. 
Suppose the restoration agent operates at the optimal ratio of β_m/h which assumes 
a single noise component. Then there will be a degradation of the quality of the 
restored images. In the example in Fig. 2(a), the reduction of the overlap M_sf 
for selective freezing is much more modest than that of the one-stage process (f = 0). 
Other cases of interest, in which the restoration agent operates on other imprecise 
estimations, are discussed in [7]. All confirm the robustness of selective freezing. 
It is interesting to study the more realistic case of two-dimensional images, since we 
have so far presented analytical results for the mean-field model only. As confirmed 
by the results of Monte Carlo simulations in Fig. 2(b), the overlaps of selective 
freezing are much steadier than those of the one-stage dynamics when the 
decoding temperature changes. This steadiness is most remarkable for a freezing 
fraction of f = 0.9. 
5 Discussions 
We have introduced a multistage technique for error-correcting codes and image 
restoration, in which the information extracted from the former stage can be used 
selectively to improve the performance of the latter one. While the overlap M_sf 
of selective freezing is bounded by the optimal performance of the one-stage 
dynamics derived in [3], it has the advantage of being tolerant to uncertainties in 
hyperparameter estimation. This is confirmed by both analytical and simulational 
results for mean-field and finite-dimensional models. Improvement is observed both 
above and below the optimal decoding temperature, extending the observations 
in [7]. As an example, we have illustrated its advantage of robustness when the 
noise distribution is composed of more than one Gaussian component, such as in 
the case of modern communication channels supporting multimedia applications. 
Selective freezing can be generalized to more than two stages, in which spins that 
remain relatively stable in one stage are progressively frozen in the following one. 
It is expected that the performance can be even more robust. 
On the other hand, we have a remark about the basic assumption of the cavity 
method, namely that the addition or removal of a spin causes a small change in 
the system describable by a perturbative approach. In fact, adding or removing a 
spin may cause the thermal averages of other spins to change from below to above 
the thresholds ±θ (or vice versa). This change, though often small, induces a non- 
negligible change of the thermal averages from fractional values to the frozen values 
of ±1 (or vice versa) in the second stage. The perturbative analysis of these changes 
is only approximate. The situation is reminiscent of similar instabilities in other 
disordered systems such as the perceptron, and is equivalent to the Almeida-Thouless 
instabilities in the replica method [9]. A full treatment of the problem would require 
the introduction of a rough energy landscape [9], or the replica symmetry breaking 
ansatz in the replica method [8]. Nevertheless, previous experience with disordered 
systems shows that the corrections made by a more complete treatment may not 
be too large in the ordered phase. For example, the simulational results in Fig. 1(b) 
are close to the corresponding analytical results in [7]. 
In practical implementations of error-correcting codes, algorithms based on belief- 
propagation methods are often employed [10]. It has recently been shown that 
such decoded messages converge to the solutions of the TAP equations in the corre- 
sponding thermodynamic system [11]. Again, the performance of these algorithms 
is sensitive to the estimation of hyperparameters. We propose that the selective 
freezing procedure has the potential to make these algorithms more robust. 
Acknowledgments 
This work was partially supported by the Research Grants Council of Hong Kong 
(HKUST6157/99P). 
References 
[1] R. J. McEliece, The Theory of Information and Coding, Encyclopedia of Mathematics 
and its Applications (Addison-Wesley, Reading, MA 1977). 
[2] S. Geman and D. Geman, IEEE Trans. PAMI 6, 721 (1984). 
[3] H. Nishimori and K. Y. M. Wong, Phys. Rev. E 60, 132 (1999). 
[4] D. J. C. MacKay, Neural Computation 4, 415 (1992). 
[5] J. M. Pryce and A. D. Bruce, J. Phys. A 28, 511 (1995). 
[6] K. Y. M. Wong, Europhys. Lett. 36, 631 (1996). 
[7] K. Y. M. Wong and H. Nishimori, submitted to Phys. Rev. E (2000). 
[8] M. Mzard, G. Parisi, and V.A. Virasoro, Spin Glass Theory and Beyond (World 
Scientific, Singapore 1987). 
[9] K. Y. M. Wong, Advances in Neural Information Processing Systems 9, 302 (1997). 
[10] B. J. Frey, Graphical Models for Machine Learning and Digital Communication (MIT 
Press, 1998). 
[11] Y. Kabashima and D. Saad, Europhys. Lett. 44, 668 (1998). 
