Spike-Timing-Dependent Learning for
Oscillatory Networks

Silvia Scarpetta
Dept. of Physics "E.R. Caianiello"
Salerno University, 84081 (SA), Italy
and INFM, Sezione di Salerno, Italy
scarpetta@na.infn.it

Zhaoping Li
Gatsby Comp. Neurosci. Unit
University College London, WC1N 3AR
United Kingdom
zhaoping@gatsby.ucl.ac.uk

John Hertz
Nordita
2100 Copenhagen Ø, Denmark
hertz@nordita.dk
Abstract 
We apply to oscillatory networks a class of learning rules in which
synaptic weights change in proportion to pre- and post-synaptic
activity, with a kernel A(τ) measuring the effect for a postsynaptic
spike a time τ after the presynaptic one. The resulting synaptic
matrices have an outer-product form in which the oscillating patterns
are represented as complex vectors. In a simple model, the even
part of A(τ) enhances the resonant response to a learned stimulus by
reducing the effective damping, while the odd part determines the
frequency of oscillation. We relate our model to the olfactory cortex
and hippocampus and their presumed roles in forming associative
memories and input representations.
1 Introduction
Recent studies of synapses between pyramidal neocortical and hippocampal neurons
[1, 2, 3, 4] have revealed that changes in synaptic efficacy can depend on the
relative timing of pre- and postsynaptic spikes. Typically, a presynaptic spike
followed by a postsynaptic one leads to an increase in efficacy (LTP), while the
reverse temporal order leads to a decrease (LTD). The dependence of the change in
synaptic efficacy on the difference τ between the two spike times may be characterized
by a kernel which we denote A(τ) [4]. For hippocampal pyramidal neurons, the
half-width of this kernel is around 20 ms.
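The shape of such a kernel can be sketched numerically; the exponential LTP/LTD lobes and the parameter values below are illustrative assumptions, not fits to the data in [1-4]:

```python
import math

# Illustrative spike-timing kernel A(tau): LTP for tau > 0 (pre before
# post), LTD for tau < 0. Amplitudes and the ~20 ms time constants are
# assumed for illustration only.
TAU_P = 0.020   # LTP decay time (s)
TAU_M = 0.020   # LTD decay time (s)
A_P, A_M = 1.0, 0.5

def stdp_kernel(tau):
    """Efficacy change for a postsynaptic spike a time tau after the presynaptic one."""
    if tau >= 0:
        return A_P * math.exp(-tau / TAU_P)
    return -A_M * math.exp(tau / TAU_M)

print(stdp_kernel(0.010) > 0)    # pre-before-post potentiates -> True
print(stdp_kernel(-0.010) < 0)   # post-before-pre depresses  -> True
```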
Many important neural structures, notably hippocampus and olfactory cortex, ex- 
hibit oscillatory activity in the 20-50 Hz range. Here the temporal variation of the 
neuronal firing can clearly affect the synaptic dynamics, and vice versa. In this 
paper we study a simple model for learning oscillatory patterns, based on the struc- 
ture of the kernel A(-) and other known physiology of these areas. We will assume 
that these synaptic changes in long range lateral connections are driven by oscilla- 
tory, patterned input to a network that initially has only local synaptic connections. 
The result is an imprinting of the oscillatory patterns in the synapses, such that 
subsequent input of a similar pattern will evoke a strong resonant response. It can 
be viewed as a generalization to oscillatory networks with spike-timing-dependent 
learning of the standard scenario whereby stationary patterns are stored in Hopfield 
networks using the conventional Hebb rule. 
2 Model 
The computational neurons of the model represent local populations of biological 
neurons that share common input. They follow the equations of motion [5] 
  du_i/dt = −α u_i − β_i g_v(v_i) + Σ_j J^0_{ij} g_u(u_j) + I_i ,   (1)

  dv_i/dt = −α v_i + γ_i g_u(u_i) + Σ_j W^0_{ij} g_u(u_j) .   (2)
Here u_i and v_i are membrane potentials for excitatory and inhibitory (formal)
neuron i, α^{-1} is their membrane time constant, and the sigmoidal functions
g_u(·) and g_v(·) model the dependence of their outputs (interpreted as instantaneous
firing rates) on their membrane potentials. The couplings β_i^0 and γ_i^0 are
inhibitory-to-excitatory (resp. excitatory-to-inhibitory) connection strengths within
local excitatory-inhibitory pairs, and for simplicity we take the external drive I_i(t)
to act only on the excitatory units. We include nonlocal excitatory couplings
J^0_{ij} between excitatory units and W^0_{ij} from excitatory units to inhibitory ones.
In this minimal model, we ignore long-range inhibitory couplings, appealing to the fact
that real anatomical inhibitory connections are predominantly short-ranged. (In
what follows, we will sometimes use bold and sans serif notation (e.g., u, J) for
vectors and matrices, respectively.) The structure of the couplings is shown in Fig.
1A.
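A single local pair from Eqns. (1,2), with the long-range couplings J^0 = W^0 = 0, can be integrated directly. The sketch below is a minimal forward-Euler integration; the rate constants and the tanh activation functions are illustrative assumptions (the simulations of Sec. 3 use piecewise-linear activations):

```python
import math

# Minimal sketch: forward-Euler integration of one excitatory-inhibitory
# pair from Eqns. (1,2), with no long-range couplings. Parameter values
# (1/s) and the tanh nonlinearities are assumptions for illustration.
alpha, beta, gamma = 50.0, 400.0, 400.0
dt, steps = 1e-4, 5000

u = v = 0.0
for n in range(steps):
    I = math.cos(2 * math.pi * 40.0 * n * dt)   # weak 40-Hz drive to the E unit
    du = -alpha * u - beta * math.tanh(v) + I
    dv = -alpha * v + gamma * math.tanh(u)
    u += dt * du
    v += dt * dv

# The linearized pair has eigenvalues -alpha +/- i*sqrt(beta*gamma): a
# damped oscillator, so the driven response stays small and bounded.
print(abs(u) < 1.0 and abs(v) < 1.0)
```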
The model is nonlinear, but here we will limit our treatment to an analysis of small
oscillations around a stable fixed point {ū, v̄} determined by the DC part of the
input. Performing the linearization and eliminating the inhibitory units [6, 5], we
obtain

  d²u/dt² + [2α − J] du/dt + [α² + β(γ + W) − αJ] u = (∂_t + α) δI .   (3)
Here u is now measured from the fixed point ū, δI is the time-varying part of the
input, and the elements of J and W are related to those of J^0 and W^0 by W_{ij} =
g_u′(ū_j) W^0_{ij} and J_{ij} = g_u′(ū_j) J^0_{ij}. For simplicity, we have assumed that the effective
local couplings β_i = g_v′(v̄_i)β_i^0 and γ_i = g_u′(ū_i)γ_i^0 are independent of i: β_i = β,
γ_i = γ. With oscillatory inputs δI = ξ e^{−iωt} + c.c., the oscillatory pattern elements
ξ_i = |ξ_i| e^{iφ_i} are complex, reflecting possible phase differences across the units.
We likewise separate the response u = u⁺ + u⁻ (after the initial transients) into
positive- and negative-frequency components u^± (with u⁻ = u⁺* and u^± ∝ e^{∓iωt}).
Since du^±/dt = ∓iω u^±, Eqn. (3) can be written

  [2α ± i(α² + βγ − ω²)/ω] u^± = M u^± + (1 ± iα/ω) δI^± ,   (4)

a form that shows how the matrix

  M(ω) = J ∓ i(βW − αJ)/ω   (5)

describes the effective coupling between local oscillators. Here 2α is the intrinsic damping
and √(α² + βγ) the frequency of the individual oscillators.
Figure 1: A. The model: In addition to the local excitatory-inhibitory connections 
(vertical solid lines), there are nonlocal long-range connections (dashed lines) be- 
tween excitatory units (Jij) and from excitatory to inhibitory units (Wij). External 
inputs are fed to the excitatory units. B: Activation functions used in simulations
for excitatory units (B.1) and inhibitory units (B.2). Crosses mark the equilibrium
point (ū, v̄) of the system.
2.1 Learning phase 
We employ a generalized Hebb rule of the form

  δC_{ij} = ∫ dt ∫ dτ y_i(t + τ) A(τ) x_j(t)   (6)

for synaptic weight C_{ij}, where x_j and y_i are the pre- and postsynaptic activities,
measured relative to stationary levels at which no changes in synaptic strength
occur. We consider a general kernel A(τ), although experimentally A(τ) > 0 (< 0)
for τ > 0 (< 0). We will apply the rule to both J and W in our linearized network,
where the firing rates g_u(u_i) and g_v(v_i) vary linearly with u_i and v_i, so we will
use Eqn. (6) with x_j → u_j and y_i → u_i or v_i (measured from the fixed point),
respectively.
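For sinusoidally oscillating pre- and postsynaptic activities, the time average of rule (6) depends on the kernel only through its Fourier transform at the oscillation frequency, Ã(ω₀) = ∫ dτ A(τ) e^{−iω₀τ}. A minimal numerical check of this reduction, with an assumed exponential LTP/LTD kernel:

```python
import cmath, math

# Check: averaging dC ~ ∫dt ∫dtau y(t+tau) A(tau) x(t) over one period of
# x(t) = Ux e^{-i w0 t} + c.c., y(t) = Uy e^{-i w0 t} + c.c. gives
# 2 Re[A~(w0) Uy Ux*]. Kernel parameters are illustrative assumptions.
TAU_P, TAU_M, A_P, A_M = 0.020, 0.020, 1.0, 0.5
w0 = 2 * math.pi * 40.0

def A(tau):
    return A_P * math.exp(-tau / TAU_P) if tau >= 0 else -A_M * math.exp(tau / TAU_M)

Ux, Uy = cmath.exp(0.3j), cmath.exp(1.1j)      # complex amplitudes, different phases
x = lambda t: 2 * (Ux * cmath.exp(-1j * w0 * t)).real
y = lambda t: 2 * (Uy * cmath.exp(-1j * w0 * t)).real

T = 2 * math.pi / w0
n_t, dtau, tau_max = 200, 1e-4, 0.2
dC = 0.0
for k in range(int(2 * tau_max / dtau)):
    tau = -tau_max + (k + 0.5) * dtau          # midpoint rule in tau
    avg = sum(y(m * T / n_t + tau) * x(m * T / n_t) for m in range(n_t)) / n_t
    dC += A(tau) * avg * dtau

# Closed-form Fourier transform of the exponential kernel
A_tilde = A_P / (1 / TAU_P + 1j * w0) - A_M / (1 / TAU_M - 1j * w0)
pred = 2 * (A_tilde * Uy * Ux.conjugate()).real
print(abs(dC - pred) < 0.05 * abs(pred))       # -> True
```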
We assume oscillatory input δI = ξ⁰ e^{−iω₀t} + c.c. during learning. In the brain
structures we are modeling, cholinergic modulation makes the long-range connections
ineffective during learning [7]. Thus we set J = W = 0 in Eqn. (3) and find

  u_i = (α − iω₀) ξ_i^0 e^{−iω₀t} / (α² + βγ − ω₀² − 2iαω₀) + c.c. ≡ U_i^0 e^{−iω₀t} + c.c.   (7)

and, from (∂_t + α) v_i = γ u_i,

  v_i = γ U_i^0 e^{−iω₀t} / (α − iω₀) + c.c.   (8)
Using these in the learning rule (6) leads to

  J_{ij} = 2J₀ Re[Ã(ω₀) ξ_i ξ_j*] ,   W_{ij} = 2(η_W/η_J) J₀ γ Re[Ã(ω₀) ξ_i ξ_j* / (α − iω₀)] ,   (9)

where Ã(ω) = ∫_{−∞}^{∞} dτ A(τ) e^{−iωτ} is the Fourier transform of A(τ), J₀ =
2π η_J |U⁰|²/ω₀ with U_i^0 = |U⁰| ξ_i defining the normalized pattern ξ, and η_{J,W} are the
respective learning rates. When the rates are tuned such that η_J = η_W βγ/(α² + ω₀²)
and when ω = ω₀, we have M_{ij} = 2J₀ Ã(ω₀) ξ_i ξ_j*, a
generalization of the outer-product learning rule to the complex patterns ξ from
the Hopfield-Hebb form for real-valued patterns. For learning multiple patterns ξ^μ,
μ = 1, 2, ..., the learned weights are simply sums of contributions from individual
patterns like Eqns. (9) with ξ⁰ replaced by ξ^μ.
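The outer-product structure can be checked numerically. The sketch below uses assumed parameter values and an assumed value for Ã(ω₀); it verifies that with the tuned rate ratio the effective coupling M(ω₀) = J − i(βW − αJ)/ω₀ collapses onto a single complex outer product of the stored pattern, and that a pattern with nonuniform phases makes the learned J asymmetric when the kernel has an odd (Im Ã ≠ 0) component:

```python
import numpy as np

# Sketch with assumed parameters: build J and W as outer products of a
# complex pattern and check that, with eta_W/eta_J = (alpha^2 + w0^2)/(beta*gamma),
# the effective coupling M(w0) = J - i(beta*W - alpha*J)/w0 reduces to a
# single complex outer product.
rng = np.random.default_rng(0)
N = 8
alpha, beta, gamma = 50.0, 400.0, 400.0
w0, J0 = 2 * np.pi * 40.0, 0.3
a = 0.7 - 0.4j                       # assumed value of the kernel transform at w0

xi = np.exp(1j * rng.uniform(0, 2 * np.pi, N)) / np.sqrt(N)  # unit-norm pattern
Z = np.outer(xi, xi.conj())

J = 2 * J0 * (a * Z).real
W = 2 * ((alpha**2 + w0**2) / (beta * gamma)) * J0 * gamma * ((a / (alpha - 1j * w0)) * Z).real
M = J - 1j * (beta * W - alpha * J) / w0

print(np.allclose(M, 2 * J0 * a * Z))   # outer-product form recovered -> True
print(np.allclose(J, J.T))              # phase differences + Im(a) != 0 make J asymmetric -> False
```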
2.2 Recall phase 
We return to the single-pattern problem and study the simple case when η_J =
η_W βγ/(α² + ω₀²). Consider first an input pattern δI = ξ e^{−iωt} + c.c. that matches
the stored pattern exactly (ξ = ξ⁰), but possibly oscillating at a different frequency.
We then find, using Eqns. (9) in Eqn. (3), the (positive-frequency) response

  u⁺ = (ω + iα) ξ e^{−iωt} / { 2αω − J₀(ω + ω₀) Ã′(ω₀) + i[α² + βγ − J₀(ω + ω₀) Ã″(ω₀) − ω²] } ,   (10)

where Ã′(ω₀) = Re Ã(ω₀) and Ã″(ω₀) = Im Ã(ω₀). For strong response at ω = ω₀,
we require

  ω₀ = √(α² + βγ − 2J₀ω₀ Ã″(ω₀)) ,   2J₀ Ã′(ω₀) ≲ 2α .   (11)

This means (1) the resonance frequency ω₀ is determined by Ã″, (2) the effective
damping 2α − 2J₀Ã′ should be small, and (3) deviation of ω from ω₀ reduces the
response.
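The tuning curve implied by this linear response can be sketched directly. The parameter values below are illustrative assumptions, chosen so that the resonance condition holds at ω₀ with a small but positive effective damping:

```python
import math

# Sketch of the linear-response magnitude |u+| versus drive frequency w.
# The even part of the kernel (Ap = Re A~) reduces the effective damping;
# the odd part (App = Im A~) shifts the resonance. All numbers assumed.
alpha = 50.0
w0 = 2 * math.pi * 40.0
Ap, App = 1.0, 0.0            # assumed Re/Im of the kernel transform at w0
J0 = 45.0                     # learned coupling strength, 2*J0*Ap < 2*alpha fails? no: keeps damping positive
bg = w0**2 - alpha**2         # beta*gamma chosen so the resonance sits at w0

def amp(w, J0=J0):
    den = (2 * alpha * w - J0 * (w + w0) * Ap) \
          + 1j * (alpha**2 + bg - J0 * (w + w0) * App - w**2)
    return abs(w + 1j * alpha) / abs(den)

print(amp(w0) > 5 * amp(0.8 * w0))     # sharp peak at the imprinted frequency -> True
print(amp(w0) > 5 * amp(w0, J0=0.0))   # learning amplifies the matched response -> True
```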
It is instructive to consider the case where the width of the time window for synaptic
change is small compared with the oscillation period. Then we can expand Ã(ω₀)
in ω₀:

  Ã′(ω₀) ≈ ∫ dτ A(τ) ≡ a₀ ,   Ã″(ω₀) ≈ −ω₀ ∫ dτ τ A(τ) ≡ −ω₀ a₁ .   (12)

In particular, A(τ) = δ(τ) gives a₀ = 1 and a₁ = 0 and the conventional Hebbian
learning [5]. Experimentally, a₁ > 0, implying a resonant frequency greater than the
intrinsic local frequency √(α² + βγ)
obtained in the absence of long-range coupling.
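For a short exponential LTP/LTD kernel (illustrative parameters), a₀ and a₁ have closed forms. The sketch below writes the kernel-induced shift as ω₀² = α² + βγ + c J₀ ω₀² a₁, with c an order-one model constant (we take c = 2 here as an assumption), and shows that a₁ > 0 pushes the resonance above √(α² + βγ):

```python
import math

# Closed-form moments of an exponential LTP/LTD kernel (illustrative):
#   a0 = ∫ A(tau) dtau,  a1 = ∫ tau A(tau) dtau.
TAU_P, TAU_M, A_P, A_M = 0.020, 0.020, 1.0, 0.5
a0 = A_P * TAU_P - A_M * TAU_M        # > 0 when overall LTP dominates LTD
a1 = A_P * TAU_P**2 + A_M * TAU_M**2  # both lobes contribute positively

alpha, bg, J0, c = 50.0, 60000.0, 2.0, 2.0   # bg = beta*gamma; c assumed
w_intrinsic = math.sqrt(alpha**2 + bg)
# Self-consistent resonance: w0^2 = alpha^2 + bg + c*J0*w0^2*a1
w0 = math.sqrt((alpha**2 + bg) / (1 - c * J0 * a1))
print(a0 > 0 and a1 > 0 and w0 > w_intrinsic)   # -> True
```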
If the drive ξ does not match the stored pattern (in phase and amplitude), the
response will consist of two terms. The first has the form of Eqn. (10) but reduced
in amplitude by an overlap factor ξ⁰* · ξ. (For convenience we use normalized
pattern vectors.) The second term is proportional to the part of ξ orthogonal to
the stored pattern. The J and W matrices do not act in this subspace, so the
frequency dependence of this term is just that of uncoupled oscillators, i.e., Eqn. 
(10) with J0 set equal to zero. This response is always highly damped and therefore 
small. 
It is straightforward to extend this analysis to multiple imprinted patterns. The
response consists of a sum of terms, one for each stored pattern. Each term has
the same structure as in the single-stored-pattern case: it
has one part for the input component parallel to the stored pattern and another
part for the component orthogonal to the stored pattern.
We note that, in this linear analysis, an input which overlaps several stored pat- 
terns will (if the imprinting and input frequencies match) evoke a resonant response 
which is a linear combination of the stored patterns. Thus, a network tuned to 
operate in a nearly linear regime is able to interpolate in forming its representation 
of the input. For categorical associative memory, on the other hand, a network has 
to work in the extreme nonlinear limit, responding with only the strongest stored 
pattern in an input mixture. As our network operates near the threshold for sponta- 
neous oscillations, we expect that it should exhibit properties intermediate between 
Figure 2: Circles show non-linear simulation results, stars show linear simulation
results, and the dotted line shows the analytical prediction for the linearized model.
A: Importance of frequency match: amplitude of the response of the output units as a
function of the frequency of the current input. The frequency of the imprinted
pattern is 41 Hz. B: Importance of amplitude and phase mismatch: amplitude of the
response as a function of the overlap between the current input and the imprinted pattern (i.e.,
|ξ⁰* · ξ|), for different presented input patterns ξ. C: Input-output relationship
when two orthogonal patterns ξ¹ and ξ² have been imprinted at the same frequency,
41 Hz. The angle of the output pattern with respect to ξ¹ is shown as a function of
the angle of the input pattern with respect to ξ¹, for many different input patterns.
these limits. We find that this is indeed the case in the simulations reported in 
the next section. From our analysis it turns out that the network behaves like a 
Hopfield memory (separate basins, without interpolation capability) for patterns
with different imprinting frequencies, but at the same time it is able to interpolate 
among patterns which share a common frequency. 
3 Simulations 
To check the validity of our linear approximation in the analysis, we performed
numerical simulations of both the non-linear equations (1,2) and the linearized ones
(3). We simulated the recall phase of a network consisting of 10 excitatory and 10
inhibitory cells. The connections J_{ij} and W_{ij} were calculated from Eqns. (9), where
we used the approximation (12) for the kernel shape Ã(ω). Parameters were set
in such a way that the selective resonance was in the 40-Hz range. In the non-linear
simulations we used different piecewise-linear activation functions for g_u(·) and g_v(·),
as shown in Fig. 1B. We chose the parameters of the functions g_u(·) and g_v(·) so that
the network equilibrium points ū_i, v̄_i were close to, but below, the high-gain region,
i.e., at the points marked with crosses in Fig. 1B.
The results confirm that when the input pattern matches the imprinted one in 
frequency, amplitude and phase, the network responds with strong resonant oscil- 
lations. However, it does not resonate if the frequencies do not match, as shown 
in the frequency tuning curve in Fig. 2A. The behavior when the two frequencies 
are close to each other differs in the linear and nonlinear cases. However, in both 
cases a sharp selectivity in frequency is observed. The dependence on the overlap 
between the input and the stored pattern is shown in Fig. 2B. The non-linear case, 
indicated by circles, should be compared with the linear case, where the amplitude 
is always linear in the overlap. In the nonlinear case, the linearity holds roughly 
only for overlaps lower than about 0.4; for larger overlaps the amplification is as 
high as for the perfect match case. This means that input patterns with an overlap
with the imprinted one greater than 0.4 lie within the basin of attraction of the
Figure 3: Frequency selectivity: response evoked on 3 of the 10 neurons. Oscillatory
patterns ξ¹e^{−iω₁t} + c.c. and ξ²e^{−iω₂t} + c.c. have been imprinted, with ξ¹ ⊥ ξ² and
ω₁ = 41 Hz, ω₂ = 63 Hz. During the learning phases the parameter a₁ of the kernel was
tuned appropriately, i.e., a₁ = 0.1 when imprinting ξ¹ and a₁ = 1.1 when imprinting ξ².
imprinted pattern. 
The response elicited when two orthogonal patterns have been imprinted with the
same frequency is shown in Fig. 2C. Let ξ¹e^{−iω₀t} + c.c. and ξ²e^{−iω₀t} + c.c. denote
the imprinted patterns, and ξe^{−iω₀t} + c.c. be the input to the trained network. In
both linear and non-linear simulations the network responds vigorously (with high-amplitude
oscillations) to the drive if ξ is in the subspace spanned by the imprinted
patterns, and fails to respond appreciably if ξ is orthogonal to that plane. When
the input pattern ξ is in the plane spanned by the stored patterns, the resonant
response u also lies in this plane. However, while in the linear case the output is
proportional to the input, in agreement with the analytical results, in the non-linear
case there are preferred directions in the stored-pattern plane. The figure
shows that, in the case simulated here, there are three stable attractors: ξ¹, ξ², and
the symmetric linear combination (ξ¹ + ξ²)/√2.
Finally, we performed linear simulations storing two orthogonal patterns ξ¹e^{−iω₁t} +
c.c. and ξ²e^{−iω₂t} + c.c. with two different imprinting frequencies. Fig. 3 shows the
good performance of the network in separating the basins of attraction in this case.
The response to a linear combination of the two patterns, (aξ¹ + bξ²)e^{−iω₂t} + c.c.,
is proportional to the part of the input whose imprinting frequency matches the
current driving frequency. Linear combinations of the two imprinted patterns are
not attractors if the two patterns do not share the same imprinting frequency.
4 Summary and Discussion 
We have presented a model of learning for memory or input representations in neural 
networks with input-driven oscillatory activity. The model structure is an abstrac- 
tion of the hippocampus or the olfactory cortex. We propose a simple generalized 
Hebbian rule, using temporal-activity-dependent LTP and LTD, to encode both 
magnitudes and phases of oscillatory patterns into the synapses in the network. Af- 
ter learning, the model responds resonantly to inputs which have been learned (or, 
for networks which operate essentially linearly, to linear combinations of learned in- 
puts), but negligibly to other input patterns. Encoding both amplitude and phase 
enhances computational capacity, for which the price is having to learn both the 
excitatory-to-excitatory and the excitatory-to-inhibitory connections. Our model 
puts constraints on the form of the learning kernel A(τ) that should be experimentally
observed; e.g., for small oscillation frequencies, it requires that the overall LTP
dominate the overall LTD, but this requirement should be modified if the stored
oscillations are of high frequency. Plasticity in the excitatory-to-inhibitory connections
(for which experimental evidence and investigation are still scarce) is required
by our model for storing phase-locked but nonsynchronized oscillation patterns.
As for the Hopfield model, we distinguish two functional phases: (1) the learning 
phase, in which the system is clamped dynamically to the external inputs and (2) 
the recall phase, in which the system dynamics is determined by both the external 
inputs and the internal interactions. 
A special property of our model in the linear regime is the following interpolation 
capability: under a given oscillation frequency, once the system has learned a set of 
representation states, all other states in the subspace spanned by the learned states 
can also evoke vigorous responses. Hippocampal place cells could employ such a 
representation. Each cell has a localised "place field", and the superposition of 
activity of several cells with nearby place fields can represent continuously-varying
position. The locality of the place fields also means that this representation is 
conservative (and thus robust), in the sense that interpolation does not extend 
beyond the spatial range of the experienced locations or to locations in between 
two learned but distant and disjoint spatial regions. 
Of course, this interpolation property is not always desirable. For instance, in cat- 
egorical memory, one does not want inputs which are linear combinations of stored 
patterns to elicit responses which are also similar linear combinations. Suitable
nonlinearity can (as we saw in the last section) enable the system to perform
categorization: one way involves storing different patterns (or, by implication, different
classes of patterns) at different frequencies. For instance, in a multimodal area,
"place fields" might be stored at one oscillation frequency, and (say) odor mem- 
ories at another. It seems likely to us that the brain may employ different kinds 
and degrees of nonlinearity in different areas or at different times to enhance the 
versatility of its computations. 
References 
[1] H Markram, J Lubke, M Frotscher, and B Sakmann, Science 275 213 (1997). 
[2] J C Magee and D Johnston, Science 275 209 (1997). 
[3] D Debanne, B H Gahwiler, and S M Thompson, J Physiol 507 237 (1998). 
[4] G Q Bi and M M Poo, J Neurosci 18 10464 (1998).
[5] Z Li and J Hertz, Network: Computation in Neural Systems 11 83-102 (2000). 
[6] Z Li and J J Hopfield, Biol Cybern 61 379-92 (1989). 
[7] M E Hasselmo, Neural Comp 5 32-44 (1993). 
