One-dimensional population dynamics in LIP

In my last post, I wrote about low-dimensional population dynamics in higher cortical areas. A particularly striking example of this type of low-dimensional dynamics is discussed in a beautiful paper by Ganguli et al. (2008).

They set out to explain a curious phenomenon observed in a go-nogo task (Bisley & Goldberg, 2003; Bisley & Goldberg, 2006). Briefly, the task in these papers was to plan a saccade to a target location and then either execute or hold off the saccade depending on a probe signal. Planning a saccade to a location enhances sensitivity at that location. This is a top-down attentional effect. In addition, in half of the trials they flashed a task-irrelevant distracter (after the target) and observed that sensitivity was transiently enhanced at the distracter location, suggesting that the distracter, although task-irrelevant, nonetheless involuntarily attracted attention. This is a bottom-up attentional effect. This bottom-up attentional enhancement was only temporary: the attentional advantage switched back to the saccade target location about 700 ms after the distracter was flashed.

They recorded from single neurons in LIP while monkeys performed this task and found that neural activity in single neurons correlated nicely with the sensitivity pattern observed in the task. In particular, when the target location was inside the receptive field of a neuron, there was a delay period activity above the spontaneous (baseline) level. When the distracter was inside the receptive field of the neuron, on the other hand, there was a large transient increase in activity which then quickly decayed back to the spontaneous level. Visually, the response of a single representative neuron looked like this:

Response of a single neuron when the target was in its receptive field (blue), or when the distracter was in its receptive field (red). Figure from Ganguli et al. (2008).

The delay activity for target location (blue) was more or less constant, so it can be represented by a single number, D. The transient activity in response to the distracter (red) can be fit by an exponential V \exp(-kt) (black line) with maximum activity V and decay constant k. Now the two puzzling observations were as follows:

1) When the activity in response to the target and the activity in response to the distracter were overlaid on top of each other (as in the figure above), the crossing time (i.e. the time when the two curves intersected with each other) was very similar for different neurons despite a large amount of variation in the parameters D, V and k across neurons, i.e. there was a common crossing time t_c such that V_i \exp(-k_i t_c) \approx D_i for all neurons i. Note that because of the heterogeneity in D, V and k, there is no a priori reason to expect the neurons to have the same crossing time.

2) Even more remarkably, neurons with completely different receptive fields also had very similar t_c values.
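To make observation 1 concrete, here is a minimal sketch of the crossing-time calculation. Setting V \exp(-k t_c) = D and solving gives t_c = \ln(V/D)/k. All numbers below are made up for illustration (they are not data from the paper), and the shared value t_c = 0.7 s is simply an assumed value.

```python
import numpy as np

def crossing_time(D, V, k):
    """Solve V * exp(-k * t) = D for t; assumes V > D > 0 and k > 0."""
    return np.log(np.asarray(V) / np.asarray(D)) / np.asarray(k)

# Hypothetical parameters for four neurons (illustrative, not data from the paper).
V = np.array([80.0, 120.0, 50.0, 95.0])   # peak transient rates (spikes/s)
k = np.array([2.0, 3.0, 1.5, 2.5])        # decay constants (1/s)

# Unconstrained delay rates: crossing times differ from neuron to neuron.
D_free = np.array([30.0, 30.0, 30.0, 30.0])
t_free = crossing_time(D_free, V, k)

# Delay rates tied to V and k through a shared t_c (0.7 s is an assumed value).
t_c = 0.7
D_tied = V * np.exp(-k * t_c)
t_tied = crossing_time(D_tied, V, k)

print("unconstrained crossing times:", np.round(t_free, 3))
print("constrained crossing times:  ", np.round(t_tied, 3))
```

The point of the toy example is that a common crossing time requires D, V and k to be tied together in a specific way for every neuron, which is why the observation calls for an explanation.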

How do Ganguli et al. (2008) explain these observations? Here’s their basic idea. Imagine the vector of neural responses tracing out a trajectory in an N-dimensional space while the monkey is performing a single trial of the task. Let’s denote average spontaneous activity of neurons by the vector S. Then V (transient activity in response to the distracter), D (delay activity) and S (spontaneous activity) can be thought of as points in this N-dimensional space, all three being vectors of firing rates.

Observation 1 above then just means that as the neural trajectory travels from V to S during a trial with a distracter, it crosses the point D. In principle, the shape of the trajectory can be arbitrarily complex (the only requirement is that it cross D on its way to S), but Ganguli et al. make the further assumption that it has a particularly simple shape. Specifically, they assume that after an initial period of high-dimensional dynamics, the trajectory quickly settles into a one-dimensional subspace (i.e. a straight line). They make this assumption because (i) it makes a number of predictions that are borne out by the data (I won’t go into these predictions here); (ii) it makes it easy to explain the common crossing time t_c despite a large degree of heterogeneity in D, V and k among neurons and for patches with different receptive fields. Note that one-dimensional dynamics implies, among other things, that the single direction in which the trajectory travels should correspond to the vector S (or equivalently to D) and that S and D should just be scaled versions of each other.

What is the condition for one-dimensional dynamics? Let’s think about a small patch of neurons in LIP with similar receptive fields. Assuming linear (but not necessarily one-dimensional) dynamics for this patch of neurons, the properties of the dynamics are determined entirely by the eigenvalues of the recurrent connectivity matrix between the neurons, W. The dynamics will be one-dimensional only if there is a single dominant eigenvalue close to 1 and the rest of the eigenvalues are all close to 0. In this case, all the eigenmodes with eigenvalues close to 0 will quickly decay and the system will slowly evolve along the eigenmode with the dominant eigenvalue. It turns out that a large class of connectivity matrices satisfies this property. In particular, if W is sparse and random with net excitatory connectivity, there is a single dominant eigenvalue and the remaining eigenvalues cluster around 0.
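Here is a small sketch of this collapse onto a single eigenmode. It rests on assumptions of mine rather than on the paper's model: I use leaky linear dynamics \tau dr/dt = -r + W r integrated with the Euler method, and a hand-built W with one eigenvalue near 0.9 plus weak random coupling. The quantity tracked is the cosine between the population vector r and the dominant eigenvector of W.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200

# Hypothetical connectivity: one dominant mode (eigenvalue ~0.9) plus weak random coupling.
u = rng.random(N) + 0.5
u /= np.linalg.norm(u)
W = 0.9 * np.outer(u, u) + 0.05 * rng.standard_normal((N, N)) / np.sqrt(N)

# Dominant (right) eigenvector of W.
eigvals, eigvecs = np.linalg.eig(W)
top = np.argmax(eigvals.real)
v = eigvecs[:, top].real
v /= np.linalg.norm(v)

def alignment(r):
    """Absolute cosine of the angle between r and the dominant eigenvector."""
    return abs(r @ v) / np.linalg.norm(r)

# Euler integration of the assumed dynamics: tau * dr/dt = -r + W r.
tau, dt = 1.0, 0.01
r = rng.random(N)            # arbitrary positive initial firing rates
align = [alignment(r)]
for _ in range(500):         # 5 time constants
    r = r + (dt / tau) * (-r + W @ r)
    align.append(alignment(r))

print("alignment at start:", round(align[0], 4))
print("alignment at end:  ", round(align[-1], 4))
```

In this toy setting the modes with eigenvalues near 0 decay on a timescale of about \tau, while the dominant mode decays on a timescale of about \tau/(1 - 0.9), so the trajectory ends up almost entirely along one direction.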

Suppose, for example, each connection w_{ij} is either 0 (with probability 1-p) or a random number divided by N, where the random number is drawn from a normal distribution with mean \mu_w and variance \sigma^2_w (N is the number of neurons; this normalization is necessary to ensure stability). Then it turns out that W has a single dominant eigenvalue near p\mu_w and a disk of eigenvalues centered at 0 with a radius of approximately R = \sqrt{\frac{p\sigma_w^2 + \mu^2_w p(1-p)}{N}}. Below is a picture of the eigenvalue spectra of 3 random matrices W for different N values (p=0.1, \mu_w = 8 and \sigma^2_w = 4 in all cases). Note how the cloud of eigenvalues around 0 shrinks as N grows, because R scales as 1/\sqrt{N}:

Eigenvalue spectra of 3 random matrices for different N values.
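A spectrum of this kind can be reproduced with a few lines of NumPy. This is a sketch under the sampling scheme described above, with p=0.1, \mu_w = 8 and \sigma^2_w = 4; the N values, the seed and the function names are my own choices, not necessarily the ones behind the figure.

```python
import numpy as np

def sparse_random_W(N, p=0.1, mu_w=8.0, sigma2_w=4.0, rng=None):
    """Each entry is 0 with probability 1-p, else Normal(mu_w, sigma2_w) / N."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random((N, N)) < p
    weights = rng.normal(mu_w, np.sqrt(sigma2_w), size=(N, N))
    return mask * weights / N

def bulk_radius(N, p=0.1, mu_w=8.0, sigma2_w=4.0):
    """Approximate radius R of the eigenvalue disk around 0."""
    return np.sqrt((p * sigma2_w + mu_w**2 * p * (1 - p)) / N)

rng = np.random.default_rng(1)
results = []
for N in (100, 400, 800):
    eig = np.linalg.eigvals(sparse_random_W(N, rng=rng))
    mods = np.sort(np.abs(eig))[::-1]          # eigenvalue moduli, largest first
    results.append((N, mods[0], mods[1], bulk_radius(N)))
    print(f"N={N}: dominant={mods[0]:.3f}, "
          f"largest bulk modulus={mods[1]:.3f}, R={bulk_radius(N):.3f}")
```

The dominant eigenvalue should sit near p\mu_w = 0.8 for every N, while the second-largest modulus should track R and shrink as N grows.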

Now, the great thing about this is that this type of eigenvalue spectrum doesn’t depend on the detailed structure of W. As long as W is sparse and has net excitatory connectivity, the eigenvalue spectrum will look like the ones above and the dynamics is then guaranteed to be one-dimensional. Heuristically, Ganguli et al. (2008) motivate this result by thinking of W as a perturbation of the mean matrix M (a uniform matrix with all entries equal to (1/N) p\mu_w) by a random matrix J whose entries are drawn from a distribution with zero mean and variance \sigma^2 = (p \sigma^2_w + \mu^2_w p (1-p))/N^2. M has a single non-zero eigenvalue at p\mu_w, with the remaining eigenvalues all equal to 0. Roughly speaking, this non-zero eigenvalue corresponds to the dominant eigenvalue of W. By Girko’s circular law, J, on the other hand, has eigenvalues uniformly distributed within a disk centered at 0 with radius R = \sqrt{N}\sigma. Again, roughly speaking, these correspond to the cloud of eigenvalues around 0 in W.

If we now calculate the crossing time t_c (see the supplementary material), it can be seen that, roughly speaking, t_c depends only on p and \mu_w and doesn’t depend on the detailed structure of W. This means that if these gross statistical properties of the connectivity are similar in two different patches of neurons with different receptive fields, the crossing time will be similar despite differences in their receptive fields and their precise connectivity. This explains observation 2 above. Intuitively, the crossing time is determined by the dynamics of the dominant eigenmode, which is in turn independent of the detailed structure of W, as we have seen.
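As a rough illustration of that intuition, the sketch below draws several independent connectivity matrices with the same p and \mu_w (standing in for different patches) and compares the time constant of their dominant eigenmode. It does not reproduce the paper's expression for t_c. It assumes leaky linear dynamics \tau dr/dt = -r + W r, under which a mode with eigenvalue \lambda decays with time constant \tau/(1-\lambda). The value of \tau, the seeds and the function name are my own choices.

```python
import numpy as np

def sparse_random_W(N, p, mu_w, sigma2_w, rng):
    """Each entry is 0 with probability 1-p, else Normal(mu_w, sigma2_w) / N."""
    mask = rng.random((N, N)) < p
    return mask * rng.normal(mu_w, np.sqrt(sigma2_w), size=(N, N)) / N

N, p, mu_w, sigma2_w = 400, 0.1, 8.0, 4.0
tau = 1.0  # assumed single-neuron time constant (arbitrary units)

# Five independent "patches" sharing the same connectivity statistics.
slow_time_constants = []
for seed in range(5):
    W = sparse_random_W(N, p, mu_w, sigma2_w, np.random.default_rng(seed))
    lam = np.max(np.linalg.eigvals(W).real)      # dominant eigenvalue
    slow_time_constants.append(tau / (1.0 - lam))

print("slow time constants:", np.round(slow_time_constants, 2))
print("prediction from p * mu_w alone:", tau / (1.0 - p * mu_w))
```

The individual matrices differ entry by entry, yet their slow time constants should all land close to the value predicted from p\mu_w alone.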

This is such a general result that one would expect to see one-dimensional (or at least low-dimensional) dynamics not just in higher cortical areas, but everywhere in cortex. However, a possible caveat is that, especially in sensory areas, even in small patches of cortex, connectivity doesn’t seem to be completely random (e.g. Song et al., 2005): there are things like network motifs, distance-dependent connectivity or preferential connectivity between neurons with similar stimulus preferences. This makes the matrix W more structured than the ones considered by Ganguli et al. (2008), in which case the eigenvalue spectrum will be different, and the dynamics is no longer guaranteed to be one-dimensional or even low-dimensional. There is, in fact, some recent work on dynamics in recurrent networks with (at least partially) structured connectivity (e.g. Litwin-Kumar & Doiron, 2012; Ahmadian, Fumarola & Miller, 2013). I’m making a mental note here to write about this stuff in a later post (hopefully soon).