### The beauty of circulant matrices (and of Toeplitz matrices, to a lesser extent)

A circulant matrix is a matrix where each row (column) is a cyclic shift of the preceding row (column):

$latex C = \begin{bmatrix}

c_0 & c_1 & c_2 & \cdots & c_{n-1} \\

c_{n-1} & c_0 & c_1 & \cdots & c_{n-2} \\

\vdots & \vdots & \vdots & \ddots & \vdots \\

c_1 & c_2 & c_3 & \cdots & c_0

\end{bmatrix} $

Note that a single row (or column) completely determines the entire matrix.

The beauty of circulant matrices lies in the fact that they are all diagonalized in the Fourier basis, that is, every circulant matrix can be decomposed as:

where is the unitary discrete Fourier transform matrix, is its conjugate transpose and is the diagonal matrix of eigenvalues of . The eigenvalues can be found by taking the discrete Fourier transform of the first row of .

Here are two immediate consequences of the above fact:

- The inverse of a circulant matrix is also circulant. Indeed, if , then .
- The sum or product of two circulant matrices is also circulant: if and , then and .

So, it’s extremely easy to work with circulant matrices. Unfortunately, these properties do not generalize to Toeplitz matrices, i.e. matrices with constant diagonals. However, it turns out that if one thinks of a sequence of Toeplitz matrices defined by the entries , provided certain conditions on are satisfied, one can find (or construct) a corresponding sequence of circulant matrices such that and are asymptotically equivalent, i.e. roughly speaking they behave similarly in the limit of large and thus properties similar (but not identical, for example will not diagonalize in the Fourier basis even for large ) to the ones mentioned above for circulant matrices also hold for .

This result may be relevant to computational neuroscience, because in some cases, one wants to perform operations on covariance matrices (for example, inverting the covariance matrix of neural responses for Fisher information calculations) that are Toeplitz, but not necessarily circulant: for example, when correlations between neurons decay as a function of the difference between their preferred stimuli in non-circular stimulus dimensions (e.g. spatial frequency or spatial location). The fact that and are only asymptotically equivalent (i.e., in the large limit) is not a problem, because that’s the limit one is interested in in most cases in computational neuroscience (the limit of a large number of neurons).

Additional details about this remarkable result can be found in this nice (and very accessible) tutorial by Robert Gray.