The beauty of circulant matrices (and of Toeplitz matrices, to a lesser extent)

by Emin Orhan

A circulant matrix is a matrix where each row (column) is a cyclic shift of the preceding row (column):

C = \begin{bmatrix}    c_0 & c_1 & c_2 & \cdots & c_{n-1} \\    c_{n-1} & c_0 & c_1 & \cdots & c_{n-2} \\    \vdots & \vdots & \vdots & \ddots & \vdots \\    c_1 & c_2 & c_3 & \cdots & c_0    \end{bmatrix}

Note that a single row (or column) completely determines the entire matrix.

The beauty of circulant matrices lies in the fact that they are all diagonalized in the Fourier basis, that is, every circulant matrix C can be decomposed as:

C = U \Psi U^{\ast}

where U is the unitary discrete Fourier transform matrix, U^{\ast} is its conjugate transpose and \Psi is the diagonal matrix of eigenvalues of C. The eigenvalues can be found by taking the discrete Fourier transform of the first row of C.

Here are two immediate consequences of the above fact:

  1. The inverse of a circulant matrix C is also circulant. Indeed, if C = U \Psi U^{\ast}, then C^{-1} = U \Psi^{-1} U^{\ast}.
  2. The sum or product of two circulant matrices is also circulant: if C = U \Psi U^{\ast} and D = U \Phi U^{\ast}, then C+D = U (\Psi +\Phi) U^{\ast} and CD = DC = U \Psi \Phi U^{\ast}.

So, it’s extremely easy to work with circulant matrices. Unfortunately, these properties do not generalize to Toeplitz matrices, i.e. matrices with constant diagonals. However, it turns out that if one thinks of a sequence of Toeplitz matrices T_n defined by the 2n - 1 entries t_{-n+1:n-1}, provided certain conditions on T_n are satisfied, one can find (or construct) a corresponding sequence of circulant matrices C_n such that C_n and T_n are asymptotically equivalent, i.e. roughly speaking they behave similarly in the limit of large n and thus properties similar (but not identical, for example T_n will not diagonalize in the Fourier basis even for large n) to the ones mentioned above for circulant matrices also hold for T_n.

This result may be relevant to computational neuroscience, because in some cases, one wants to perform operations on covariance matrices (for example, inverting the covariance matrix of neural responses for Fisher information calculations) that are Toeplitz, but not necessarily circulant: for example, when correlations between neurons decay as a function of the difference between their preferred stimuli in non-circular stimulus dimensions (e.g. spatial frequency or spatial location). The fact that C_n and T_n are only asymptotically equivalent (i.e., in the large n limit) is not a problem, because that’s the limit one is interested in in most cases in computational neuroscience (the limit of a large number of neurons).

Additional details about this remarkable result can be found in this nice (and very accessible) tutorial by Robert Gray.