Diffusion maps

Diffusion maps (dMaps) is a nonlinear dimensionality reduction technique that provides access to the intrinsic manifold of the data by embedding the data points according to their local connectivity.[1-3] In brief, it relies on a similarity metric ($S$) to define the proximity of the data points, determines their local connectivity in the high-dimensional space, and from these local relationships recovers a map of the meaningful global structure of the manifold. Such a manifold can be interpreted as a low-dimensional subspace that preserves the local relationships in the data together with the dynamics in the configurational phase space, such that neighboring configurations are connected to one another by a diffusive process or a random walk.[2]

To construct a diffusion map, a kernel, such as a Gaussian kernel, is first applied to all pairwise distances $d_{ij}$ between the data points computed with the chosen similarity metric $S$, yielding a symmetric $N\times N$ similarity matrix $\mathbf{A}$ with elements $A_{ij}$, where $N$ is the number of data points. For a Gaussian kernel, $A_{ij}=e^{-d_{ij}^{2}/2\varepsilon}$, where the kernel bandwidth $\varepsilon$ sets the radius of the local neighborhood within which data points can diffuse, i.e., are considered connected to each other. The real, symmetric similarity matrix $\mathbf{A}$ is then normalized to obtain the right stochastic Markov transition matrix $\mathbf{M}$ with elements $M_{ij} = \frac{A_{ij}}{D_{ii}}$, which capture the probability of diffusing from one point to another. Here $\mathbf{D}$ is the diagonal matrix whose elements $D_{ii} = \sum_{j}A_{ij}$ are the row sums of the similarity matrix. The diffusion map is then obtained by eigenvalue decomposition of the Markov transition matrix $\mathbf{M}$. In practice, the symmetrized matrix $\mathbf{M_{s}} = \mathbf{D}^{1/2}\mathbf{M}\mathbf{D}^{-1/2}$, which shares the eigenvalues of $\mathbf{M}$, is used for the eigenvalue decomposition because it is symmetric and therefore guaranteed to be diagonalizable.[1,2] Moreover, because $\mathbf{M}$ is a right stochastic Markov transition matrix similar to the positive-semidefinite matrix $\mathbf{M_{s}}$, its eigenvalues are real with $\lambda_{i} \in (0,1]$ and can be ordered as $\lambda_{1} = 1 \geq \lambda_{2} \geq \lambda_{3} \geq \dots \geq \lambda_{N} \approx 0$.
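The construction above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch rather than a reference implementation: the Euclidean distance stands in for the similarity metric $S$, the bandwidth $\varepsilon$ is supplied by the user, and the dense $N\times N$ matrices assume a modest number of data points.

```python
import numpy as np

def diffusion_map_spectrum(X, epsilon):
    """Build M_s from data X (N x n_features) and return its sorted spectrum."""
    # Pairwise distances d_ij under the chosen similarity metric (here: Euclidean).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Gaussian kernel: A_ij = exp(-d_ij^2 / (2 * epsilon)).
    A = np.exp(-d**2 / (2.0 * epsilon))

    # Row sums D_ii and the right stochastic Markov matrix M_ij = A_ij / D_ii.
    D = A.sum(axis=1)
    M = A / D[:, None]

    # Symmetrized matrix M_s = D^{1/2} M D^{-1/2}; it shares the eigenvalues of M.
    Ms = np.sqrt(D)[:, None] * M / np.sqrt(D)[None, :]

    # Eigendecomposition of the symmetric matrix, sorted by decreasing eigenvalue.
    evals, evecs = np.linalg.eigh(Ms)
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order], D
```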

The eigenvector $\overrightarrow{\psi_{1}}$ corresponding to the leading eigenvalue $\lambda_{1}=1$ is a trivial all-ones vector $\overrightarrow{1}$ associated with the stationary equilibrium state, a direct consequence of $\mathbf{M}$ being a Markov transition matrix whose rows sum to one.[2,3] The subsequent eigenvectors $\overrightarrow{\psi_{2}}, \overrightarrow{\psi_{3}}, \dots, \overrightarrow{\psi_{k}}$ corresponding to the next ordered eigenvalues $\lambda_{2}, \lambda_{3}, \dots, \lambda_{k}$ capture the dominant transitions or diffusion modes and constitute the leading dMap coordinates. In contrast to principal component analysis, the conventional linear dimensionality reduction technique, the dMap coordinates are not projections of the high-dimensional configurational data onto the eigenvectors but rather the eigenvectors themselves. In other words, the dMap embedding of the input data points is given by the leading nontrivial eigenvectors of $\mathbf{M_{s}}$. This also implies that an explicit mapping from the high-dimensional data space to the dMap coordinates is not directly accessible, but it can be learned with universal function approximators.[4] Further discussion of the interpretation of the dMap coordinates and their connection to the leading eigenfunctions of the backward Fokker-Planck operator, which describes a continuous diffusion process over the manifold, can be found in Ferguson et al.[3]
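Continuing the sketch above, the dMap coordinates are read off from the leading nontrivial eigenvectors. One standard convention, used here, rescales the eigenvectors of $\mathbf{M_{s}}$ by $\mathbf{D}^{-1/2}$ to recover the right eigenvectors of $\mathbf{M}$, so that the first (trivial) eigenvector becomes constant and is discarded; the function name and the random test data are purely illustrative.

```python
def diffusion_map_embedding(X, epsilon, k):
    """Embed each data point into the leading k nontrivial dMap coordinates."""
    evals, evecs, D = diffusion_map_spectrum(X, epsilon)
    psi = evecs / np.sqrt(D)[:, None]       # right eigenvectors of M
    return psi[:, 1:k + 1], evals[1:k + 1]  # drop the trivial constant mode

# Example: embed 200 random 10-dimensional points into the top 2 dMap coordinates
# (the bandwidth epsilon = 5.0 is an arbitrary illustrative choice).
X = np.random.default_rng(0).normal(size=(200, 10))
coords, spectrum = diffusion_map_embedding(X, epsilon=5.0, k=2)
```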

Implementations of dMaps are available in:

[1] Python implementation via MDAnalysis (see the usage sketch after this list): https://docs.mdanalysis.org/2.0.0/documentation_pages/analysis/diffusionmap.html

[2] C++ implementation: https://github.com/hsidky/dmaps

[3] JAX powered diffusion maps and RMSD calculations in Python: https://github.com/Ferg-Lab/dMap_JAX
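As an illustration of the MDAnalysis implementation linked above, the following is a minimal usage sketch based on its documented DiffusionMap class; the input files, atom selection, bandwidth epsilon, and number of retained coordinates are placeholder assumptions to be adapted to the system at hand.

```python
import MDAnalysis as mda
from MDAnalysis.analysis import diffusionmap

# Hypothetical topology/trajectory files; replace with your own system.
u = mda.Universe("topology.psf", "trajectory.dcd")

# Pairwise RMSD between trajectory frames is the default distance metric here.
dmap = diffusionmap.DiffusionMap(u, select="backbone", epsilon=2)
dmap.run()

print(dmap.eigenvalues[:5])             # leading eigenvalues of the transition matrix
embedding = dmap.transform(2, time=1)   # frames embedded into the top 2 dMap coordinates
```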

References

[1] Nadler B, Lafon S, Kevrekidis I, Coifman R. Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. Advances in neural information processing systems. 2005;18.

[2] Ferguson AL, Panagiotopoulos AZ, Debenedetti PG, Kevrekidis IG. Systematic determination of order parameters for chain dynamics using diffusion maps. Proceedings of the National Academy of Sciences. 2010 Aug 3;107(31):13597-602.

[3] Ferguson AL, Panagiotopoulos AZ, Debenedetti PG, Kevrekidis IG. Systematic determination of order parameters for chain dynamics using diffusion maps. Proceedings of the National Academy of Sciences. 2010 Aug 3;107(31):13597-602.

[4] Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural networks. 1989 Jan 1;2(5):359-66.