Conditional WGANs

Generative adversarial networks (GANs) are among the most popular generative modeling techniques.[1] An improved variant, the Wasserstein GAN (WGAN), involves the simultaneous training of two neural networks: a generator $\mathcal{G}(\mathbf{z})$ and a critic $\mathcal{C}(\mathbf{x})$, where $\mathbf{z} \sim \mathbb{P}_{z}(\mathbf{z})$ is a $d$-dimensional random noise vector and $\mathbf{x}$ is the high-dimensional data point we wish to generate, such as a molecular configuration. The generator $\mathcal{G}(\mathbf{z})$ learns the distribution of the data, e.g., molecular configurations $\mathbf{x}$ sampled from configurational phase space via molecular simulation, while the critic $\mathcal{C}(\mathbf{x})$ is trained to estimate the Wasserstein distance between the training-data distribution and the distribution of samples produced by $\mathcal{G}(\mathbf{z})$.

By training the two networks simultaneously in a process called adversarial training, $\mathcal{G}(\mathbf{z})$ and $\mathcal{C}(\mathbf{x})$ can in principle approach a Nash equilibrium, akin to a two-player game in which neither player can improve its performance any further.[2] In practice, the parameters of the critic $\mathcal{C}(\mathbf{x})$ are optimized to sharpen its estimate of the Wasserstein distance between real samples $\mathbf{x}$ and synthetic samples $\tilde{\mathbf{x}}$ generated by $\mathcal{G}(\mathbf{z})$, while the parameters of the generator $\mathcal{G}(\mathbf{z})$ are optimized so that, starting from random $d$-dimensional noise $\mathbf{z}$, it produces synthetic samples $\tilde{\mathbf{x}}$ that increasingly resemble true molecular configurations $\mathbf{x}$.
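To make the adversarial training concrete, the following is a minimal WGAN sketch in PyTorch. The network sizes, optimizer settings, and stand-in data are illustrative assumptions, not taken from any specific implementation; it uses the original WGAN weight-clipping scheme to enforce the critic's Lipschitz constraint.

```python
# Minimal WGAN training sketch (illustrative dimensions and hyperparameters).
import torch
import torch.nn as nn

D_NOISE, D_DATA = 64, 128           # dims of noise z and data point x (assumed)

generator = nn.Sequential(          # G(z): noise -> synthetic sample x~
    nn.Linear(D_NOISE, 256), nn.ReLU(),
    nn.Linear(256, D_DATA),
)
critic = nn.Sequential(             # C(x): sample -> scalar score
    nn.Linear(D_DATA, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

opt_g = torch.optim.RMSprop(generator.parameters(), lr=5e-5)
opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

def critic_step(x_real):
    """One critic update: maximize E[C(x)] - E[C(G(z))]."""
    z = torch.randn(x_real.size(0), D_NOISE)
    x_fake = generator(z).detach()   # freeze G during the critic update
    loss_c = critic(x_fake).mean() - critic(x_real).mean()
    opt_c.zero_grad()
    loss_c.backward()
    opt_c.step()
    # Weight clipping keeps the critic approximately 1-Lipschitz (original
    # WGAN recipe); a gradient penalty is a common alternative.
    for p in critic.parameters():
        p.data.clamp_(-0.01, 0.01)

def generator_step(batch_size):
    """One generator update: maximize E[C(G(z))]."""
    z = torch.randn(batch_size, D_NOISE)
    loss_g = -critic(generator(z)).mean()
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

# Typical loop: several critic updates per generator update.
x_real = torch.randn(32, D_DATA)    # stand-in for real training configurations
for _ in range(5):
    critic_step(x_real)
generator_step(32)
```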

For a brief overview of the conditional variant of WGAN and its application to sampling intermediate conformations of large multi-molecular systems, see "Data-driven prediction of αIIbβ3 integrin activation paths using manifold learning and deep generative modeling" and its source code at https://github.com/Ferg-Lab/integrin_molgen.
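A common way to build the conditional variant is to feed a condition vector to both networks by simple concatenation, so the generator samples configurations consistent with a specified condition (e.g., a point along a collective variable). The sketch below illustrates only this conditioning pattern; the dimensions and architecture are assumptions and do not reflect the referenced codebase.

```python
# Conditional extension of the WGAN sketch above: both networks take a
# condition vector c alongside their usual input (illustrative dimensions).
import torch
import torch.nn as nn

D_NOISE, D_COND, D_DATA = 64, 2, 128

cond_generator = nn.Sequential(     # G(z, c): [z; c] -> synthetic sample
    nn.Linear(D_NOISE + D_COND, 256), nn.ReLU(),
    nn.Linear(256, D_DATA),
)
cond_critic = nn.Sequential(        # C(x, c): [x; c] -> scalar score
    nn.Linear(D_DATA + D_COND, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

z = torch.randn(32, D_NOISE)
c = torch.rand(32, D_COND)          # e.g. target collective-variable values
x_fake = cond_generator(torch.cat([z, c], dim=1))
score = cond_critic(torch.cat([x_fake, c], dim=1))
```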

References

[1] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in Neural Information Processing Systems. 2014;27.

[2] Goodfellow I. NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160. 2016.