Interpretable embeddings from molecular simulations using Gaussian mixture variational autoencoders

dc.contributor.coauthor	Bereau, Tristan
dc.contributor.coauthor	Rudzinski, Joseph F.
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.kuauthor	Bozkurt, Yasemin
dc.contributor.schoolcollegeinstitute	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned	2024-11-09T23:59:10Z
dc.date.issued	2020
dc.description.abstract	Extracting insight from the enormous quantity of data generated from molecular simulations requires the identification of a small number of collective variables whose corresponding low-dimensional free-energy landscape retains the essential features of the underlying system. Data-driven techniques provide a systematic route to constructing this landscape, without the need for extensive a priori intuition into the relevant driving forces. In particular, autoencoders are powerful tools for dimensionality reduction, as they naturally force an information bottleneck and, thereby, a low-dimensional embedding of the essential features. While variational autoencoders ensure continuity of the embedding by assuming a unimodal Gaussian prior, this is at odds with the multi-basin free-energy landscapes that typically arise from the identification of meaningful collective variables. In this work, we incorporate this physical intuition into the prior by employing a Gaussian mixture variational autoencoder (GMVAE), which encourages the separation of metastable states within the embedding. The GMVAE performs dimensionality reduction and clustering within a single unified framework, and is capable of identifying the inherent dimensionality of the input data, in terms of the number of Gaussians required to categorize the data. We illustrate our approach on two toy models, alanine dipeptide, and a challenging disordered peptide ensemble, demonstrating the enhanced clustering effect of the GMVAE prior compared to standard VAEs. The resulting embeddings appear to be promising representations for constructing Markov state models, highlighting the transferability of the dimensionality reduction from static equilibrium properties to dynamics.
dc.description.indexedby	WOS
dc.description.indexedby	Scopus
dc.description.issue	1
dc.description.openaccess	YES
dc.description.publisherscope	International
dc.description.sponsoredbyTubitakEu	N/A
dc.description.sponsorship	Scientific and Technological Research Council of Turkey, TUBITAK-BIDEB, under the 2214-A programme
dc.description.sponsorship	Emmy Noether program of the Deutsche Forschungsgemeinschaft (DFG)
dc.description.sponsorship	Long program Machine Learning for Physics and the Physics of Learning at the Institute for Pure and Applied Mathematics (IPAM). The authors thank Kiran H Kanekal and Omar Valsson for critical reading of the manuscript. JFR is grateful to the BiGmax consortium and participants of the BiGmax Big Data Summer School for insightful discussions. YBV acknowledges foreign collaborative research study support by The Scientific and Technological Research Council of Turkey, TUBITAK-BIDEB, under the 2214-A programme. TB acknowledges financial support by the Emmy Noether program of the Deutsche Forschungsgemeinschaft (DFG) and the long program Machine Learning for Physics and the Physics of Learning at the Institute for Pure and Applied Mathematics (IPAM).
dc.description.volume	1
dc.identifier.doi	10.1088/2632-2153/ab80b7
dc.identifier.eissn	2632-2153
dc.identifier.issn	N/A
dc.identifier.quartile	Q1
dc.identifier.scopus	2-s2.0-85087592698
dc.identifier.uri	https://doi.org/10.1088/2632-2153/ab80b7
dc.identifier.uri	https://hdl.handle.net/20.500.14288/15571
dc.identifier.wos	660848300001
dc.keywords	Variational autoencoders
dc.keywords	Dimensionality reduction
dc.keywords	Clustering
dc.keywords	Markov state models
dc.keywords	Molecular dynamics simulations
dc.language.iso	eng
dc.publisher	IOP Publishing Ltd
dc.relation.ispartof	Machine Learning: Science and Technology
dc.subject	Computer science
dc.subject	Artificial intelligence
dc.title	Interpretable embeddings from molecular simulations using Gaussian mixture variational autoencoders
dc.type	Journal Article
dspace.entity.type	Publication
local.contributor.kuauthor	Bozkurt, Yasemin
local.publication.orgunit1	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2	Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	434c9663-2b11-4e66-9399-c863e2ebae43

Collections

Publications without Fulltext