This paper aims to capture the non-linear dynamics of objects in video.


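KVAE (Kalman Variational Autoencoder) combines a Kalman filter with a VAE to model the dynamics of latent variables. The VAE encodes each frame $x_t$ into a latent encoding $a_t$, and a linear Gaussian state space model (LGSSM), for which the Kalman filter gives exact inference, describes how a latent state $z_t$ evolves over time under a control input $u_t$ and emits $a_t$:

$$p(z_t \mid z_{t-1}, u_t) = \mathcal{N}(z_t;\, A_t z_{t-1} + B_t u_t,\, Q), \qquad p(a_t \mid z_t) = \mathcal{N}(a_t;\, C_t z_t,\, R)$$

The matrices $\gamma_t = [A_t, B_t, C_t]$ are the state transition, control and emission matrices at time $t$, and $Q$ and $R$ are the covariance matrices of the process and measurement noise. Using the Kalman filter and smoother, we can compute the filtered posterior $p(z_t \mid a_{1:t}, u_{1:t})$ and the smoothed posterior $p(z_t \mid a_{1:T}, u_{1:T})$ exactly.

In the generative process, the joint density of KVAE factorizes as $p(x, a, z \mid u) = p_\theta(x \mid a)\, p_\gamma(a \mid z)\, p_\gamma(z \mid u)$. In the inference process, $\theta$ and $\gamma$ are learned to maximize the log likelihood $\log p_{\theta\gamma}(x \mid u)$, using the variational distribution $q(a, z \mid x, u) = q_\phi(a \mid x)\, p_\gamma(z \mid a, u)$. Since the posterior $p_\gamma(z \mid a, u)$ can be computed exactly, the variational lower bound can be rewritten as below.

$$\mathcal{F}(\theta, \gamma, \phi) = \mathbb{E}_{q_\phi(a \mid x)}\left[ \log \frac{p_\theta(x \mid a)}{q_\phi(a \mid x)} + \mathbb{E}_{p_\gamma(z \mid a, u)}\left[ \log \frac{p_\gamma(a \mid z)\, p_\gamma(z \mid u)}{p_\gamma(z \mid a, u)} \right] \right]$$

This lower bound can be estimated with the Monte Carlo method. $\gamma_t$ represents the dynamics, but real dynamics are not always linear with fixed matrices. Therefore, this paper proposes a dynamics parameter network, which linearly combines $K$ base matrices with weights $\alpha_t$ estimated from all the previous encodings $a_{0:t-1}$ using an LSTM: $A_t = \sum_{k=1}^{K} \alpha_t^{(k)}(a_{0:t-1})\, A^{(k)}$, and likewise for $B_t$ and $C_t$.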
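To make the two moving parts concrete, here is a minimal NumPy sketch of (1) mixing $K$ base matrices into time-dependent $A_t$, $B_t$, $C_t$ and (2) one predict/update step of a standard Kalman filter on the encoding $a_t$. It is not the authors' code: the function names `mix_matrices` and `kalman_step`, the toy dimensions, and the fixed softmax weights (standing in for the LSTM output) are all assumptions made for illustration.

```python
import numpy as np


def mix_matrices(alpha, bases):
    """Return sum_k alpha[k] * bases[k] for bases of shape (K, rows, cols)."""
    return np.tensordot(alpha, bases, axes=1)


def kalman_step(mu, Sigma, a, u, A, B, C, Q, R):
    """One predict/update step of a Kalman filter for a linear Gaussian SSM."""
    # Predict: push the previous posterior through the transition model.
    mu_pred = A @ mu + B @ u
    Sigma_pred = A @ Sigma @ A.T + Q
    # Update: correct the prediction with the new encoding a.
    S = C @ Sigma_pred @ C.T + R              # innovation covariance
    K = np.linalg.solve(S, C @ Sigma_pred).T  # Kalman gain (S is symmetric)
    mu_new = mu_pred + K @ (a - C @ mu_pred)
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_pred
    return mu_new, Sigma_new


# Toy sizes, assumed for illustration only.
n_bases, dz, da, du = 3, 4, 2, 1
rng = np.random.default_rng(0)
A_bases = rng.normal(scale=0.1, size=(n_bases, dz, dz))
B_bases = rng.normal(scale=0.1, size=(n_bases, dz, du))
C_bases = rng.normal(scale=0.1, size=(n_bases, da, dz))
Q, R = 0.1 * np.eye(dz), 0.1 * np.eye(da)

# Hypothetical stand-in for the LSTM: fixed logits passed through a softmax.
logits = np.array([0.2, -1.0, 0.5])
alpha = np.exp(logits) / np.exp(logits).sum()

A_t, B_t, C_t = (mix_matrices(alpha, M) for M in (A_bases, B_bases, C_bases))
mu, Sigma = kalman_step(
    np.zeros(dz), np.eye(dz), rng.normal(size=da), np.zeros(du),
    A_t, B_t, C_t, Q, R,
)
```

In the actual model the weights would be recomputed at every time step from the previous encodings, so the filter stays linear within a step while the overall dynamics become non-linear.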


KVAE performed better at imputing missing data in the bouncing ball experiment and achieved a higher ELBO in the pendulum experiment.


Good to know about the Kalman filter.

Fraccaro, Marco, et al. “A disentangled recognition and nonlinear dynamics model for unsupervised learning.” Advances in Neural Information Processing Systems. 2017.