PyTorch implementation of Multimodal Unsupervised Image-to-Image Translation (MUNIT).
My impression of this paper changed a lot from when I first read it.
- Holding 8+ models at once took all of my memory, so I had to train with a batch size below 4.
- An 8-dimensional style code vs. a 256 * H * W content code?? Really???
- Seems like "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization" should take the credit for the good quality of the samples.
- Would only work on images. Need to find another model for audio.
- DATASET PLEASE!!
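
To make the style-vs-content point and the AdaIN point above concrete, here is a rough NumPy sketch of adaptive instance normalization driven by an 8-dimensional style code. It is not the code from my implementation or the authors': the 64 x 64 spatial size is an assumed example, and the single random linear layer `proj` is a hypothetical stand-in for whatever network maps the style code to the AdaIN parameters.

```python
import numpy as np


def adain(content, gamma, beta, eps=1e-5):
    """Adaptive instance normalization for a single sample.

    content: (C, H, W) feature map.
    gamma, beta: (C,) per-channel scale and shift derived from the style.
    """
    # Per-channel statistics over the spatial dimensions.
    mean = content.mean(axis=(1, 2), keepdims=True)
    std = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mean) / (std + eps)
    # Replace the channel statistics with the style-dependent ones.
    return gamma[:, None, None] * normalized + beta[:, None, None]


rng = np.random.default_rng(0)

C, H, W = 256, 64, 64   # H and W are assumed values for illustration
style_dim = 8

content = rng.standard_normal((C, H, W))   # stand-in content code
style = rng.standard_normal(style_dim)     # stand-in style code

# Hypothetical mapping from the style code to AdaIN parameters:
# one random linear layer instead of a trained network.
proj = rng.standard_normal((2 * C, style_dim))
params = proj @ style
gamma, beta = params[:C], params[C:]

out = adain(content, gamma, beta)

# How many content values there are per style value.
ratio = content.size // style.size
print(out.shape, ratio)
```

The style code only ever sets a per-channel mean and scale here, while all of the spatial structure stays in the content map, which is why the size gap between the two is so large.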