AttnGAN is an algorithm that generates images from text descriptions. It was developed at Microsoft’s Deep Learning Technology Center, and while text-to-image synthesis is a worthwhile goal, the results are often just plain bizarre.
The AttnGAN TRANS architecture has three variants: AttnGAN BERT, AttnGAN XL, and AttnGAN GPT.
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He
https://github.com/taoxugit/AttnGAN
PyTorch implementation for reproducing the AttnGAN results in the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research).
Dependencies
Python 2.7
PyTorch
PyTorch implementation of a modified (style-based) AttnGAN architecture that incorporates the strong latent space control provided by StyleGAN. This architecture lets one not only synthesize an image from an input text description, but also move that image along a chosen disentangled latent dimension to alter its structure at different scales (from high-level coarse styles such as pose to fine-grained styles such as background lighting).
GIF OF LATENT SPACE INTERPOLATION EXAMPLE COMING SOON
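As a rough illustration of what moving an image "in a desired disentangled dimension" means, the sketch below shifts a latent vector along a direction and interpolates between two latent vectors. It is a minimal, dependency-free sketch and not the project's code: the function names `shift_latent` and `interpolate_latents`, and the use of plain Python lists in place of tensors, are assumptions made for illustration. In a style-based generator each resulting vector would be fed to the generator to render one frame.

```python
from typing import List, Sequence


def shift_latent(w: Sequence[float], direction: Sequence[float], alpha: float) -> List[float]:
    """Move latent vector w by alpha along a direction (hypothetical helper)."""
    if len(w) != len(direction):
        raise ValueError("w and direction must have the same length")
    return [wi + alpha * di for wi, di in zip(w, direction)]


def interpolate_latents(
    w_start: Sequence[float], w_end: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` evenly spaced latent vectors from w_start to w_end, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(w_start) != len(w_end):
        raise ValueError("w_start and w_end must have the same length")
    frames = []
    for i in range(steps):
        t = i / (steps - 1)  # runs from 0.0 to 1.0
        frames.append([a + (b - a) * t for a, b in zip(w_start, w_end)])
    return frames
```

If the latent space is well disentangled, a small `alpha` along one direction should change one attribute (for example pose) while leaving the others mostly intact; whether a given direction behaves this way has to be checked empirically.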
This implementation also provides the option to use state-of-the-art transformer-based architectures from Hugging Face’s transformers library as the text encoder for Style-AttnGAN (currently only GPT-2 is supported). Among other things, these transformer-based encoders significantly improve image synthesis when the input text sequence is long.
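To make the idea of a transformer text encoder concrete, here is a minimal sketch of extracting per-token and sentence-level features from GPT-2 with the transformers library. It is not taken from the Style-AttnGAN code: the function names `encode_caption` and `mean_pool` are hypothetical, and mean pooling for the sentence feature is an assumed choice that the project may not share. Running `encode_caption` requires `torch` and `transformers` to be installed and downloads the pretrained `gpt2` weights on first use.

```python
from typing import List, Sequence


def mean_pool(word_features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token vectors into one sentence vector (assumed pooling choice)."""
    if not word_features:
        raise ValueError("need at least one token vector")
    n = len(word_features)
    return [sum(column) / n for column in zip(*word_features)]


def encode_caption(caption: str, model_name: str = "gpt2"):
    """Return (word_features, sentence_feature) for one caption (hypothetical helper)."""
    # Imported lazily so mean_pool stays usable without the heavy dependencies.
    import torch
    from transformers import GPT2Model, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2Model.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(caption, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (1, num_tokens, hidden_size); drop the batch axis.
    word_features = outputs.last_hidden_state[0].tolist()
    return word_features, mean_pool(word_features)
```

A call such as `encode_caption("a small bird with a red head")` would return one feature vector per token plus a single pooled vector. In an AttnGAN-style model the per-token features would feed the word-level attention and the pooled vector would condition the generator globally.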
Original AttnGAN paper: AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research). Thank you for your brilliant work.
AttnGAN architecture from arxiv.org/abs/1711.10485
StyleGAN generator architecture from arxiv.org/abs/1812.04948
Copied from LICENSE file (MIT License) for visibility:
Copyright for portions of project Style-AttnGAN are held by Tao Xu, 2018 as part of project AttnGAN. All other copyright for project Style-AttnGAN are held by Sidhartha Parhi, 2020. All non-data files that have not been modified by Sidhartha Parhi include the copyright notice “Copyright (c) 2018 Tao Xu” at the top of the file.
Instructions:
Dependencies
Python 3.7+
PyTorch 1.0+
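Before running anything, it can help to confirm the environment meets these requirements. The following is a small sketch using only the standard library; `check_environment` is a hypothetical helper, not part of the repository. It only checks that a `torch` package can be found, not that its version is 1.0 or later.

```python
import importlib.util
import sys
from typing import List, Tuple


def check_environment(min_python: Tuple[int, int] = (3, 7)) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = "{}.{}".format(sys.version_info[0], sys.version_info[1])
        wanted = "{}.{}".format(min_python[0], min_python[1])
        problems.append("Python {}+ is required, found {}".format(wanted, found))
    # find_spec returns None when the package cannot be located; nothing is imported.
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch (torch) is not installed")
    return problems
```

Calling `check_environment()` and printing each returned string gives a quick report; the PyTorch version itself can be inspected afterwards with `torch.__version__`.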