Open Access



A Novel Mixed Precision Distributed TPU GAN for Accelerated Learning Curve

Aswathy Ravikumar, Harini Sriraman*

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, 600127, India

* Corresponding Author: Harini Sriraman.

Computer Systems Science and Engineering 2023, 46(1), 563-578.


Deep neural networks are gaining importance and popularity in applications and services. Because of the enormous number of learnable parameters and the size of the datasets, training neural networks is computationally costly. Parallel and distributed computation strategies are used to accelerate this training process. Generative Adversarial Networks (GANs) are a recent achievement in deep learning. These generative models are computationally expensive because a GAN consists of two neural networks and trains on enormous datasets. Typically, a GAN is trained on a single server. The unique properties of GANs, such as their enormous computation stages with non-traditional convolution layers, challenge conventional deep learning accelerator designs. This work addresses the problem of distributing GAN training so that it can run on datasets distributed over many TPUs (Tensor Processing Units). Distributed training accelerates the learning process and decreases computation time. In this paper, the GAN is accelerated on a distributed multi-core TPU using a synchronous data-parallel model. To accelerate the GAN adequately, data-parallel SGD (Stochastic Gradient Descent) is implemented on the multi-core TPU using distributed TensorFlow with mixed precision, bfloat16, and XLA (Accelerated Linear Algebra). The study was conducted on the MNIST dataset with batch sizes varying from 64 to 512 for 30 epochs, using distributed SGD on a TPU v3 with a 128 × 128 systolic array. A large-batch technique is implemented in bfloat16 to decrease the storage cost and speed up floating-point computations. The accelerated learning curves for the generator and discriminator networks are obtained. Increasing the batch size from 64 to 512 on the multi-core TPU reduced the training time by 79%.
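The synchronous data-parallel SGD and bfloat16 ideas named in the abstract can be illustrated with a small sketch. This is not the authors' TensorFlow/TPU implementation: it is plain standard-library Python on a toy one-parameter model, and every name in it (`to_bfloat16`, `local_gradient`, `sync_step`, the 8-replica count, the learning rate) is a hypothetical chosen for illustration. It assumes equal-sized shards, in which case the mean of the per-replica gradients equals the full-batch gradient.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    bfloat16 keeps the top 16 bits of an IEEE float32, so we add a rounding
    bias and clear the low 16 bits. NaN/inf handling is omitted (assumption).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + bias) >> 16) << 16) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def local_gradient(w, shard):
    """Gradient of the mean squared error of y ~ w * x on one replica's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def sync_step(w, batch, num_replicas, lr, cast=lambda v: v):
    """One synchronous data-parallel SGD step.

    The batch is split into equal shards, each (simulated) replica computes a
    gradient using the weight cast to the compute precision, and the gradients
    are averaged (the role an all-reduce plays) before one shared update of
    the full-precision master weight.
    """
    size = len(batch) // num_replicas
    shards = [batch[i * size:(i + 1) * size] for i in range(num_replicas)]
    # On real hardware these run in parallel, one per core; here they are sequential.
    grads = [local_gradient(cast(w), shard) for shard in shards]
    return w - lr * sum(grads) / num_replicas


def make_batch(n):
    """Toy data for the target function y = 3x (hypothetical, not MNIST)."""
    return [(i / n, 3.0 * (i / n)) for i in range(1, n + 1)]


if __name__ == "__main__":
    batch = make_batch(64)  # 64 is the smallest batch size in the study
    w = 0.0
    for _ in range(50):
        w = sync_step(w, batch, num_replicas=8, lr=0.5, cast=to_bfloat16)
    print("learned weight with bfloat16 compute:", w)
```

Keeping the master weight in full precision while casting only the compute copy is one common way to set up mixed precision; the sketch uses it to show why bfloat16 trades a small loss of resolution for cheaper storage and arithmetic.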


Cite This Article

APA Style
Ravikumar, A., & Sriraman, H. (2023). A novel mixed precision distributed TPU GAN for accelerated learning curve. Computer Systems Science and Engineering, 46(1), 563-578.
Vancouver Style
Ravikumar A, Sriraman H. A novel mixed precision distributed TPU GAN for accelerated learning curve. Comput Syst Sci Eng. 2023;46(1):563-578.
IEEE Style
A. Ravikumar and H. Sriraman, "A Novel Mixed Precision Distributed TPU GAN for Accelerated Learning Curve," Comput. Syst. Sci. Eng., vol. 46, no. 1, pp. 563-578, 2023.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.