Archives

  • How to Train Really Large Models on Many GPUs?

    [PLACE-HOLDER POST, COPYRIGHT LILIAN WENG] Training large and deep neural networks is challenging, as it demands a large amount of GPU memory and a long training time. This post reviews several popular training parallelism paradigms, as well as a variety of model architecture and memory-saving designs that make it possible to train very large neural networks across a large number of GPUs.
