PyTorch/XLA 1.11 release
Cloud TPUs now support the PyTorch 1.11 release, via PyTorch/XLA integration. The release has daily automated testing for the supported models: Torchvision ResNet, FairSeq Transformer and RoBERTa, HuggingFace GLUE and LM, and Facebook Research DLRM.
On top of the underlying improvements and bug fixes in the PyTorch 1.11 release, this release adds several features and PyTorch/XLA-specific bug fixes.
New features
- Enable asynchronous RNG seed sending via the environment variable XLA_TRANSFER_SEED_ASYNC (see the configuration sketch after this list)
- Add a native torch.distributed backend (see the distributed sketch after this list)
- Introduce an eager debug mode via the environment variable XLA_USE_EAGER_DEBUG_MODE (also covered in the configuration sketch after this list)
- Add sync-free Adam and AdamW optimizers for PyTorch/XLA:GPU AMP
- Add a sync-free SGD optimizer for PyTorch/XLA:GPU AMP (see the AMP sketch after this list)
- linspace lowering
- mish lowering
- prelu lowering
- slogdet lowering
- stable sort lowering
- index_add with alpha scaling lowering (a short example using the new lowerings follows this list)
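Both debugging-related environment variables are read at process startup. Below is a minimal configuration sketch, assuming the variables are set before torch_xla is imported and that "1" is used as the enabling value; the small tensor computation is only there to exercise the runtime:

```python
import os

# Illustrative flag values; set them before importing torch_xla so the runtime picks them up.
os.environ["XLA_TRANSFER_SEED_ASYNC"] = "1"   # send RNG seeds to the device asynchronously
os.environ["XLA_USE_EAGER_DEBUG_MODE"] = "1"  # compile and execute each op eagerly to simplify debugging

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
x = torch.randn(4, 4, device=device)
print(x.sum())
```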
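A minimal sketch of the native torch.distributed backend, assuming an xmp.spawn launch on a single XLA host; the process-group arguments and the toy all_reduce are illustrative only:

```python
import torch
import torch.distributed as dist
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_backend  # registers the "xla" backend with torch.distributed
import torch_xla.distributed.xla_multiprocessing as xmp


def _mp_fn(index):
    # One process per XLA device; collectives such as all_reduce are routed
    # through the XLA backend instead of gloo/nccl.
    dist.init_process_group("xla",
                            world_size=xm.xrt_world_size(),
                            rank=xm.get_ordinal())
    device = xm.xla_device()
    t = torch.ones(2, 2, device=device) * dist.get_rank()
    dist.all_reduce(t)
    xm.mark_step()
    print(t)


if __name__ == "__main__":
    xmp.spawn(_mp_fn)
```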
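A minimal AMP sketch for an XLA:GPU device using the sync-free AdamW; the model, data, and hyperparameters are placeholders, the autocast/GradScaler usage assumes the torch_xla.amp interfaces, and the same pattern applies to the sync-free Adam and SGD variants:

```python
import torch
import torch_xla.core.xla_model as xm
from torch_xla.amp import GradScaler, autocast, syncfree

device = xm.xla_device()
model = torch.nn.Linear(128, 10).to(device)
# The sync-free optimizer avoids the host synchronization normally needed
# for the gradient scaler's inf/nan check.
optimizer = syncfree.AdamW(model.parameters(), lr=1e-3)
scaler = GradScaler()

for step in range(10):
    inputs = torch.randn(32, 128, device=device)
    targets = torch.randint(0, 10, (32,), device=device)
    optimizer.zero_grad()
    with autocast():
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    xm.mark_step()
```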
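These lowerings mean the listed ops compile directly to XLA HLO instead of falling back to a CPU execution of the aten op. A short sketch that exercises a few of them; printing the metrics report is one rough way to confirm that no aten::* fallback counters appear for these ops:

```python
import torch
import torch_xla.core.xla_model as xm
import torch_xla.debug.metrics as met

device = xm.xla_device()
x = torch.randn(8, 8, device=device)

# Ops with new native lowerings in this release.
y = torch.nn.functional.mish(x)
z = torch.linspace(0, 1, steps=16, device=device)
s, _ = torch.sort(x, stable=True)
xm.mark_step()

# Check for the absence of aten::* fallback counters for the ops above.
print(met.metrics_report())
```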
Bug fixes & improvements
- Improve torch.var performance and numerical stability on TPU
- Improve torch.pow performance
- Fix the incorrect output dtype when dividing an f32 tensor by an f64 tensor
- Fix the incorrect result of nll_loss when reduction = "mean" and the whole target is equal to ignore_index (see the sketch after this list)