Code for the paper:
Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks
The paper is available at: https://royalsocietypublishing.org/doi/10.1098/rspa.2020.0334
Ameya D. Jagtap, Kenji Kawaguchi, George Em Karniadakis
We propose two approaches of locally adaptive activation functions namely, layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of activation function is achieved by introducing a scalable parameter in each layer (layer-wise) and for every neuron (neuron-wise) separately, and then optimizing it using a variant of stochastic gradient descent algorithm. In order to further increase the training speed, an activation slope-based slope recovery term is added in the loss function, which further accelerates convergence, thereby reducing the training cost. On the theoretical side, we prove that in the proposed method, the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate, and that the gradient dynamics of the proposed method is not achievable by base methods with any (adaptive) learning rates. We further show that the adaptive activation methods accelerate the convergence by implicitly multiplying conditioning matrices to the gradient of the base method without any explicit computation of the conditioning matrix and the matrix–vector product. The different adaptive activation functions are shown to induce different implicit conditioning matrices. Furthermore, the proposed methods with the slope recovery are shown to accelerate the training process.
@article{Jagtap2020,
author = {Ameya D. Jagtap and Kenji Kawaguchi and George Em Karniadakis},
title = {Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks},
journal = {Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences},
year = {2020},
doi = {10.1098/rspa.2020.0334},
}
This repository contains a simplified version of the neuron-wise locally adaptive activation functions (N-LAAF) applied to physics-informed neural networks (PINNs). As a toy example, it solves the 1-D Poisson equation with a scalar right-hand-side function on an arbitrary domain with arbitrary Dirichlet boundary conditions.
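For a constant right-hand side, the toy problem has a closed-form solution that is handy for checking a trained network. The sketch below assumes the equation is written as u''(x) = rhs (if the sign convention is -u''(x) = rhs, flip the sign of `rhs`). The function name `poisson1d_exact` and its defaults are illustrative and not taken from the repository.

```python
def poisson1d_exact(x, domain=(0.0, 1.0), bcs=(-1.0, 3.0), rhs=-10.0):
    """Closed-form solution of u''(x) = rhs with u(a) = ua and u(b) = ub.

    Assumes a constant right-hand side and the sign convention u'' = rhs.
    """
    a, b = domain
    ua, ub = bcs
    # Particular part: vanishes at both boundaries, second derivative is rhs.
    particular = 0.5 * rhs * (x - a) * (x - b)
    # Linear part: interpolates the two Dirichlet boundary values.
    linear = ua + (ub - ua) * (x - a) / (b - a)
    return particular + linear


# Values at the boundaries and at the midpoint of the default domain.
print(poisson1d_exact(0.0), poisson1d_exact(0.5), poisson1d_exact(1.0))
```

The defaults mirror the domain, boundary values and right-hand side used in the example commands further down.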
PINNs are a relatively new concept and a very efficient method for solving forward and inverse (partial) differential and integro-differential equations involving noisy, sparse and multi-fidelity data. The main feature of PINNs is that they incorporate all prior information on the underlying dynamics of the governing equation, experimental/measured data, initial/boundary conditions, etc., into the loss function, thereby recasting the original problem as an optimization problem. Mathematical details of PINNs are available in the seminal 2019 paper by Raissi et al.
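To make the "everything goes into the loss" idea concrete, here is a minimal sketch of a composite loss for the 1-D Poisson toy problem. It assumes the equation has the form u''(x) = rhs, and it uses central finite differences as a plain-Python stand-in for the automatic differentiation a real PINN would use. The name `pinn_loss` and the equal weighting of the two terms are assumptions, not the repository's implementation.

```python
def pinn_loss(u, xs, domain, bcs, rhs, h=1e-3):
    """Composite loss for a candidate solution u of u''(x) = rhs (assumed form)."""
    a, b = domain
    ua, ub = bcs
    # Physics term: mean squared PDE residual at the collocation points xs.
    # Central differences stand in for automatic differentiation here.
    residuals = [(u(x + h) - 2.0 * u(x) + u(x - h)) / h**2 - rhs for x in xs]
    loss_pde = sum(r * r for r in residuals) / len(residuals)
    # Boundary term: mean squared mismatch with the two Dirichlet values.
    loss_bc = ((u(a) - ua) ** 2 + (u(b) - ub) ** 2) / 2.0
    return loss_pde + loss_bc


# Collocation points strictly inside the domain [0, 1].
xs = [i / 10 for i in range(1, 10)]

# A candidate that satisfies both the equation and the boundary values.
good = lambda x: -5.0 * x * (x - 1.0) - 1.0 + 4.0 * x
# A candidate that ignores both.
bad = lambda x: 0.0

loss_good = pinn_loss(good, xs, (0.0, 1.0), (-1.0, 3.0), -10.0)
loss_bad = pinn_loss(bad, xs, (0.0, 1.0), (-1.0, 3.0), -10.0)
print(loss_good, loss_bad)
```

Training a PINN amounts to replacing the fixed candidate `u` with a neural network and minimizing this kind of loss over the network parameters.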
The key idea behind adaptive activation functions for PINNs was first introduced, in an early form, in the paper by Jagtap et al., where the authors introduce a scalable parameter in the activation function that can be optimized with any optimization algorithm. Mathematically, the adaptive scalable parameter affects the slope of the activation function, thus accelerating learning by altering the loss landscape of the neural network, especially during the initial training period. Since this approach uses only a single learnable parameter, such activation functions are known as globally adaptive activation functions (GAAFs).
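The single-parameter idea can be sketched in plain Python as tanh(n * a * x), where `a` is the trainable slope and `n` is a fixed scale factor. The names `adaptive_tanh`, `a` and `n` are illustrative; the `--adaptive_rate` and `--adaptive_rate_scaler` flags of the script presumably play similar roles, but that mapping is an assumption. The toy loop below only shows that `a` can be trained by ordinary gradient descent like any other parameter.

```python
import math


def adaptive_tanh(x, a, n=10.0):
    """tanh with a trainable slope a and a fixed scale factor n."""
    return math.tanh(n * a * x)


def adaptive_tanh_grad_a(x, a, n=10.0):
    """Derivative of adaptive_tanh with respect to the slope a."""
    return n * x * (1.0 - math.tanh(n * a * x) ** 2)


# Toy problem: recover the slope of the target tanh(2 x) by gradient descent.
xs = [i / 10 for i in range(-10, 11)]
targets = [math.tanh(2.0 * x) for x in xs]

a, lr = 0.1, 0.01  # n * a = 1 initially, i.e. a plain tanh
for _ in range(500):
    # Gradient of the mean squared error with respect to a.
    grad = sum(
        2.0 * (adaptive_tanh(x, a) - t) * adaptive_tanh_grad_a(x, a)
        for x, t in zip(xs, targets)
    ) / len(xs)
    a -= lr * grad

print(a)  # close to 0.2, where n * a matches the target slope of 2
```

Starting from n * a = 1 means training begins from the standard activation and only departs from it if the loss benefits.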
The same authors introduce the idea of multiple scalable parameters applied layer-wise or even neuron-wise. Such locally defined activation slopes can further improve the performance of the network, but at the same time the parameter space grows larger.
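The difference between the layer-wise and neuron-wise variants can be shown with a plain-Python forward pass of one hidden layer. This is a sketch under assumptions, not the repository's `AdaptiveLinear` class: the function name `neuronwise_adaptive_layer`, the tanh activation and the scale factor `n` are chosen for illustration only.

```python
import math


def neuronwise_adaptive_layer(x, weights, biases, slopes, n=10.0):
    """Forward pass of one hidden layer with a separate slope per neuron.

    x: list of inputs; weights: one row of input weights per neuron;
    biases and slopes: one value per neuron; n: fixed scale factor.
    """
    outputs = []
    for w_row, b, a in zip(weights, biases, slopes):
        pre_activation = sum(w * xi for w, xi in zip(w_row, x)) + b
        # Each neuron scales its own pre-activation by n * a before tanh.
        outputs.append(math.tanh(n * a * pre_activation))
    return outputs


x = [0.5]
weights = [[1.0], [-2.0], [0.3]]
biases = [0.0, 0.1, -0.2]

# Neuron-wise: three neurons, three trainable slopes.
neuronwise = neuronwise_adaptive_layer(x, weights, biases, slopes=[0.1, 0.2, 0.05])
# Layer-wise is the special case where all neurons share one slope.
layerwise = neuronwise_adaptive_layer(x, weights, biases, slopes=[0.1] * 3)
print(neuronwise, layerwise)
```

In this sketch the neuron-wise variant adds one extra trainable value per hidden neuron, while the layer-wise variant adds one per layer.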
Even though the parameter space is considerably larger compared to the fixed activation function (additional
To run this code you need the following:
- clone the repository to your local machine:
  $ git clone https://github.com/antelk/locally-adaptive-activation-functions.git
  and enter the local repository:
  $ cd locally-adaptive-activation-functions
- install Python3 (Python3.8.5 is used for the development) and the other requirements given in the environment.yml file, preferably in a separate environment to avoid dependency-related issues:
  $ conda env create -f environment.yml
- instead, if you have a CUDA GPU, install PyTorch with cudatoolkit support manually. E.g., using the conda installer, run the following command for the latest supported cudatoolkit using Python3:
  $ conda install pytorch cudatoolkit=11.0 -c pytorch
  Python3.9 users will need to add the -c=conda-forge flag; for details go to the official installation webpage.
The code is tested on both Windows 10 and GNU/Linux operating systems. Unfortunately, there are no unit tests that support the previous claim :D
poisson1d.py is the main (and the only) script to run the experiments. utils.py holds the AdaptiveLinear class that implements the neural network layer with LAAFs.
Arguments that can be passed to the script are listed as follows:
$ python poisson1d.py --help
usage: poisson1d.py [-h] [--cuda] [--domain DOMAIN DOMAIN]
[--boundary_conditions BOUNDARY_CONDITIONS BOUNDARY_CONDITIONS]
[--rhs RHS] [--n_layers N_LAYERS] [--n_units N_UNITS]
[--activation ACTIVATION] [--optimizer {bfgs,sgd,adam}]
[--n_epochs N_EPOCHS] [--batch_size BATCH_SIZE] [--linspace]
[--learning_rate LEARNING_RATE]
[--dropout_rate DROPOUT_RATE] [--apply_mcdropout]
[--adaptive_rate ADAPTIVE_RATE]
[--adaptive_rate_scaler ADAPTIVE_RATE_SCALER]
[--save_fig SAVE_FIG]
optional arguments:
-h, --help show this help message and exit
--cuda Use CUDA GPU for training if available
--domain DOMAIN DOMAIN
Boundaries of the solution domain
--boundary_conditions BOUNDARY_CONDITIONS BOUNDARY_CONDITIONS
Boundary conditions on boundaries of the domain
--rhs RHS Right-hand-side forcing function
--n_layers N_LAYERS The number of hidden layers of the neural network
--n_units N_UNITS The number of neurons per hidden layer
--activation ACTIVATION
activation function
--optimizer {bfgs,sgd,adam}
Optimization procedure
--n_epochs N_EPOCHS The number of training epochs
--batch_size BATCH_SIZE
The number of data points for optimization per epoch
--linspace Space the batch of data linearly, otherwise random
--learning_rate LEARNING_RATE
Learning rate applied for gradient based optimizers
--dropout_rate DROPOUT_RATE
Dropout regularization rate
--apply_mcdropout Apply MCdropout for uncertainty quantification
--adaptive_rate ADAPTIVE_RATE
Add additional adaptive rate parameter to activation
function
--adaptive_rate_scaler ADAPTIVE_RATE_SCALER
Apply constant scaler to the adaptive rate
--save_fig SAVE_FIG Save figure with specified name
PINN with an input layer,
$ python poisson1d.py --cuda --domain 0 1 --boundary_conditions -1 3 --rhs -7 --n_layers 3 --n_units 100 --activation tanh --optimizer sgd --n_epochs 1000 --batch_size 32 --linspace --learning_rate 1e-3 --save_fig experiment_1
Unlike the previous experiment, here the batch contains
$ python poisson1d.py --cuda --domain 0 1 --boundary_conditions -1 3 --rhs -10 --n_layers 3 --n_units 50 --activation tanh --optimizer adam --n_epochs 1000 --batch_size 101 --learning_rate 1e-3 --save_fig experiment_2
Repeating the previous experiment, this time applying adaptive activation functions and adding the slope-recovery term to the loss function.
$ python poisson1d.py --cuda --domain 0 1 --boundary_conditions -1 3 --rhs -10 --n_layers 3 --n_units 50 --activation tanh --optimizer adam --n_epochs 1000 --batch_size 101 --learning_rate 1e-3 --adaptive_rate 0.1 --adaptive_rate_scaler 10 --save_fig experiment_3
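The slope-recovery term mentioned for this experiment can be sketched as follows, assuming the neuron-wise form described in the paper: the reciprocal of the average, over hidden layers, of the exponential of each layer's mean slope. The function name `slope_recovery` and the nested-list input are illustrative, and the repository's own implementation may differ in detail.

```python
import math


def slope_recovery(slopes_per_layer):
    """Slope-recovery term for neuron-wise slopes (assumed form from the paper).

    slopes_per_layer: one list of per-neuron slopes for each hidden layer.
    """
    # Exponential of the mean slope in each hidden layer.
    layer_terms = [math.exp(sum(a) / len(a)) for a in slopes_per_layer]
    # Reciprocal of the average over layers: larger slopes give a smaller term.
    return 1.0 / (sum(layer_terms) / len(layer_terms))


# Three hidden layers of four neurons each, before and after slopes have grown.
initial = slope_recovery([[0.1] * 4] * 3)
grown = slope_recovery([[0.3] * 4] * 3)
print(initial, grown)
```

Because the term shrinks as the slopes grow, adding it to the loss rewards the optimizer for increasing the activation slopes early in training.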
Uncertainty quantification using the Monte Carlo dropout procedure.
For details on the method, check the seminal paper by Gal and Ghahramani.
PINN with
$ python poisson1d.py --cuda --domain 0 1 --boundary_conditions -1 3 --rhs -10 --n_layers 3 --n_units 100 --activation tanh --optimizer bfgs --n_epochs 50 --batch_size 32 --linspace --dropout_rate 0.01 --apply_mcdropout --save_fig experiment_4
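The Monte Carlo dropout procedure behind `--apply_mcdropout` can be illustrated with a tiny plain-Python model: dropout stays active at prediction time, the same input is passed through the network many times, and the spread of the outputs is read as an uncertainty estimate. Everything in the sketch (function names, the one-hidden-layer model, its weights, the number of samples) is hypothetical and not taken from the repository.

```python
import math
import random
import statistics


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability rate, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def stochastic_forward(x, hidden_w, hidden_b, out_w, rate, rng):
    """One forward pass of a one-hidden-layer tanh network with dropout active."""
    hidden = [math.tanh(w * x + b) for w, b in zip(hidden_w, hidden_b)]
    hidden = dropout(hidden, rate, rng)
    return sum(w * h for w, h in zip(out_w, hidden))


def mc_dropout_predict(x, params, rate=0.01, n_samples=200, seed=0):
    """Mean and standard deviation over repeated stochastic forward passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, *params, rate, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical fixed weights: (hidden weights, hidden biases, output weights).
params = ([1.0, -0.5, 2.0, 0.7], [0.1, 0.2, -0.3, 0.4], [0.5, -1.0, 0.25, 0.8])
mean, std = mc_dropout_predict(0.5, params, rate=0.5)
print(mean, std)
```

With a dropout rate of zero every pass is identical and the standard deviation collapses to zero, which is a quick sanity check for this kind of estimator.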