Topics: TensorFlow 2.0, TF Hub, Cloud TPU
TPU Type: v2-8, TensorFlow Version: 1.14
Machine Type: n1-standard-2, OS: Debian 9, TensorFlow Version: tf-nightly preinstalled; TensorFlow 2.0 Beta installed manually
- Open Google Cloud Shell
ctpu up -tf-version 1.14
- If the Cloud Storage bucket is not set up automatically, create one with the same name as the TPU and the VM
- Enable HTTP traffic for the VM instance
- SSH into the VM
pip3 uninstall -y tf-nightly
pip3 install -r requirements.txt
export CTPU_NAME=<common name of the tpu, vm and bucket>
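With CTPU_NAME exported, the bucket from the setup steps can be checked and, if missing, created from the command line. The following is a minimal sketch, assuming `gsutil` is installed and authenticated; the `my-tpu` default is a placeholder used only for illustration.

```shell
#!/bin/sh
# Sketch: make sure a bucket named after the TPU/VM exists.
# "my-tpu" is a placeholder default, not a name from this guide.
CTPU_NAME="${CTPU_NAME:-my-tpu}"
BUCKET="gs://${CTPU_NAME}"

if ! command -v gsutil >/dev/null 2>&1; then
  # No gsutil on this machine: only show what would be run.
  echo "gsutil not found; would run: gsutil mb ${BUCKET}"
elif gsutil ls -b "$BUCKET" >/dev/null 2>&1; then
  # `gsutil ls -b` succeeds only when the bucket exists and is accessible.
  echo "bucket ${BUCKET} already exists"
else
  # `gsutil mb` creates the bucket.
  gsutil mb "$BUCKET" || echo "could not create ${BUCKET}" >&2
fi
```

Bucket names are global, so creation can fail if the name is already taken by another project.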
$ sudo -i
$ pip3 uninstall -y tf-nightly
$ pip3 install tensorflow==2.0.0-beta0
$ exit
$ sudo tensorboard --logdir gs://$CTPU_NAME/logdir --port 80 &>/dev/null &
To view TensorBoard, browse to the public IP of the VM instance.
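If the public IP is not at hand, it can usually be looked up with `gcloud`. This is a hedged sketch: it assumes the VM is named after CTPU_NAME and that a default zone is configured for `gcloud`; `VM_IP` and `TB_URL` are illustrative variable names, and `<external-ip>` is printed when the lookup is not possible.

```shell
#!/bin/sh
# Sketch: print the URL where TensorBoard (served on port 80) should be reachable.
CTPU_NAME="${CTPU_NAME:-my-tpu}"   # placeholder default, for illustration only

VM_IP=""
if command -v gcloud >/dev/null 2>&1; then
  # Read the external (NAT) IP of the first network interface.
  VM_IP="$(gcloud compute instances describe "$CTPU_NAME" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)' 2>/dev/null)"
fi

# Fall back to a visible placeholder when the lookup did not return anything.
VM_IP="${VM_IP:-<external-ip>}"
TB_URL="http://${VM_IP}/"
echo "$TB_URL"
```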
$ python3 --tpu $CTPU_NAME --use_tpu \
--modeldir gs://$CTPU_NAME/modeldir \
--datadir gs://$CTPU_NAME/datadir \
--logdir gs://$CTPU_NAME/logdir \
--num_steps 2000 \
--dataset horses_or_humans
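All three directory flags in the command above hang off the single shared name. As a small sketch of that layout (the variable names `BUCKET`, `MODEL_DIR`, `DATA_DIR` and `LOG_DIR` are illustrative, and `my-tpu` is a placeholder default), the values can be derived once and printed for inspection before launching a run:

```shell
#!/bin/sh
# Sketch: derive the bucket paths passed to the training flags.
CTPU_NAME="${CTPU_NAME:-my-tpu}"   # placeholder default, for illustration only
BUCKET="gs://${CTPU_NAME}"

MODEL_DIR="${BUCKET}/modeldir"   # value for --modeldir
DATA_DIR="${BUCKET}/datadir"     # value for --datadir
LOG_DIR="${BUCKET}/logdir"       # value for --logdir

# Print the flags one per line so they can be checked before use.
printf '%s\n' \
  "--tpu ${CTPU_NAME}" \
  "--modeldir ${MODEL_DIR}" \
  "--datadir ${DATA_DIR}" \
  "--logdir ${LOG_DIR}"
```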
Training saves a single checkpoint at the end of the run. This checkpoint can be loaded later to export a SavedModel.
$ python3 --tpu $CTPU_NAME --use_tpu \
--modeldir gs://$CTPU_NAME/modeldir \
--datadir gs://$CTPU_NAME/datadir \
--logdir gs://$CTPU_NAME/logdir \
--dataset horses_or_humans \
--export_only \
--export_path modeldir/model
The trained model is saved to gs://$CTPU_NAME/modeldir/model by default if no path is given with --export_path.
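One way to confirm the export is to list the default location. The sketch below assumes `gsutil` is available and that the default path applies; `EXPORT_DIR` is an illustrative variable name. A TensorFlow SavedModel directory normally contains a `saved_model.pb` file and a `variables/` folder.

```shell
#!/bin/sh
# Sketch: list the default export location to check that a SavedModel was written.
CTPU_NAME="${CTPU_NAME:-my-tpu}"   # placeholder default, for illustration only
EXPORT_DIR="gs://${CTPU_NAME}/modeldir/model"

if command -v gsutil >/dev/null 2>&1; then
  # Expect saved_model.pb and variables/ in the listing.
  gsutil ls "${EXPORT_DIR}/" || echo "nothing found at ${EXPORT_DIR}" >&2
else
  # No gsutil on this machine: only show what would be run.
  echo "gsutil not found; would run: gsutil ls ${EXPORT_DIR}/"
fi
```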