
CycleGAN and Pix2Pix on Linux: Setup, Training, and Testing Guide

This guide walks through setting up, training, and testing CycleGAN and Pix2Pix models on Linux, including in high-performance computing (HPC) environments.

Table of Contents

  1. Prerequisites
  2. Environment Setup
  3. Data Preparation
  4. Model Training
  5. Model Testing
  6. Using Batch Job Scripts
  7. Output and Monitoring
  8. GPU Monitoring
  9. Troubleshooting
  10. Additional Resources

Prerequisites

  • Linux operating system (Ubuntu 18.04 or later recommended)
  • NVIDIA GPU with CUDA support
  • Internet connection
  • Sudo privileges

Environment Setup

  1. Install CUDA and cuDNN:
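
     A minimal sketch, assuming Ubuntu and NVIDIA's apt repository (the exact version numbers are placeholders; align them with your GPU driver and with the PyTorch build installed in step 4). The conda PyTorch package in step 4 bundles its own cuDNN, so a separate system-wide cuDNN install is often unnecessary:

    # Example only: CUDA 11.8 toolkit from NVIDIA's apt repo (Ubuntu 20.04 shown)
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
    sudo dpkg -i cuda-keyring_1.0-1_all.deb
    sudo apt-get update
    sudo apt-get install -y cuda-toolkit-11-8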

  2. Install Anaconda:

    wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
    bash Anaconda3-2023.09-0-Linux-x86_64.sh
    source ~/.bashrc
  3. Create and activate a virtual environment:

    conda create -n pix2pix python=3.8
    conda activate pix2pix
  4. Install PyTorch:

    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
  5. Clone the repository:

    git clone https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.git
    cd pytorch-CycleGAN-and-pix2pix
  6. Install dependencies:

    pip install -r requirements.txt
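
Before moving on, it is worth confirming that PyTorch can actually see the GPU (run inside the activated environment):

python -c "import torch; print(torch.__version__, '| CUDA available:', torch.cuda.is_available())"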

Data Preparation

  1. Organize your dataset in datasets/your_dataset_name:

    • For Pix2Pix: use train/, val/, and test/ subfolders, each holding aligned A|B image pairs
    • For CycleGAN: use trainA/, trainB/, testA/, and testB/ subfolders of unpaired images (a layout sketch follows this list)
  2. For image generation, refer to the separate image generation guide.
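
Either layout can be created in one step (a sketch; the dataset name is a placeholder):

# Pix2Pix: aligned A|B image pairs, one folder per split
mkdir -p datasets/your_dataset_name/{train,val,test}

# CycleGAN: unpaired images, one folder per domain and split
mkdir -p datasets/your_dataset_name/{trainA,trainB,testA,testB}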

Model Training

Navigate to the repository folder:

cd path/to/pytorch-CycleGAN-and-pix2pix

Pix2Pix Training

python train.py --dataroot ./datasets/your_dataset \
                --name your_experiment_name \
                --model pix2pix \
                --direction AtoB \
                --save_epoch_freq 1 \
                --n_epochs 500 \
                --batch_size 150

CycleGAN Training

python train.py --dataroot ./datasets/your_dataset \
                --name your_experiment_name \
                --model cycle_gan \
                --direction AtoB \
                --n_epochs 10 \
                --batch_size 1

Model Testing

python test.py --dataroot ./datasets/your_test_dataset \
               --name your_experiment_name \
               --model [pix2pix/cycle_gan] \
               --num_test 1000

Using Batch Job Scripts

For HPC environments using SLURM, we provide example scripts:

Training Script Example (p2p-train-ma-boston-v100.sh)

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=8:00:00
#SBATCH --job-name=train-ma-b-p2p-v100
#SBATCH --partition=gpu
#SBATCH --gres=gpu:v100-sxm2:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32GB
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your.email@example.com

module load anaconda3/2022.05 cuda/11.8
source activate /path/to/your/conda/env

python /work/re-blocking/pytorch-CycleGAN-and-pix2pix/train.py \
    --dataroot /work/re-blocking/data/ma-boston \
    --checkpoints_dir /work/re-blocking/checkpoints \
    --name ma-boston-p2p-200-150-v100 \
    --model pix2pix \
    --direction AtoB \
    --save_epoch_freq 1 \
    --continue_train \
    --epoch_count 491 \
    --n_epochs 500 \
    --batch_size 150

Testing Script Example (p2p-test-ma-boston-v100-brooklyn.sh)

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=0:15:00
#SBATCH --job-name=test-ma-b-b-v100
#SBATCH --partition=gpu
#SBATCH --gres=gpu:v100-sxm2:1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4GB
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your.email@example.com

module load anaconda3/2022.05 cuda/11.8
source activate /path/to/your/conda/env

python /work/re-blocking/pytorch-CycleGAN-and-pix2pix/test.py \
    --dataroot /work/re-blocking/data/ny-brooklyn \
    --checkpoints_dir /work/re-blocking/checkpoints \
    --results_dir /work/re-blocking/results \
    --name ma-boston-p2p-200-150-v100 \
    --model pix2pix \
    --num_test 1000

To use these scripts:

  1. Save them in your project directory and edit the placeholder paths and email address to match your setup
  2. Submit the job: sbatch script_name.sh (making the script executable with chmod +x is optional; sbatch does not require it)
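
After submission, the job can be followed with standard SLURM commands, for example:

squeue -u $USER      # list your pending and running jobs
sacct -j <jobid>     # accounting details for a completed job
scancel <jobid>      # cancel a job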

Output and Monitoring

  1. View training progress: ./checkpoints/your_experiment_name/web/index.html
  2. Monitor training logs: Use logs-visualised.ipynb (Work in Progress)
  3. Test results are saved under the results directory (by default ./results/your_experiment_name/test_latest/, or wherever --results_dir points)
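
Training losses are also appended to a plain-text log that can be followed live (the path assumes the default --checkpoints_dir):

tail -f ./checkpoints/your_experiment_name/loss_log.txt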

GPU Monitoring

Monitor NVIDIA GPU usage:

watch -n 0.1 nvidia-smi
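
nvidia-smi also has a built-in loop mode, which avoids re-launching the process on every refresh:

nvidia-smi -l 1   # refresh once per second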

Troubleshooting

  • CUDA errors: ensure the installed CUDA and PyTorch versions are compatible (a quick check follows this list)
  • Memory issues: reduce --batch_size, or train on smaller images via --load_size and --crop_size
  • For other issues, consult the official PyTorch documentation and the project repository
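
For the CUDA compatibility check above, comparing the CUDA version PyTorch was built against with what the driver reports is usually enough (run inside the activated environment):

python -c "import torch; print('torch', torch.__version__, 'built for CUDA', torch.version.cuda)"
nvidia-smi --query-gpu=driver_version --format=csv,noheader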

Additional Resources

For detailed parameter explanations, refer to the options directory in the project repository.