CycleGAN and Pix2Pix on Linux: Setup, Training, and Testing Guide

This guide covers setting up, training, and testing CycleGAN and Pix2Pix models on Linux, including on high-performance computing (HPC) clusters.

Table of Contents

  1. Prerequisites
  2. Environment Setup
  3. Data Preparation
  4. Model Training
  5. Model Testing
  6. Using Batch Job Scripts
  7. Output and Monitoring
  8. GPU Monitoring
  9. Troubleshooting
  10. Additional Resources


Prerequisites

  • Linux operating system (Ubuntu 18.04 or later recommended)
  • NVIDIA GPU with CUDA support
  • Internet connection
  • Sudo privileges

Environment Setup

  1. Install CUDA and cuDNN:

  2. Install Anaconda, then reload your shell configuration so that conda is on your PATH:

    source ~/.bashrc
  3. Create and activate a virtual environment:

    conda create -n pix2pix python=3.8
    conda activate pix2pix
  4. Install PyTorch:

    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
  5. Clone the repository:

    git clone
    cd pytorch-CycleGAN-and-pix2pix
  6. Install dependencies:

    pip install -r requirements.txt
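
Once the steps above are done, a quick sanity check can confirm that the environment sees PyTorch and a CUDA device. The following is a minimal sketch and not part of the repository; the function name `describe_torch_env` is hypothetical, and it reports a missing PyTorch install rather than failing.

```python
import importlib.util


def describe_torch_env():
    """Return a small report on the PyTorch/CUDA setup of the active environment."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "cuda_build": None,
        "gpu_count": 0,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not importable in this environment

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["gpu_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False while `torch_installed` is True, the installed build and the driver or CUDA toolkit are the first things to compare.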

Data Preparation

  1. Organize your dataset in datasets/your_dataset_name:

    • For Pix2Pix: use train, val, and test folders
    • For CycleGAN: use trainA, trainB, testA, and testB folders (names are case-sensitive on Linux)
  2. For image generation, refer to the separate image generation guide.
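
Before starting a long training job, it can help to verify the folder layout. The sketch below assumes the lowercase folder names used by the upstream data loaders (`train`, `val`, `test` for Pix2Pix; `trainA`, `trainB`, `testA`, `testB` for CycleGAN); `EXPECTED_DIRS` and `missing_dirs` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Assumed folder names per model type (case-sensitive on Linux).
EXPECTED_DIRS = {
    "pix2pix": ["train", "val", "test"],
    "cycle_gan": ["trainA", "trainB", "testA", "testB"],
}


def missing_dirs(dataroot, model):
    """Return the expected subfolders of `dataroot` that are absent or empty."""
    root = Path(dataroot)
    problems = []
    for name in EXPECTED_DIRS[model]:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems
```

For example, `missing_dirs("./datasets/your_dataset_name", "cycle_gan")` returns an empty list when all four folders exist and contain at least one entry.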

Model Training

Navigate to the repository folder:

cd path/to/pytorch-CycleGAN-and-pix2pix

Pix2Pix Training

python train.py --dataroot ./datasets/your_dataset \
                --name your_experiment_name \
                --model pix2pix \
                --direction AtoB \
                --save_epoch_freq 1 \
                --n_epochs 500 \
                --batch_size 150

CycleGAN Training

python train.py --dataroot ./datasets/your_dataset \
                --name your_experiment_name \
                --model cycle_gan \
                --direction AtoB \
                --n_epochs 10 \
                --batch_size 1

Model Testing

python test.py --dataroot ./datasets/your_test_dataset \
               --name your_experiment_name \
               --model [pix2pix/cycle_gan] \
               --num_test 1000
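
To find the output of a test run, the following sketch builds the expected location of the HTML summary. It assumes the upstream layout `<results_dir>/<name>/<phase>_<epoch>/index.html` with the defaults `./results`, `test`, and `latest`; the helper `expected_results_page` is hypothetical, and a fork may use a different layout.

```python
from pathlib import Path


def expected_results_page(name, results_dir="./results", phase="test", epoch="latest"):
    """Build the path where the test script is assumed to write its HTML summary."""
    # Assumed layout: <results_dir>/<name>/<phase>_<epoch>/index.html
    return Path(results_dir) / name / f"{phase}_{epoch}" / "index.html"


print(expected_results_page("your_experiment_name"))
```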

Using Batch Job Scripts

For HPC environments using SLURM, we provide example scripts:

Training Script Example

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=8:00:00
#SBATCH --job-name=train-ma-b-p2p-v100
#SBATCH --partition=gpu
#SBATCH --gres=gpu:v100-sxm2:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32GB
#SBATCH --mail-type=ALL

module load anaconda3/2022.05 cuda/11.8
source activate /path/to/your/conda/env

python /work/re-blocking/pytorch-CycleGAN-and-pix2pix/train.py \
    --dataroot /work/re-blocking/data/ma-boston \
    --checkpoints_dir /work/re-blocking/checkpoints \
    --name ma-boston-p2p-200-150-v100 \
    --model pix2pix \
    --direction AtoB \
    --save_epoch_freq 1 \
    --continue_train \
    --epoch_count 491 \
    --n_epochs 500 \
    --batch_size 150
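
The `--continue_train`, `--epoch_count`, and `--n_epochs` values in this script determine how much work a resumed job still has to do. As a rough sketch, assuming the training loop runs from `epoch_count` through `n_epochs + n_epochs_decay` inclusive, as in the upstream training script (where `--n_epochs_decay` defaults to 100; that option does not appear in the script above), the remaining epochs can be estimated as follows. The function name `remaining_epochs` is hypothetical.

```python
def remaining_epochs(epoch_count, n_epochs, n_epochs_decay=100):
    """List the epochs a resumed run would still execute (inclusive range)."""
    last_epoch = n_epochs + n_epochs_decay
    return list(range(epoch_count, last_epoch + 1))


# Values taken from the example training script; n_epochs_decay is an assumed default.
epochs = remaining_epochs(epoch_count=491, n_epochs=500)
print(f"{len(epochs)} epochs left: {epochs[0]}..{epochs[-1]}")
```

Under these assumptions the example job would continue past epoch 500 into the learning-rate decay phase, which is worth checking against the requested `--time` limit.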

Testing Script Example

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=0:15:00
#SBATCH --job-name=test-ma-b-b-v100
#SBATCH --partition=gpu
#SBATCH --gres=gpu:v100-sxm2:1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4GB
#SBATCH --mail-type=ALL

module load anaconda3/2022.05 cuda/11.8
source activate /path/to/your/conda/env

python /work/re-blocking/pytorch-CycleGAN-and-pix2pix/test.py \
    --dataroot /work/re-blocking/data/ny-brooklyn \
    --checkpoints_dir /work/re-blocking/checkpoints \
    --results_dir /work/re-blocking/results \
    --name ma-boston-p2p-200-150-v100 \
    --model pix2pix \
    --num_test 1000

To use these scripts:

  1. Save them in your project directory
  2. Make them executable: chmod +x your_script.sh
  3. Submit the job: sbatch your_script.sh
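
When jobs are submitted from a Python driver rather than by hand, the job ID can be captured from the `Submitted batch job <id>` line that sbatch prints on success. This is a minimal sketch; `parse_job_id` and `submit` are hypothetical helper names, and the script path passed to `submit` is a placeholder.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's 'Submitted batch job <id>' line."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return int(match.group(1)) if match else None


def submit(script_path):
    """Submit a batch script with sbatch and return its job ID (None if not found)."""
    result = subprocess.run(
        ["sbatch", script_path], capture_output=True, text=True, check=True
    )
    return parse_job_id(result.stdout)
```

The returned ID can then be passed to `squeue -j <id>` to check whether the job is pending or running.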

Output and Monitoring

  1. View training progress: open ./checkpoints/your_experiment_name/web/index.html in a browser
  2. Monitor training logs: Use logs-visualised.ipynb (Work in Progress)
  3. Test results are saved in the results directory (./results by default, or the path passed to --results_dir)
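
While the notebook is still in progress, the training log can be read with a few lines of Python. The sketch below assumes the training script writes `loss_log.txt` into the experiment's checkpoint folder, with lines such as `(epoch: 1, iters: 100, time: 0.123, data: 0.456) G_GAN: 1.234 G_L1: 5.678`, as the upstream project does; the format may differ in a fork, and the function names are hypothetical.

```python
import re

HEADER = re.compile(r"\(epoch: (\d+), iters: (\d+), time: ([\d.]+), data: ([\d.]+)\)")
LOSS = re.compile(r"(\w+): (-?\d+\.\d+)")


def parse_loss_line(line):
    """Parse one log line into a dict, or return None for non-matching lines."""
    header = HEADER.search(line)
    if header is None:
        return None  # banner lines and blank lines are skipped
    record = {"epoch": int(header.group(1)), "iters": int(header.group(2))}
    # Loss terms follow the closing parenthesis as "name: value" pairs.
    for name, value in LOSS.findall(line[header.end():]):
        record[name] = float(value)
    return record


def read_losses(path):
    """Return the parsed records of all matching lines in a loss log file."""
    with open(path, encoding="utf-8") as handle:
        return [r for r in map(parse_loss_line, handle) if r is not None]
```

The resulting list of dictionaries can be passed to a plotting library or a DataFrame constructor to chart each loss term against epoch.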

GPU Monitoring

Monitor NVIDIA GPU usage:

watch -n 0.1 nvidia-smi
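
For logging GPU usage alongside a training run, the same information can be collected programmatically through nvidia-smi's CSV query mode. This is a sketch under the assumption that all queried fields are numeric on the target GPU (some devices report `[N/A]` for certain fields, which this code does not handle); `parse_gpu_csv` and `query_gpus` are hypothetical names.

```python
import subprocess

QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV output (noheader, nounits) into one dict per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        util, used, total = (int(field.strip()) for field in line.split(","))
        gpus.append({"util_percent": util, "mem_used_mib": used, "mem_total_mib": total})
    return gpus


def query_gpus():
    """Run nvidia-smi once and return the parsed per-GPU statistics."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)
```

Calling `query_gpus()` at a fixed interval and appending the result to a file gives a lightweight utilization trace for a job.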


Troubleshooting

  • CUDA errors: Ensure CUDA and PyTorch versions are compatible
  • Memory issues: Reduce the batch size (--batch_size) or the image size (--load_size and --crop_size)
  • For other issues, consult the official PyTorch and project documentation

Additional Resources

For detailed parameter explanations, refer to the options directory in the project repository.