Yuan Shen1,
Wei-Chiu Ma2,
Shenlong Wang1
University of Illinois at Urbana-Champaign1, Massachusetts Institute of Technology2
Paper link │ Project Page │ Colab Quickstart
SGAM in 2024 with latest pre-trained 2D priors:
The GIF animation below was generated by SGAM in 2022 with only the first RGB-D frame known:
We present a new 3D scene generation framework that simultaneously generates sensor data at novel viewpoints and builds a 3D map. Our framework is illustrated in the diagram below.
Try our Colab notebook to play with our trained models on CLEVR-Infinite and GoogleEarth-Infinite!
Manual installation (only tested on Ubuntu 18.04):
- Create a Conda environment and install some of the Python packages
conda create -n sgam python=3.9.13
conda activate sgam
pip install -r requirement.txt
- Install PyTorch and PyTorch Lightning
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113
pip install pytorch_lightning==1.5.10
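After installing, it can help to confirm that the environment actually holds the pinned versions. The following is a minimal sketch, not part of the SGAM codebase: the helper name `check_versions` is hypothetical, and it only compares the version strings from the install commands above against what is installed.

```python
from importlib import metadata

# Versions pinned in the install commands above.
EXPECTED = {
    "torch": "1.12.1+cu113",
    "torchvision": "0.13.1+cu113",
    "pytorch-lightning": "1.5.10",
}


def check_versions(expected=EXPECTED, lookup=metadata.version):
    """Return (package, expected, found) for every package that does not match.

    `found` is None when the package is not installed.
    """
    mismatches = []
    for name, want in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != want:
            mismatches.append((name, want, found))
    return mismatches


# Report any mismatch in the current environment.
for name, want, found in check_versions():
    print(f"{name}: expected {want}, found {found or 'not installed'}")
```

An empty result means all three packages match the pinned versions.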
- Note that the depth is nonlinear when rendered from Blender. Check out how we convert the depth from nonlinear to linear in line 103 of data/clevr-infinite.py
- To get a quick glance at our dataset, here is one tiny scene example.
- Data for two validation scenes can be downloaded from this link.
- Our training, validation, and test sets are available for download at this link.
- To generate more training data at scale, we provide a Blender script in the clevr_generation directory. We randomly distribute primitive 3D objects by simulating objects falling and colliding.
Detailed steps are as follows:
- Find a machine with a GPU and install Blender 2.92
sudo snap install blender --channel=2.92/stable --classic
- (optional) If you want to visualize one CLEVR-Infinite scene, run the following command.
/snap/bin/blender random_scene.blend
- Specify the output directory in line 253
- Run the following command to render. You can change the iteration number to set the number of random scenes.
bash blender_generation.sh
- Run the postprocessing script to get the RGB images, depth maps, and transform.json
python convert_exr.py
- The train, validation, and test sets are available at this Google Drive link.
We provide our trained model on GoogleEarth-Infinite and CLEVR-Infinite. Please download and organize the pre-trained checkpoints as follows:
SGAM
└───trained_models
    └───google_earth
    │   │   config.yaml
    │   │   XXX.ckpt
    │
    └───clevr-infinite
        │   config.yaml
        │   XXX.ckpt
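To check that the downloaded checkpoints follow the layout above, a small script along these lines may help. It is a sketch under stated assumptions rather than part of the SGAM codebase: the function name `missing_checkpoint_files` is hypothetical, and it assumes each dataset folder holds one `config.yaml` and at least one file ending in `.ckpt`, as the tree suggests.

```python
from pathlib import Path

# Dataset folder names taken from the directory tree above.
DATASETS = ("google_earth", "clevr-infinite")


def missing_checkpoint_files(root="trained_models", datasets=DATASETS):
    """Return human-readable problems with the checkpoint layout under `root`."""
    problems = []
    for name in datasets:
        folder = Path(root) / name
        config = folder / "config.yaml"
        if not config.is_file():
            problems.append(f"{config} not found")
        # Any *.ckpt file counts; the actual checkpoint name is not assumed.
        if not folder.is_dir() or not any(folder.glob("*.ckpt")):
            problems.append(f"no .ckpt file in {folder}")
    return problems


# Run from the SGAM repository root.
for problem in missing_checkpoint_files():
    print(problem)
```

No output means both dataset folders contain a config file and a checkpoint.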
- VQGAN codebook training.
python train_generative_sensing_model.py --base configs/codebooks/XXX.yaml --gpus 0, -t True
- Conditional Generation
python train_generative_sensing_model.py --base configs/conditional_generation/XXX.yaml --gpus 0, -t True
python main_scene_generation.py --dataset="clevr-infinite" --use_rgbd_integration True
python main_scene_generation.py --dataset="google_earth" --use_rgbd_integration True
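To run scene generation for both datasets in one go, the two commands above can be wrapped in a short driver. This is only a convenience sketch: `generation_command` and `run_all` are hypothetical helpers, and the script name and flags are copied from the commands above without assuming any others.

```python
import subprocess
import sys

# Dataset names as used in the commands above.
DATASETS = ("clevr-infinite", "google_earth")


def generation_command(dataset):
    """Build the argument list for one scene-generation run."""
    return [
        sys.executable,
        "main_scene_generation.py",
        f"--dataset={dataset}",
        "--use_rgbd_integration",
        "True",
    ]


def run_all(datasets=DATASETS, runner=subprocess.run):
    """Run generation for each dataset in turn, stopping on the first failure."""
    for dataset in datasets:
        runner(generation_command(dataset), check=True)
```

Calling `run_all()` from the repository root runs the two datasets sequentially; `check=True` makes a failed run raise instead of silently continuing.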
We thank Vlas Zyrianov for his feedback on our paper drafts. Our codebase is built on top of the VQGAN codebase. Many thanks to Patrick Esser and Robin Rombach, who made their code available.
If you find our work useful, please cite it using the BibTeX entry below:
@inproceedings{
shen2022sgam,
title={{SGAM}: Building a Virtual 3D World through Simultaneous Generation and Mapping},
author={Yuan Shen and Wei-Chiu Ma and Shenlong Wang},
booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
year={2022},
url={https://openreview.net/forum?id=17KCLTbRymw}
}