ThereminQ orchestrates a suite of best-in-class tools designed to control, extend and visualize data flowing to and from quantum circuits, using Qrack, Tipsy and Jupyter on CUDA and OpenCL accelerators.
- Qrack - Qbit OpenCL Hardware Emulation Stack
- Bonsai - Stellar data visualizer for QFT, Sycamore, TNN_d and SDRP validation
- Qimcifa - Quantum-inspired Monte Carlo integer factoring algorithm
See also the following Python-enabled images
Other tags contain
docker run --gpus all \
--privileged \
-p 6080:6080 \
--device=/dev/dri:/dev/dri \
-d twobombs/thereminq[:tag]
Images can be run independently but are also made to work with the vQbit infrastructure K8s HELM repo
Installation setup and usage scenarios can be glanced at here
Built on deploy-nvidia-docker and CUDA-CLuster
- WebVNC, CUDA 12+ & OpenCL 1.2+ with NV, AMD & Intel HW support
Initial VNC password is 00000000
- noVNC website is available at port 6080
- xRDP runs at port 3389 and connects to VNC at 127.0.0.1:5900
docker run --gpus all \
--privileged \
-p 6080:6080 \
--device=/dev/dri:/dev/dri \
--mount type=bind,source=/var/log/qrack,target=/var/log/qrack \
-d twobombs/thereminq:controller
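The bind mount in the command above expects /var/log/qrack to exist on the host, because `--mount type=bind` reports an error instead of creating a missing source directory. A minimal sketch for preparing it, where `ensure_log_dir` is a hypothetical helper name and not part of ThereminQ:

```shell
#!/usr/bin/env bash
# ensure_log_dir is a hypothetical helper: create the bind-mount source
# directory if it is missing, since --mount type=bind does not create it.
ensure_log_dir() {
  local dir=${1:-/var/log/qrack}
  [ -d "$dir" ] || mkdir -p "$dir"
}

# On the host this would typically be run with root rights:
#   ensure_log_dir /var/log/qrack
```

Running the helper before the first container start avoids the "bind source path does not exist" error.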
docker run -ti \
--mount type=bind,source=/var/log/qrack,target=/var/log/qrack \
twobombs/thereminq bash /root/run-supreme-cpu
docker run --gpus all \
--device=/dev/dri:/dev/dri \
-ti \
--mount type=bind,source=/var/log/qrack,target=/var/log/qrack twobombs/thereminq bash /root/run-cosmos-gpu1
- use --gpus all for NVidia-Docker hosts; in addition, --privileged and --device=/dev/dri:/dev/dri will expose all GPUs in the system, eg: AMD/Intel iGPUs
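As a rough illustration of the flag choice described above, the sketch below assembles the docker GPU flags from what the host appears to offer. The helper name `gpu_flags` and the detection heuristics (nvidia-smi on the PATH, presence of /dev/dri) are assumptions made for this example, not something ThereminQ ships:

```shell
#!/usr/bin/env bash
# Pick docker GPU flags from what the host exposes.
# gpu_flags is a hypothetical helper; pass "yes"/"no" for each capability.
gpu_flags() {
  local has_nvidia=$1 has_dri=$2 flags=""
  [ "$has_nvidia" = "yes" ] && flags="--gpus all"
  [ "$has_dri" = "yes" ] && flags="$flags --privileged --device=/dev/dri:/dev/dri"
  # Trim the leading space left when only the DRI flags are set.
  echo "${flags# }"
}

# Heuristic detection (an assumption): nvidia-smi on PATH suggests an
# NVidia-Docker host, and /dev/dri is where DRM render nodes appear.
nvidia=no; command -v nvidia-smi >/dev/null 2>&1 && nvidia=yes
dri=no;    [ -d /dev/dri ] && dri=yes
gpu_flags "$nvidia" "$dri"
```

The printed flags can be pasted into any of the docker run commands in this README.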
All specialized workloads are listed here
docker run --gpus all \
--device=/dev/dri:/dev/dri \
-ti \
--mount type=bind,source=/var/log/qrack,target=/var/log/qrack twobombs/thereminq bash /root/run-tnn-gpu1
Quantum Inspired Qimcifa high qbit workloads - Qrackmin
docker run --gpus all \
--device=/dev/dri:/dev/dri \
-d twobombs/thereminq:qimcifa
- Workloads with full entanglement and/or quantum simulations at or above 30 qubits
- Mixed workloads based on longer/larger/deeper circuits with partial entanglement above 36 qubits
To prevent these workloads from taking up all resources of the system, it is good to take the following measures
- System memory should be at least large enough to hold the statevector
- From 30 qubits (eg: QFT) count 8GB RAM (and 4 cores with POCL), doubling with every additional qubit
- Start an instance with a limit for memory and/or swap, eg docker: -m 16g --memory-swap 32g
- Disable OOM killers in the kernel and/or the container orchestrator (--oom-kill-disable)
- OOM host change: add vm.overcommit_memory = 1 and vm.oom-kill = 0 in /etc/sysctl.conf
- Swap should be a dedicated and fast drive, where possible NVMe RAID, with random IO equal to the bandwidth of the GPU/PCIe
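The sizing rule above (8GB at 30 qubits, doubling with each additional qubit) can be sketched as a small shell helper. The function name `statevector_gb` is hypothetical and not part of ThereminQ, and the result assumes a full statevector, so treat it as a rough estimate rather than a measured figure:

```shell
#!/usr/bin/env bash
# Rough statevector RAM estimate in GB, following the rule of thumb:
# 8 GB at 30 qubits, doubling for every additional qubit.
# statevector_gb is a hypothetical helper name, not a ThereminQ command.
statevector_gb() {
  local qubits=$1
  if [ "$qubits" -lt 30 ]; then
    echo "estimate only defined from 30 qubits upward" >&2
    return 1
  fi
  echo $(( 8 << (qubits - 30) ))
}

statevector_gb 30   # prints 8
statevector_gb 32   # prints 32
statevector_gb 36   # prints 512
```

Comparing the result with the installed RAM shows how much of the statevector would have to spill into swap.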
Sycamore, QFT & T_NN(-d) Results on an AMD Threadripper 1920X@4.2GHz
- 24 threads with 32GB RAM, 2.5TB NVMe swap on an 11x RAID NVMe drive - Tesla K80 24GB - Tesla M40 24GB results
- M40 + K80 run script
Note: it is wise to run the benchmark program inside a memory-limited container that spills over to fast swap, so that the system remains stable during intensive runs and high memory peaks. Fast swap here means NVMe RAID0 swap.
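A memory-limited run of this kind could look like the sketch below. The helper `mem_limit_flags` is a hypothetical name introduced for illustration; the one Docker detail it relies on is that `--memory-swap` sets the total of RAM plus swap, so the swap share is the difference between the two values. The final command is only printed, not executed:

```shell
#!/usr/bin/env bash
# Build docker memory-limit flags for a benchmark run.
# mem_limit_flags is a hypothetical helper; arguments are RAM and swap in GB.
# Docker's --memory-swap is the TOTAL of RAM plus swap, hence the sum.
mem_limit_flags() {
  local ram_gb=$1 swap_gb=$2
  echo "-m ${ram_gb}g --memory-swap $(( ram_gb + swap_gb ))g"
}

# 16 GB of RAM plus 16 GB of swap gives the limits quoted earlier.
mem_limit_flags 16 16   # prints: -m 16g --memory-swap 32g

# Dry run: print the resulting command instead of starting a container.
echo docker run -ti $(mem_limit_flags 16 16) \
  --mount type=bind,source=/var/log/qrack,target=/var/log/qrack \
  twobombs/thereminq bash /root/run-supreme-cpu
```

Dropping the `echo` starts the container with those limits; the RAM and swap figures are examples and should be sized to the host.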
Note: create the underlying directory structure as mentioned in the VCL readme of Qrackmin, eg:
docker run \
--gpus all \
--device=/dev/dri:/dev/dri \
-d \
--mount type=bind,source=/var/log/vcl,target=/var/log/vcl \
--mount type=bind,source=/var/log/qrack,target=/var/log/qrack twobombs/thereminq:vcl-controller
Questions / Remarks / Ideas / Experiences - In Discord
All rights and kudos belong to their respective owners.
If (your) code resides in this container image and you don't want that, please let me know.