Questions on chap05 10-llm-training-speed Multi gpu script #569

Answered by rasbt
STEVENTAN100 asked this question in Q&A
The code appears in the appendix-A/DDP-script, but it is missing from https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/10_llm-training-speed/02_opt_multi_gpu_dpp.py, and I think that is why multi-GPU training didn't work.

Oh I see, thanks for clarifying. Yes, that's correct. I focused only on the torchrun code here to keep the code differences minimal. Since most people use torchrun, and it is also the "PyTorch officially recommended way", I was planning to recommend only that going forward. (Also, I didn't want to mix mp.spawn code into a script meant for people who use torchrun; I think it's easier to let torchrun handle process launching.)
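To illustrate what "letting torchrun handle it" means in practice, here is a minimal sketch, not the code from the repository. It relies on the fact that torchrun exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` (plus `MASTER_ADDR` and `MASTER_PORT`) to every process it launches, so the script itself needs no mp.spawn call. The function names `read_torchrun_env` and `setup_ddp` are hypothetical, and the `nccl` backend is an assumption that fits CUDA GPUs.

```python
import os


def read_torchrun_env(env=None):
    """Return (rank, local_rank, world_size) from the variables torchrun sets.

    Falls back to single-process defaults when the script is started
    with plain `python` instead of `torchrun`.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size


def setup_ddp(env=None):
    """Initialize the process group only if several processes were launched."""
    rank, local_rank, world_size = read_torchrun_env(env)
    if world_size > 1:
        # Imported lazily so the single-process path works without a GPU setup.
        import torch
        import torch.distributed as dist

        # With the default env:// init method, rank and world size are read
        # from the environment variables that torchrun exported.
        dist.init_process_group(backend="nccl")  # assumption: CUDA GPUs
        torch.cuda.set_device(local_rank)
    return rank, local_rank, world_size


if __name__ == "__main__":
    rank, local_rank, world_size = setup_ddp()
    print(f"rank={rank} local_rank={local_rank} world_size={world_size}")
```

Saved under a hypothetical name such as `train.py`, this would be started with `torchrun --nproc_per_node=2 train.py` to get two processes. Started with plain `python train.py`, it sees no torchrun variables and stays single-process, which is one way a torchrun-only script can appear not to use multiple GPUs.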

The README (https://github.com/rasbt/LLM…

Answer selected by STEVENTAN100