[EM] Optimization for deep trees. #11387

Merged on Apr 8, 2025 (22 commits)

trivialfis commented on Apr 4, 2025

  • Decouple the row partition batch size from the driver batch size, so that more nodes can be processed for each data batch.
  • Use a heuristic to choose ATS over a data copy when there is a large number of small nodes.
  • Make sure that a newly created page is placed on the host if it happens to be the last one.
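
To make the second bullet concrete, here is a minimal sketch of what such a heuristic could look like. It is not taken from the PR's implementation: the function name `plan_page_access`, the `PartitionPlan` type, and the `min_avg_rows_for_copy` threshold are all hypothetical, and the real criterion may use different inputs entirely. The sketch only shows the general shape of the decision: when a batch is made up of many small nodes, copying the whole page is unlikely to pay off, so reading the data in place is preferred.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PartitionPlan:
    """Hypothetical result type; not part of XGBoost."""

    use_direct_access: bool  # True: read the page in place; False: copy it first
    avg_rows_per_node: float


def plan_page_access(
    node_row_counts: Sequence[int],
    min_avg_rows_for_copy: int = 4096,  # assumed threshold, chosen for illustration
) -> PartitionPlan:
    """Prefer direct access when the batch consists of many small nodes."""
    if not node_row_counts:
        # Nothing to partition: fall back to the copy path by default.
        return PartitionPlan(use_direct_access=False, avg_rows_per_node=0.0)
    avg = sum(node_row_counts) / len(node_row_counts)
    # Small average node size means a full copy would move far more data than is read.
    return PartitionPlan(
        use_direct_access=avg < min_avg_rows_for_copy,
        avg_rows_per_node=avg,
    )


if __name__ == "__main__":
    print(plan_page_access([10, 20, 30]))       # many small nodes
    print(plan_page_access([100_000, 200_000]))  # few large nodes
```

A single average is the simplest possible signal; a real heuristic would likely also account for page size and the number of nodes in the batch.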

The optimization is specifically for GPU external memory. We need some kernel fusion to improve in-core training further.

todo:

  • Perf test
  • Make sure the benchmark is deterministic.

in-core

| depth/branch | PR | master |
|---|---|---|
| 6 | 160.2554975748062 | 160.7703891992569 |
| 12 | 275.5357859134674 | 277.49394392967224 |

out-of-core

| depth/branch | PR | master |
|---|---|---|
| 12 | 2990.530539 | 7855.628397 |
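
The ratios implied by these numbers can be checked with a few lines of Python. The snippet assumes the figures are timings in a common unit where lower is better; the PR does not state the unit, so only master/PR ratios are computed. Under that assumption the out-of-core run at depth 12 is roughly 2.6x faster with the PR, while the in-core runs are within about 1% of master.

```python
# Timings copied from the benchmark figures above; the unit is not stated in the PR.
timings = {
    ("in-core", 6): (160.2554975748062, 160.7703891992569),
    ("in-core", 12): (275.5357859134674, 277.49394392967224),
    ("out-of-core", 12): (2990.530539, 7855.628397),
}


def speedup(pr: float, master: float) -> float:
    """Ratio master / PR; values above 1.0 mean the PR is faster."""
    return master / pr


speedups = {key: speedup(*pair) for key, pair in timings.items()}

if __name__ == "__main__":
    for (mode, depth), ratio in speedups.items():
        print(f"{mode:12s} depth={depth:<2d} master/PR = {ratio:.3f}x")
```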

trivialfis changed the title from "[WIP][EM] Optimization for deep trees." to "[EM] Optimization for deep trees." on Apr 6, 2025
trivialfis commented:

cc @rongou.

trivialfis merged commit 6a1302c into dmlc:master on Apr 8, 2025
58 of 59 checks passed
trivialfis deleted the ext-ats-small-nodes branch on April 8, 2025 07:38