
[PyTorch] Refactor activation offloading of quantized tensors. #1738


Open · wants to merge 17 commits into main from quantized_tensor_offloading

Conversation

@pggPL (Collaborator) commented Apr 30, 2025

Description

The activation offloading code contains complex logic for handling Float8Tensor objects: it disassembles each object into its component data tensors, offloads those tensors separately, and then reassembles the object afterwards.

I add an empty_like(..., device=..., pin_memory=...) method to Float8Tensor, which allows a matching (pinned) CPU backup tensor to be allocated in a single call. This makes the offloading code much simpler.
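The idea can be sketched with a toy stand-in. Note this is a hypothetical illustration, not Transformer Engine's actual API: `MockFloat8Tensor`, `offload`, and `reload` are invented names, and plain Python lists stand in for the component tensors. The point is that an `empty_like`-style constructor builds a backup with the same structure on the target device, so offloading becomes an allocate-and-copy instead of a disassemble/reassemble dance.

```python
class MockFloat8Tensor:
    """Stand-in for a quantized tensor: component buffers plus device metadata."""

    def __init__(self, data, scale_inv, device="cuda", pin_memory=False):
        self.data = data              # quantized payload (stand-in: list of ints)
        self.scale_inv = scale_inv    # dequantization scale (stand-in: float)
        self.device = device
        self.pin_memory = pin_memory

    @classmethod
    def empty_like(cls, other, device, pin_memory=False):
        # Allocate buffers with the same shape/metadata as `other`, but on
        # the requested device -- the new method the PR describes, which lets
        # offloading skip disassembling the object into separate tensors.
        return cls([0] * len(other.data), 0.0, device=device, pin_memory=pin_memory)

    def copy_(self, src):
        # In-place copy of all component buffers, analogous to torch's copy_.
        self.data[:] = src.data
        self.scale_inv = src.scale_inv
        return self


def offload(t):
    """Back up `t` to pinned CPU memory with a single empty_like + copy."""
    backup = MockFloat8Tensor.empty_like(t, device="cpu", pin_memory=True)
    backup.copy_(t)
    return backup


def reload(backup, original):
    """Restore the device tensor in place from its CPU backup."""
    return original.copy_(backup)


gpu = MockFloat8Tensor([7, 8, 9], scale_inv=0.25)
cpu_backup = offload(gpu)
assert cpu_backup.device == "cpu" and cpu_backup.pin_memory
assert cpu_backup.data == [7, 8, 9] and cpu_backup.scale_inv == 0.25
```

The same allocate-and-copy pattern then works uniformly for plain tensors and quantized tensors, since each type knows how to produce an empty CPU clone of itself.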

Type of change

  • Documentation change (change only to the documentation, either a fix or new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Change A
  • Change B

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

pggPL added 2 commits April 30, 2025 13:47
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
@pggPL pggPL force-pushed the quantized_tensor_offloading branch from bbf24cd to 657cbbe Compare April 30, 2025 14:57
pre-commit-ci bot and others added 5 commits April 30, 2025 15:00
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
@pggPL pggPL marked this pull request as ready for review April 30, 2025 16:43
@pggPL (Collaborator, Author) commented Apr 30, 2025

/te-ci pytorch

pggPL added 2 commits May 5, 2025 09:59
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
@pggPL (Collaborator, Author) commented May 5, 2025

/te-ci pytorch

pre-commit-ci bot and others added 5 commits May 5, 2025 10:19
Signed-off-by: Pawel Gadzinski <pawelgadzinski@gmail.com>
Signed-off-by: Pawel Gadzinski <pawelgadzinski@gmail.com>
@pggPL (Collaborator, Author) commented May 8, 2025

/te-ci pytorch

@pggPL pggPL force-pushed the quantized_tensor_offloading branch from 32f982b to 4293d32 Compare May 8, 2025 16:47
@pggPL (Collaborator, Author) commented May 8, 2025

/te-ci pytorch

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
@pggPL pggPL force-pushed the quantized_tensor_offloading branch from 7a6b62d to da1bbf9 Compare May 9, 2025 11:56
@pggPL (Collaborator, Author) commented May 9, 2025

/te-ci pytorch

1 similar comment
@pggPL (Collaborator, Author) commented May 9, 2025

/te-ci pytorch

Labels: none yet
Projects: none yet
1 participant