feat: Enhance capability of engine caching and refitting #3789

zewenli98 · 2025-08-20T20:04:39Z

Description

TensorRT 10.14 will add a flag, trt.SerializationFlag.INCLUDE_REFIT, that allows refitted engines to remain refittable. Building on this capability, this PR enhances the existing engine caching and refitting features as follows:

To save disk space, engine caching saves only weight-stripped engines to disk, regardless of compilation_settings.strip_engine_weights. When a cached engine is loaded, it is automatically refitted and remains refittable.
Compiled TRT modules can be refitted multiple times with refit_module_weights(), e.g.:

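from torch_tensorrt.dynamo import refit_module_weights
# trt_gm: the compiled TRT module; exp_program: the exported program holding the new weights
for _ in range(3):
    trt_gm = refit_module_weights(trt_gm, exp_program)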
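The caching flow described in the first item can be pictured with a toy model. The sketch below does not use any TensorRT or Torch-TensorRT API: ToyEngine, ToyEngineCache, strip_weights, and refit are hypothetical stand-ins invented for illustration, and the actual implementation in this PR may be organized differently. It only shows the intended behavior: the cache always stores a weight-stripped engine, a load refits it with the caller's weights, and the result can be refitted again.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class ToyEngine:
    """Hypothetical stand-in for a serialized engine (not a real TensorRT type)."""

    structure: str  # stands in for the weight-free engine plan
    weights: Optional[Dict[str, float]] = None  # None means weight-stripped
    refittable: bool = True


def strip_weights(engine: ToyEngine) -> ToyEngine:
    """Return a copy of the engine without its weights."""
    return replace(engine, weights=None)


def refit(engine: ToyEngine, weights: Dict[str, float]) -> ToyEngine:
    """Return a copy of the engine carrying new weights; it stays refittable."""
    if not engine.refittable:
        raise ValueError("engine is not refittable")
    return replace(engine, weights=dict(weights))


class ToyEngineCache:
    """Hypothetical cache that only ever stores weight-stripped engines."""

    def __init__(self) -> None:
        self._store: Dict[str, ToyEngine] = {}

    def save(self, key: str, engine: ToyEngine) -> None:
        # Strip unconditionally, mirroring "regardless of strip_engine_weights".
        self._store[key] = strip_weights(engine)

    def load(self, key: str, weights: Dict[str, float]) -> Optional[ToyEngine]:
        stripped = self._store.get(key)
        if stripped is None:
            return None  # cache miss
        # Refit on the way out so the caller gets a usable engine.
        return refit(stripped, weights)


cache = ToyEngineCache()
cache.save("graph-hash", ToyEngine("conv-relu", {"conv.weight": 1.0}))
engine = cache.load("graph-hash", {"conv.weight": 2.0})
engine = refit(engine, {"conv.weight": 3.0})  # a second refit still works
```

In this model the cached entry never holds weights, so its size would be independent of the model's parameters, which is the disk-space argument made above.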

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

zewenli98 · 2025-08-20T20:43:30Z

TODO:

Consider whether to turn on engine caching by default.
Consider which arguments should be put into _SETTINGS_TO_BE_ENGINE_INVARIANT.
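To make the second TODO item concrete, here is a minimal sketch of how such a list could gate cache reuse. It assumes that _SETTINGS_TO_BE_ENGINE_INVARIANT names the settings that must be identical between the cached and the requested compilation for an engine to be reused. The two entries and the helper names incompatible_settings and can_reuse_engine are placeholders chosen for illustration, not the project's actual list or API.

```python
from typing import Any, Dict, List

# Illustrative subset only (an assumption); deciding the real contents is the TODO.
SETTINGS_TO_BE_ENGINE_INVARIANT = ("optimization_level", "version_compatible")


def incompatible_settings(
    cached: Dict[str, Any], requested: Dict[str, Any]
) -> List[str]:
    """Return the names of invariant settings whose values differ."""
    return [
        name
        for name in SETTINGS_TO_BE_ENGINE_INVARIANT
        if cached.get(name) != requested.get(name)
    ]


def can_reuse_engine(cached: Dict[str, Any], requested: Dict[str, Any]) -> bool:
    """A cached engine is reusable only if every invariant setting matches."""
    return not incompatible_settings(cached, requested)


cached_settings = {"optimization_level": 3, "version_compatible": False, "debug": False}
# A setting outside the invariant list may change without forcing a rebuild.
reusable = can_reuse_engine(cached_settings, {**cached_settings, "debug": True})
# A setting inside the list forces a rebuild when it changes.
rebuild = not can_reuse_engine(cached_settings, {**cached_settings, "optimization_level": 5})
```

Under this reading, every argument added to the list makes cache hits rarer but safer, so the question is which settings genuinely change the built engine.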

zewenli98 added 5 commits August 19, 2025 19:37

feature support

8867b74

update weight stripped engine tests

88ffb24

fix bug

9ff8c16

remove restriction

08012a7

fix typo

f0c9c7e

zewenli98 self-assigned this Aug 20, 2025

meta-cla bot added the cla signed label Aug 20, 2025

zewenli98 marked this pull request as draft August 20, 2025 20:04

github-actions bot added the component: tests, component: conversion, component: api [Python], component: dynamo, and component: torch_compile labels Aug 20, 2025

github-actions bot requested a review from narendasan August 20, 2025 20:05

add todo

65b76cf
