-
Notifications
You must be signed in to change notification settings - Fork 649
[Core ML] Improve asset management #13560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13560
Note: Links to docs will display an error until the docs builds have been completed. ❌ 1 New FailureAs of commit 0541ad9 with merge base 8aca197 ( NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
64ecf65
to
8f19e4a
Compare
@@ -254,6 +254,18 @@ BOOL is_asset_alive(NSMapTable<NSString *, ETCoreMLAsset *> *assets_in_use_map, | |||
|
|||
return assets; | |||
} | |||
|
|||
NSURL * _Nullable move_to_directory(NSURL *url, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks similar to the pattern "[self moveURL:dstURL toUniqueURLInDirectory:self.trashDirectoryURL error:nil];"
Should we consolidate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use move_to_directory
NSFileManager *fileManager, | ||
NSError * __autoreleasing *error) { | ||
NSURL *dstURL = [directoryURL URLByAppendingPathComponent:[NSUUID UUID].UUIDString]; | ||
if (![fileManager moveItemAtURL:url toURL:dstURL error:error]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check if url is nil when using moveItemAtURL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added checks.
|
||
// Remove any existing contents in the staging directory by moving them to the trash directory, | ||
// then recreate a clean staging directory for new use. | ||
move_to_directory([assetsDirectoryURL URLByAppendingPathComponent:@"staging"], managedTrashDirectoryURL, fileManager, nil); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we have two models?
In model2, the staging directory is being used during compilation, but then in model1 we execute this line and move the staging directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ETCoreMLAssetManager is only created once when the backend is loaded, and this happens atinit
time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ETCoreMLAssetManager is created once for a single PTE file on delegate init, but what if there are multiple PTE files?
Do they all share the same staging directory? And can that lead to issues when one tries to delete the staging directory when another is compiling from it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No ETCoreMLAssetManager
is only created once - https://github.com/pytorch/executorch/blob/main/backends/apple/coreml/runtime/delegate/backend_delegate.mm#L138
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ETCoreMLAssetManager
and ETCoreMLModelManager
are only created once. You can put breakpoints to verify it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I load two ET programs on two separate processes? Won't each process create ETCoreMLModelManager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a persistent identifier, but I haven’t identified a suitable one yet. For now, I have removed the deletion step. Although we clean the directories in the staging area, if the app is terminated before the cleanup occurs, stale directories may remain. However, since these directories are located in the caches directory, the OS will automatically clean them up when storage is low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I guess we could clean up things in staging that have last touched time stamps > 1 day during asset manager creation.
|
||
|
||
// Always clean the trash directory to ensure a minimal footprint. | ||
// The `trashQueue` is serialized, so only one cleanup will run at a time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if something is moving to the trashDirectory when it's being deleted by another thread?
For example, withTemporaryDirectory moves staging files to the trashDirectory after compilation, but then another model triggers cleanupTrashDirectory on another thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be fine trashQueue
is serial and we get the list of directories to delete at the start of the method
NSDirectoryEnumerator *enumerator = [fileManager enumeratorAtURL:self.trashDirectoryURL
includingPropertiesForKeys:@[]
options:NSDirectoryEnumerationSkipsSubdirectoryDescendants
errorHandler:errorHandler];
} | ||
|
||
// If store failed then we will load the model from compiledURL. | ||
auto backingAsset = Asset::make(compiledModelURL, identifier, assetManager.fileManager, error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If store fails then it's an error there is no point recovering from it.
|
||
// If modelURL is not provided, write model files to the destination directory (dstURL) | ||
// and obtain a URL pointing to them. Otherwise, use the provided modelURL. | ||
modelURL = (modelURL == nil) ? ::write_model_files(dstURL, self.fileManager, identifier, modelAssetType.value(), inMemoryFS, error) : modelURL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would modelURL be non-null? Is it mainly the case where the model is already compiled in the PTE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of Analyzer we also save the raw model and we don't want to write the model files again if it already exists.
return compiledModelURL; | ||
} | ||
} | ||
} | ||
|
||
#if ET_EVENT_TRACER_ENABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are we removing the compile flag here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, moving it.
inMemoryFS:(const inmemoryfs::InMemoryFileSystem*)inMemoryFS | ||
error:(NSError * __autoreleasing *)error { | ||
NSString *identifier = @(metadata.identifier.c_str()); | ||
NSString *rawIdentifier = raw_model_identifier(identifier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does raw here mean? Is it to distinguish from the compiled asset (which gets the unqualified identified as its name)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
@metascroy has imported this pull request. If you are a Meta employee, you can view this in D80715432. |
@metascroy has imported this pull request. If you are a Meta employee, you can view this in D80715432. |
Overall changes look great @cymbalrush Running internal/external CI. It does look like the linter is failing. Can you run lint? From ET repo:
And then run:
|
@metascroy has imported this pull request. If you are a Meta employee, you can view this in D80715432. |
Attemping land from phabricator |
Fix asset management issues in CoreML delegate