-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NPUW] Serialization fixes #28442
[NPUW] Serialization fixes #28442
Conversation
…into as/npuw_s11n_fixes
…into as/npuw_s11n_fixes
} else if (var.is<int>()) { | ||
write(stream, int(AnyType::INT)); | ||
write(stream, var.as<int>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this may be not portable, but let's keep it aside as we don't claim any blob portability at the moment.
} else if (type == AnyType::CHARS) { | ||
std::string val; | ||
read(stream, val); | ||
var = std::move(val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but didn't CHARS
stand for const char*
in Any
when you serialized it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I serialize it as std::string to keep it simple
if (iter_device != device_bank.storage.end()) { | ||
// Already allocated | ||
// Shouldn't be possible | ||
NPUW_ASSERT(false); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a change in the contract here, what was OK previously is now not OK, can you please explain more here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs during bank deserialization. Our storage is an unordered_map with unique UIDs which all have been evaluated and serialized at this point. Thus it shouldn't be possible to encounter a duplicate UIDs during deserialization. Just a sanity check - this if can be removed altogether with no effect
@@ -237,6 +238,10 @@ void Bank::read_and_add_tensor(std::istream& stream, int64_t uid, const std::str | |||
ov::Tensor allocated_tensor; | |||
|
|||
// FIXME: reading not via a dedicated function | |||
bool is_intialized = false; | |||
read(stream, is_intialized); | |||
NPUW_ASSERT(is_intialized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this too. Haven't you wrote empty tensors before?
if is_initialized
must always be true
, why write it to the stream at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed Tensor serialization in some of the previous PRs. It occurred when I was serializing empty scales/zerops. This change is just to align with the previous change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it. Do you serialize empty tensors now or not?
If you don't, you don't need this bool
in the stream either.
If you do, this code breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do but not for the bank
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this code doesn't break
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did I read this right you have a common routine to write a tensor to disk but you read it here in some specific way?
it doesn't sound good tbh.
you'd need to use the same function to read tensors as you write those. If that routine checks for false
and does nothing, but you still need to raise an assert, you can make it in the caller code (= HERE) if the resulting tensor is still empty after read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed on call, this change is caused by optimization (reading from stream directly to the L0 buffer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed locally
if (properties.count(ov::cache_dir.name())) { | ||
OPENVINO_THROW("Option 'CACHE_DIR' is not supported with NPU_USE_NPUW!"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we do have the necessary tools in place now, let's plan it for 25.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. This PR is just a support for LLMCompiledModel for now. We will figure out the rest later
// CACHE_DIR isn't supported with NPU_USE_NPUW | ||
if (properties.count(ov::cache_dir.name())) { | ||
OPENVINO_THROW( | ||
"Option 'CACHE_DIR' is not supported with configuration: NPU_USE_NPUW : YES, NPUW_LLM : NO!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't be surprised if people who get this will start putting "NO!" to the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -237,6 +238,10 @@ void Bank::read_and_add_tensor(std::istream& stream, int64_t uid, const std::str | |||
ov::Tensor allocated_tensor; | |||
|
|||
// FIXME: reading not via a dedicated function | |||
bool is_intialized = false; | |||
read(stream, is_intialized); | |||
NPUW_ASSERT(is_intialized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did I read this right you have a common routine to write a tensor to disk but you read it here in some specific way?
it doesn't sound good tbh.
you'd need to use the same function to read tensors as you write those. If that routine checks for false
and does nothing, but you still need to raise an assert, you can make it in the caller code (= HERE) if the resulting tensor is still empty after read.
No description provided.