-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LPT] Quantized LSTMSequence & GRUSequence extended support #25654
[LPT] Quantized LSTMSequence & GRUSequence extended support #25654
Conversation
717d7bd
to
e7840ff
Compare
13709dc
to
9708278
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove debug info (serialization, couts) and fix the CI
4487204
to
715e207
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still concerned by the current implementation. We can merge it, but I think we need to prioritize CVS-147588 to remove this WA as soon as possible
|
||
BroadcastTransformation::BroadcastTransformation(const Params& params) : TransparentBaseTransformation(params) { | ||
MATCHER_SCOPE(BroadcastTransformation); | ||
auto matcher = pattern::wrap_type<ov::opset1::Broadcast>({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still don't match on v3::Broadcast here. Maybe we can match on BroadcastBase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need specify supported operation here, not base types, which can be inheritted by not supported operation in future.
src/common/low_precision_transformations/src/markup_precisions.cpp
Outdated
Show resolved
Hide resolved
...functional/plugin/shared/src/low_precision_transformations/recurrent_cell_transformation.cpp
Outdated
Show resolved
Hide resolved
src/common/low_precision_transformations/src/recurrent_cell.cpp
Outdated
Show resolved
Hide resolved
src/common/low_precision_transformations/src/recurrent_cell.cpp
Outdated
Show resolved
Hide resolved
src/common/low_precision_transformations/src/recurrent_cell.cpp
Outdated
Show resolved
Hide resolved
|
||
std::vector<std::pair<size_t, element::Type>> get_quantized_inputs(std::shared_ptr<ov::Node> lstm) { | ||
return is_type<ov::opset5::LSTMSequence>(lstm) ? | ||
std::vector<std::pair<size_t, element::Type>>{ {0, element::u8}, { 1, element::u8 }, { 4, element::undefined }, { 5, element::undefined } } : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We always regulate the desired precisions on inputs with PrecisionsRestriction
which are set by plugin. In CPU, we actually have u8u8 restrictions:
PrecisionsRestriction::create<ov::opset5::LSTMSequence>({
{{0, 1}, {ov::element::u8}}
}),
PrecisionsRestriction::create<ov::opset6::GRUSequence>({
{{0, 1}, {ov::element::u8}}
}),
Why do we need this additional logic with precisions limitations here? If there is no chance to avoid it, could you please add an explanatory comment in the code on what undefined
precision means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need it here.
LPT uses operation input precision restrictions to identify output precision for FakeQuantize
dequantization transformation. But it doesn't mean, that we can ignore filter or input validation in each transformation pattern matching, because in Constant branch (for example, a branch with single Constant
operation) LPT can not provide require precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a valid point, but it seems like we have such checks only in this transformation. Does it mean that we must perform the checks in each transformation (or at least in the transformations which work with layers with weights)?
src/common/low_precision_transformations/include/low_precision/broadcast.hpp
Show resolved
Hide resolved
a29d204
to
dd6f642
Compare
Details:
Tickets: