Commit 970a596

Merge branch 'master' into uk/changing-sub-byte-i4-element-order
2 parents eec1e1f + 9432b3d commit 970a596

20 files changed: +400 -49 lines changed

.github/workflows/coverage.yml

+1-1
@@ -25,7 +25,7 @@ jobs:
 
 
       - name: Setup ccache
-        uses: hendrikmuhs/ccache-action@c92f40bee50034e84c763e33b317c77adaa81c92 # v1.2.13
+        uses: hendrikmuhs/ccache-action@ed74d11c0b343532753ecead8a951bb09bb34bc9 # v1.2.14
         with:
           max-size: 50G
 

.github/workflows/mac.yml

+1-1
@@ -131,7 +131,7 @@ jobs:
       #
 
       - name: Setup ccache
-        uses: hendrikmuhs/ccache-action@c92f40bee50034e84c763e33b317c77adaa81c92 # v1.2.13
+        uses: hendrikmuhs/ccache-action@ed74d11c0b343532753ecead8a951bb09bb34bc9 # v1.2.14
         with:
           max-size: "2000M"
           # Should save cache only if run in the master branch of the base repo

.github/workflows/mac_arm64.yml

+1-1
@@ -131,7 +131,7 @@ jobs:
       #
 
       - name: Setup ccache
-        uses: hendrikmuhs/ccache-action@c92f40bee50034e84c763e33b317c77adaa81c92 # v1.2.13
+        uses: hendrikmuhs/ccache-action@ed74d11c0b343532753ecead8a951bb09bb34bc9 # v1.2.14
         with:
           max-size: "2000M"
           # Should save cache only if run in the master branch of the base repo

docs/articles_en/get-started/configurations.rst

+9-6
@@ -1,6 +1,4 @@
-.. {#openvino_docs_install_guides_configurations_header}
-
-Additional Configurations For Hardware
+Additional Configurations
 ======================================
 
 
@@ -16,10 +14,10 @@ Additional Configurations For Hardware
 
 For GPU <configurations/configurations-intel-gpu>
 For NPU <configurations/configurations-intel-npu>
+GenAI Dependencies <configurations/genai-dependencies>
 
-For certain use cases, you may need to install additional software, to use the full
-potential of OpenVINO™. Check the following list for components for elements used in
-your workflow:
+For certain use cases, you may need to install additional software, to benefit from the full
+potential of OpenVINO™. Check the following list for components used in your workflow:
 
 | **GPU drivers**
 | If you want to run inference on a GPU, make sure your GPU's drivers are properly installed.
@@ -33,6 +31,11 @@ your workflow:
 See the :doc:`guide on NPU configuration <configurations/configurations-intel-npu>`
 for details.
 
+| **OpenVINO GenAI Dependencies**
+| OpenVINO GenAI is a flavor of OpenVINO, aiming to simplify running generative
+AI models. For information on the dependencies required to use OpenVINO GenAI, see the
+:doc:`guide on OpenVINO GenAI Dependencies <configurations/genai-dependencies>`.
+
 | **Open Computer Vision Library**
 | OpenCV is used to extend the capabilities of some models, for example enhance some of
 OpenVINO samples, when used as a dependency in compilation. To install OpenCV for OpenVINO, see the
@@ -0,0 +1,31 @@
+OpenVINO™ GenAI Dependencies
+=================================
+
+OpenVINO™ GenAI depends on both `OpenVINO <https://github.com/openvinotoolkit/openvino>`__ and
+`OpenVINO Tokenizers <https://github.com/openvinotoolkit/openvino_tokenizers>`__. During OpenVINO™
+GenAI installation from PyPi, the same versions of OpenVINO and OpenVINO Tokenizers
+are used (e.g. ``openvino==2024.3.0`` and ``openvino-tokenizers==2024.3.0.0`` are installed for
+``openvino-genai==2024.3.0``).
+
+Trying to update any of the dependency packages might result in a version incompatibility
+due to different Application Binary Interfaces (ABIs), which will result in errors while running
+OpenVINO GenAI. Having package version in the ``<MAJOR>.<MINOR>.<PATCH>.<REVISION>`` format, allows
+changing the ``<REVISION>`` portion of the full version to ensure ABI compatibility. Changing
+``<MAJOR>``, ``<MINOR>`` or ``<PATCH>`` part of the version may break ABI.
+
+GenAI, Tokenizers, and OpenVINO wheels for Linux on PyPI are compiled with ``_GLIBCXX_USE_CXX11_ABI=0``
+to cover a wider range of platforms. In the C++ archive distributions for Ubuntu, ``_GLIBCXX_USE_CXX11_ABI=1``
+is used instead. Mixing different ABIs is not possible as doing so will result in a link error.
+
+To try OpenVINO GenAI with different dependencies versions (which are **not** prebuilt packages
+as archives or python wheels), build OpenVINO GenAI library from
+`Source <https://github.com/openvinotoolkit/openvino.genai/blob/releases/2024/3/src/docs/BUILD.md#build-openvino-openvino-tokenizers-and-openvino-genai-from-source>`__.
+
+Additional Resources
+#######################
+
+* :doc:`OpenVINO GenAI Installation Guide <../install-openvino/install-openvino-genai>`
+* `OpenVINO GenAI repository <https://github.com/openvinotoolkit/openvino.genai>`__
+* :doc:`OpenVINO Installation Guide <../install-openvino>`
+* :doc:`OpenVINO Tokenizers <../../learn-openvino/llm_inference_guide/ov-tokenizers>`
+
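
The versioning rule described in the new GenAI dependencies page can be sketched as a small check. This is only an illustration under stated assumptions: `parse_version` and `same_abi_line` are hypothetical helpers that exist in no OpenVINO package, and the rule encoded here (matching `<MAJOR>.<MINOR>.<PATCH>`, free `<REVISION>`) is the conservative reading of the page, which says only that changing the first three parts *may* break ABI.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: parse up to four dot-separated numeric components.
// Missing components (e.g. the REVISION of "2024.3.0") default to 0.
// std::stoi throws if a component is not numeric.
std::array<int, 4> parse_version(const std::string& text) {
    std::array<int, 4> parts{0, 0, 0, 0};
    std::istringstream stream(text);
    std::string token;
    for (std::size_t i = 0; i < parts.size() && std::getline(stream, token, '.'); ++i)
        parts[i] = std::stoi(token);
    return parts;
}

// Hypothetical helper: treat two versions as belonging to the same ABI line
// when MAJOR, MINOR and PATCH match; REVISION is allowed to differ.
bool same_abi_line(const std::string& a, const std::string& b) {
    const auto va = parse_version(a);
    const auto vb = parse_version(b);
    return va[0] == vb[0] && va[1] == vb[1] && va[2] == vb[2];
}
```

With the versions quoted in the page, `same_abi_line("2024.3.0", "2024.3.0.0")` holds, while a pairing such as `"2024.3.0"` with `"2024.4.0.0"` would be rejected by this sketch.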

docs/articles_en/get-started/install-openvino/install-openvino-genai.rst

+3-1
@@ -11,7 +11,9 @@ To see GenAI in action, check the Jupyter notebooks:
 `LLM-powered Chatbot <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/README.md>`__ and
 `LLM Instruction-following pipeline <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-question-answering/README.md>`__.
 
-The OpenVINO GenAI flavor is available for installation via PyPI and Archive distributions:
+The OpenVINO GenAI flavor is available for installation via PyPI and Archive distributions.
+A `detailed guide <https://github.com/openvinotoolkit/openvino.genai/blob/releases/2024/3/src/docs/BUILD.md>`__
+on how to build OpenVINO GenAI is available in the OpenVINO GenAI repository.
 
 PyPI Installation
 ###############################

docs/dev/build_linux.md

+1-1
@@ -74,7 +74,7 @@ You can use the following additional build options:
 ```
 3. After the build process finishes, export the newly built Python libraries to the user environment variables:
 ```
-export PYTHONPATH=<openvino_repo>/bin/intel64/Release/python:$PYTHONPATH
+export PYTHONPATH=<openvino_repo>/bin/intel64/Release/python:<openvino_repo>/tools/ovc:$PYTHONPATH
 export LD_LIBRARY_PATH=<openvino_repo>/bin/intel64/Release:$LD_LIBRARY_PATH
 ```
 or install the wheel with pip:

docs/dev/build_windows.md

+1-1
@@ -61,7 +61,7 @@ Supported configurations:
 ```
 3. After the build process finishes, export the newly built Python libraries to the user environment variables:
 ```
-set PYTHONPATH=<openvino_repo>/bin/<arch>/Release/python;%PYTHONPATH%
+set PYTHONPATH=<openvino_repo>/bin/<arch>/Release/python;<openvino_repo>/tools/ovc;%PYTHONPATH%
 set OPENVINO_LIB_PATHS=<openvino_repo>/bin/<arch>/Release;<openvino_repo>/temp/tbb/bin
 ```
 or install the wheel with pip:

src/common/transformations/include/transformations/common_optimizations/move_eltwise_up_data_movement.hpp

+48-2
@@ -11,10 +11,56 @@
 namespace ov {
 namespace pass {
 
-class TRANSFORMATIONS_API MoveEltwiseUpThroughDataMov : public ov::pass::MatcherPass {
+/// This transformation tries to put element-wise operations (Unary or Binary with scalar second input) before a set of
+/// data movement ops in order to allow further element-wise op fusion to previous op and zero-copy optimizations for
+/// data movement op itself.
+///       ┌───────────┐                 ┌───────────┐
+///       │   AnyOp   │                 │   AnyOp   │
+///       └─────┬─────┘                 └─────┬─────┘
+///             │                             │
+///             │                             │
+///     ┌───────┴────────┐            ┌───────┴────────┐
+///     | DataMovementOp |     =>     |  Element-Wise  |
+///     └───────┬────────┘            └───────┬────────┘
+///             │                             │
+///             │                             │
+///     ┌───────┴────────┐            ┌───────┴────────┐
+///     │  Element-Wise  |            │ DataMovementOp |
+///     └────────────────┘            └────────────────┘
+class TRANSFORMATIONS_API MoveEltwiseUpThroughDataMovScalar : public ov::pass::MatcherPass {
+public:
+    OPENVINO_RTTI("MoveEltwiseUpThroughDataMovScalar", "0");
+    MoveEltwiseUpThroughDataMovScalar(std::vector<DiscreteTypeInfo> allowed_data_movement_ops);
+};
+
+/// This transformation tries to put element-wise operations before Reshape/Squeeze/Unsqueeze ops
+/// when second input to eltwise is per-channel Constant op
+/// ┌───────────┐       ┌────────────────┐                 ┌───────────┐            ┌────────────────────┐
+/// │   AnyOp   │       │  TargetShape   │                 │   AnyOp   │            │ Per-Channel Const  │
+/// └─────┬─────┘       └────────┬───────┘                 └─────┬─────┘            └─────────┬──────────┘
+///       │                      │                               │                            │
+///       │                      |                               │                            |
+///       │   ┌─────────┐        │                               │   ┌──────────────┐         │
+///       └───┤ Reshape ├────────┘              =>               └───┤ Element-Wise ├─────────┘
+///           └────┬────┘                                            └───────┬──────┘
+///                │                                                         │
+///                │                                                         │
+///        ┌───────┴────────┐    ┌────────────────────┐                ┌─────┴─────┐    ┌─────────────┐
+///        │  Element-Wise  ├────┤ Per-Channel Const  │                │  Reshape  ├────┤ TargetShape │
+///        └────────────────┘    └────────────────────┘                └───────────┘    └─────────────┘
+class TRANSFORMATIONS_API MoveEltwiseUpThroughDataMovPerChannel : public ov::pass::MatcherPass {
+public:
+    OPENVINO_RTTI("MoveEltwiseUpThroughDataMovPerChannel", "0");
+    MoveEltwiseUpThroughDataMovPerChannel();
+};
+
+class TRANSFORMATIONS_API MoveEltwiseUpThroughDataMov : public ov::pass::GraphRewrite {
 public:
     OPENVINO_RTTI("MoveEltwiseUpThroughDataMov", "0");
-    MoveEltwiseUpThroughDataMov(std::vector<DiscreteTypeInfo> allowed_data_movement_ops = get_default_allowed_ops());
+    MoveEltwiseUpThroughDataMov(std::vector<DiscreteTypeInfo> allowed_data_movement_ops = get_default_allowed_ops()) {
+        this->add_matcher<MoveEltwiseUpThroughDataMovScalar>(allowed_data_movement_ops);
+        this->add_matcher<MoveEltwiseUpThroughDataMovPerChannel>();
+    }
 
 private:
     static std::vector<DiscreteTypeInfo> get_default_allowed_ops();
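
The first diagram in this header relies on element-wise ops commuting with data movement ops. A minimal standalone sketch of that property, assuming a plain row-major buffer rather than OpenVINO tensors: `transpose`, `relu` and `add_scalar` below are hypothetical stand-ins for a data movement op, a unary op and a binary op with a scalar second input, and none of them is OpenVINO API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical data movement op: transpose a row-major rows x cols matrix.
std::vector<float> transpose(const std::vector<float>& in, std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<float> out(in.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            out[c * rows + r] = in[r * cols + c];
    return out;
}

// Hypothetical unary element-wise op: ReLU applied to every element.
std::vector<float> relu(std::vector<float> v) {
    for (float& x : v)
        x = x > 0.0f ? x : 0.0f;
    return v;
}

// Hypothetical binary element-wise op whose second input is a scalar.
std::vector<float> add_scalar(std::vector<float> v, float s) {
    for (float& x : v)
        x += s;
    return v;
}
```

Because each output element depends only on the matching input element, `relu(transpose(x, r, c))` and `transpose(relu(x), r, c)` produce the same buffer, and the same holds for `add_scalar`. That is why reordering the two nodes preserves the result while exposing the element-wise op to fusion with the preceding op.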

src/common/transformations/src/transformations/common_optimizations/move_eltwise_up_data_movement.cpp

+106-3
@@ -4,13 +4,17 @@
 
 #include "transformations/common_optimizations/move_eltwise_up_data_movement.hpp"
 
+#include <algorithm>
 #include <memory>
-#include <numeric>
 #include <openvino/opsets/opset8.hpp>
 
 #include "itt.hpp"
 #include "openvino/core/rt_info.hpp"
 #include "openvino/core/type.hpp"
+#include "openvino/op/constant.hpp"
+#include "openvino/op/reshape.hpp"
+#include "openvino/op/squeeze.hpp"
+#include "openvino/op/unsqueeze.hpp"
 #include "openvino/pass/pattern/op/wrap_type.hpp"
 #include "transformations/utils/utils.hpp"
 
@@ -50,9 +54,9 @@ std::vector<ov::DiscreteTypeInfo> ov::pass::MoveEltwiseUpThroughDataMov::get_def
     };
 }
 
-ov::pass::MoveEltwiseUpThroughDataMov::MoveEltwiseUpThroughDataMov(
+ov::pass::MoveEltwiseUpThroughDataMovScalar::MoveEltwiseUpThroughDataMovScalar(
     std::vector<DiscreteTypeInfo> allowed_data_movement_ops) {
-    MATCHER_SCOPE(MoveEltwiseUpThroughDataMov);
+    MATCHER_SCOPE(MoveEltwiseUpThroughDataMovScalar);
     auto eltwise_pattern = ov::pass::pattern::wrap_type<ov::op::util::UnaryElementwiseArithmetic,
                                                         ov::op::util::BinaryElementwiseArithmetic,
                                                         ov::op::v0::FakeQuantize>(ov::pass::pattern::has_static_rank());
@@ -126,3 +130,102 @@ ov::pass::MoveEltwiseUpThroughDataMov::MoveEltwiseUpThroughDataMov(
     auto m = std::make_shared<ov::pass::pattern::Matcher>(eltwise_pattern, matcher_name);
     register_matcher(m, callback);
 }
+
+ov::pass::MoveEltwiseUpThroughDataMovPerChannel::MoveEltwiseUpThroughDataMovPerChannel() {
+    MATCHER_SCOPE(MoveEltwiseUpThroughDataMovPerChannel);
+
+    auto const_predicate = [](const ov::Output<ov::Node>& output) {
+        auto constant_op = std::dynamic_pointer_cast<ov::opset8::Constant>(output.get_node_shared_ptr());
+        if (!constant_op)
+            return false;
+
+        if (output.get_target_inputs().size() != 1)
+            return false;
+
+        const auto& shape = constant_op->get_shape();
+        return std::count_if(shape.begin(), shape.end(), [](size_t v) {
+                   return v > 1;
+               }) == 1;
+    };
+
+    auto eltw_predicate = [](const ov::Output<ov::Node>& output) {
+        if (output.get_target_inputs().size() != 1)
+            return false;
+
+        auto node = output.get_node();
+
+        if (node->get_output_partial_shape(0).rank().is_dynamic())
+            return false;
+
+        const size_t const_idx = ov::is_type<ov::op::v0::Constant>(node->get_input_node_ptr(0)) ? 0 : 1;
+        const size_t data_flow_idx = (const_idx + 1) % 2;
+
+        if (node->get_input_partial_shape(data_flow_idx).size() < node->get_input_partial_shape(const_idx).size())
+            return false;
+
+        return true;
+    };
+
+    auto eltw_data_flow_in =
+        ov::pass::pattern::wrap_type<ov::op::v1::Reshape, ov::op::v0::Squeeze, ov::op::v0::Unsqueeze>();
+    auto eltw_const_in = ov::pass::pattern::wrap_type<ov::op::v0::Constant>(const_predicate);
+    auto eltwise_pattern =
+        ov::pass::pattern::wrap_type<ov::op::util::BinaryElementwiseArithmetic>({eltw_data_flow_in, eltw_const_in},
+                                                                                eltw_predicate);
+
+    ov::matcher_pass_callback callback = [OV_CAPTURE_CPY_AND_THIS](ov::pass::pattern::Matcher& m) {
+        const auto& pattern_map = m.get_pattern_value_map();
+
+        auto eltwise = pattern_map.at(eltwise_pattern).get_node_shared_ptr();
+        if (transformation_callback(eltwise)) {
+            return false;
+        }
+
+        const size_t const_idx = ov::is_type<ov::op::v0::Constant>(eltwise->get_input_node_ptr(0)) ? 0 : 1;
+        const size_t data_flow_idx = (const_idx + 1) % 2;
+
+        auto const_shape = eltwise->get_input_shape(const_idx);
+        size_t channel_idx = 0;
+        size_t channel_val = 0;
+        for (size_t i = 0; i < const_shape.size(); i++) {
+            if (const_shape[i] > 1) {
+                channel_idx = i;
+                channel_val = const_shape[i];
+            }
+        }
+
+        auto parent = eltwise->get_input_node_shared_ptr(data_flow_idx);
+        const auto& parent_in_pshape = parent->get_input_partial_shape(0);
+        auto parent_in_channel_dim =
+            parent_in_pshape.size() <= channel_idx ? ov::Dimension(1) : parent_in_pshape[channel_idx];
+        auto parent_out_channel_dim = parent->get_output_partial_shape(0)[channel_idx];
+        if (parent_in_channel_dim.is_dynamic() || parent_in_channel_dim != channel_val ||
+            parent_out_channel_dim.is_dynamic() || parent_out_channel_dim != channel_val)
+            return false;
+
+        auto new_shape = ov::Shape(parent->get_input_partial_shape(0).size(), 1);
+
+        new_shape[channel_idx] = const_shape[channel_idx];
+        auto old_const = std::dynamic_pointer_cast<ov::op::v0::Constant>(eltwise->get_input_node_shared_ptr(const_idx));
+        auto new_const = std::make_shared<ov::op::v0::Constant>(*old_const, new_shape);
+        ov::replace_node_update_name(old_const, new_const);
+        ov::replace_output_update_name(eltwise->output(0), eltwise->input_value(data_flow_idx));
+
+        ov::OutputVector eltwise_inputs = eltwise->input_values();
+        eltwise_inputs[data_flow_idx] = parent->input_value(0);
+        auto new_eltwise = eltwise->clone_with_new_inputs(eltwise_inputs);
+        ov::copy_runtime_info(eltwise, new_eltwise);
+
+        ov::OutputVector parent_inputs = parent->input_values();
+        parent_inputs[0] = new_eltwise;
+        auto new_parent = parent->clone_with_new_inputs(parent_inputs);
+        ov::copy_runtime_info(parent, new_parent);
+        new_parent->set_friendly_name(parent->get_friendly_name());
+
+        ov::replace_node(parent, new_parent);
+        return true;
+    };
+
+    auto m = std::make_shared<ov::pass::pattern::Matcher>(eltwise_pattern, matcher_name);
+    register_matcher(m, callback);
+}
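
The shape logic in the per-channel matcher can be followed in isolation. The sketch below mirrors only the shape arithmetic from `const_predicate` and the callback, using `std::vector<std::size_t>` in place of `ov::Shape`. The names `is_per_channel`, `channel_index` and `moved_const_shape` are hypothetical and do not appear in the commit; the dynamic-dimension and channel-equality checks are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Mirrors const_predicate: true when exactly one dimension is larger than 1,
// e.g. {1, 3, 1, 1}.
bool is_per_channel(const Shape& shape) {
    return std::count_if(shape.begin(), shape.end(), [](std::size_t v) {
               return v > 1;
           }) == 1;
}

// Mirrors the loop in the callback: index of the last dimension larger than 1,
// or 0 when there is none.
std::size_t channel_index(const Shape& shape) {
    std::size_t idx = 0;
    for (std::size_t i = 0; i < shape.size(); ++i) {
        if (shape[i] > 1)
            idx = i;
    }
    return idx;
}

// Shape of the re-created constant once the eltwise sits above the Reshape:
// all ones at the rank of the Reshape input, except the channel dimension.
// Assumes a per-channel constant and target_rank greater than the channel index;
// .at() throws std::out_of_range if the latter does not hold.
Shape moved_const_shape(const Shape& const_shape, std::size_t target_rank) {
    assert(is_per_channel(const_shape));
    Shape result(target_rank, 1);
    const std::size_t idx = channel_index(const_shape);
    result.at(idx) = const_shape[idx];
    return result;
}
```

For example, a constant of shape `{1, 3, 1}` that followed a Reshape from a rank-4 input becomes `{1, 3, 1, 1}` after the move, so it still broadcasts along the same channel axis of the Reshape input.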
