Commit abfbdd1

rkazants, ilya-lavrenov, and praasz authored
[Core] Support String Tensors (openvinotoolkit#21244)
* [Core] Support String Tensors
* Add String Constant implementation
* Fix build issue in tests
* Add cast_vector for Constant of ov::string type
* Fix build issue
* Fix build issue: ambiguous type in GNA
* Fix ambiguous build issue in GNA tests
* Fix code-style
* Fix code-style
* Fix ambiguous build issue in GNA tests
* Fix ambiguous build issue in TF FE tests
* Update openvino.style for naming convention check
* Fix compilation error in core unit tests - need typename
* Add test for new element_type
* Fix code-style
* Update src/inference/src/dev/make_tensor.cpp
* Add support of string Tensors for Constant
* Fix copying string tensor value for Constant
* Complete template methods for Constant
* Improve performance for initialization and destruction of string Tensor for set_shape
* Add check for string value in test
* Remove unused variable
* Update src/inference/src/dev/make_tensor.cpp
* Fix copy_to for ITensor of string type and add tests
* Add tests for Constant of string type and serialization
* Use memset_allocation to switch initialization
* Add additional documentation for host_ptr
* Update src/core/src/op/constant.cpp
* Use OPENVINO_THROW
* Update src/core/include/openvino/op/constant.hpp
* Update src/core/include/openvino/op/constant.hpp
* Apply code-review feedback: use string_size
* Apply code-review feedback
* Recover evaluate impl for non-string type
* Fix code for creating of string constant for legacy non HostTensor tensor
* Fix build issue
* Apply code-review feedback: simplify copy_to method
* Fix build issue
* Use StringAlignedBuffer to store string Constant values
* Remove not needed methods in StringAlignedBuffer
* Refactor set_shape method

---------

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
Co-authored-by: Pawel Raasz <pawel.raasz@intel.com>
1 parent 7bb542f commit abfbdd1

40 files changed: +966 -115 lines changed

cmake/developer_package/ncc_naming_style/openvino.style

+1 -1
@@ -18,7 +18,7 @@ VariableReference: '^\w+$'
 
 EnumName: '^[A-Z][\w]+$'
 # excepts element_type
-EnumConstantName: '^([A-Z\d_]+|undefined|dynamic|boolean|bf16|f16|f32|f64|i4|i8|i16|i32|i64|u1|u4|u8|u16|u32|u64|nf4|asymmetric|align_corners|round_prefer_floor|round_prefer_ceil|floor|ceil|simple|nearest|linear|linear_onnx|cubic|area|scales|sizes|half_pixel|tf_half_pixel_for_nn|pytorch_half_pixel|asymetric)$'
+EnumConstantName: '^([A-Z\d_]+|undefined|dynamic|boolean|bf16|f16|f32|f64|i4|i8|i16|i32|i64|u1|u4|u8|u16|u32|u64|nf4|string|asymmetric|align_corners|round_prefer_floor|round_prefer_ceil|floor|ceil|simple|nearest|linear|linear_onnx|cubic|area|scales|sizes|half_pixel|tf_half_pixel_for_nn|pytorch_half_pixel|asymetric)$'
 # TODO: align
 UsingDeclaration: '^.*$'
 TypedefName: '^.*$'
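The EnumConstantName rule is an anchored alternation, so a lowercase enumerator passes the naming check only when it is listed explicitly. A minimal sketch of that effect with `std::regex`, assuming an abbreviated copy of the pattern; the helper name `is_valid_enum_constant` and the shortened alternation list are illustrative and not part of the naming-check tooling:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Abbreviated copy of the EnumConstantName rule (the real rule lists more names).
// `\d` is written as `0-9` here; both match an ASCII digit.
// Returns true when `name` passes the rule, with or without the "string" alternative.
bool is_valid_enum_constant(const std::string& name, bool with_string) {
    const std::string before = "^([A-Z0-9_]+|undefined|dynamic|boolean|f32|i64|u64|nf4|asymmetric)$";
    const std::string after = "^([A-Z0-9_]+|undefined|dynamic|boolean|f32|i64|u64|nf4|string|asymmetric)$";
    return std::regex_match(name, std::regex(with_string ? after : before));
}
```

With the old pattern the enumerator `string` is rejected; adding it to the alternation is what lets `Type_t::string` pass the check.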

src/common/transformations/src/transformations/common_optimizations/gru_cell_fusion.cpp

+3 -3
@@ -172,8 +172,8 @@ ov::pass::GRUCellFusion::GRUCellFusion() {
 
         auto squeeze_B = rg.make<ov::op::v0::Squeeze>(Bzrh, axis_0);
 
-        string act_name_1 = pattern_map.at(activation_1)->get_type_name();
-        string act_name_2 = pattern_map.at(activation_2)->get_type_name();
+        std::string act_name_1 = pattern_map.at(activation_1)->get_type_name();
+        std::string act_name_2 = pattern_map.at(activation_2)->get_type_name();
         auto to_lower = [](unsigned char c) {
             return std::tolower(c);
         };
@@ -186,7 +186,7 @@ ov::pass::GRUCellFusion::GRUCellFusion() {
             Rzrh,
             squeeze_B,
             hidden_size,
-            vector<string>{act_name_1, act_name_2});
+            vector<std::string>{act_name_1, act_name_2});
 
         cell->set_friendly_name(m.get_match_root()->get_friendly_name());
         copy_runtime_info(m.get_matched_nodes(), rg.get());
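The commit message mentions "ambiguous" build failures in other components; the likely cause of this `string` to `std::string` change is the same kind of name clash, although the commit does not say so for this file. A minimal sketch of the clash, assuming a hypothetical `demo::element` namespace standing in for `ov::element`:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for ov::element: a namespace that declares a
// constant named `string`, as the new element type does.
namespace demo {
namespace element {
struct Type {
    int id;
};
constexpr Type string{1};
}  // namespace element
}  // namespace demo

using namespace std;
using namespace demo::element;

// string act_name = "Sigmoid";  // would not compile: reference to 'string' is ambiguous

// Qualifying the type name sidesteps the ambiguity.
std::string make_activation_name() {
    std::string act_name = "Sigmoid";
    return act_name;
}
```

Once both using-directives are visible, unqualified `string` names two different entities, so only the qualified spelling compiles.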

src/common/transformations/tests/common_optimizations/gru_cell_fusion.cpp

+8 -8
@@ -21,7 +21,7 @@ namespace {
 
 enum class WeightsFormat { zr, rz };
 
-Output<Node> create_activation_by_name(const string& activation_name, const Output<Node>& input) {
+Output<Node> create_activation_by_name(const std::string& activation_name, const Output<Node>& input) {
     if (activation_name == "sigmoid") {
         return make_shared<Sigmoid>(input);
     } else if (activation_name == "tanh") {
@@ -33,8 +33,8 @@ Output<Node> create_activation_by_name(const string& activation_name, const Outp
 }
 
 shared_ptr<Model> gen_model(WeightsFormat format,
-                            const string& activation_1,
-                            const string& activation_2,
+                            const std::string& activation_1,
+                            const std::string& activation_2,
                             size_t batch,
                             size_t hidden_size,
                             size_t input_size,
@@ -83,8 +83,8 @@ shared_ptr<Model> gen_model(WeightsFormat format,
 }
 
 shared_ptr<Model> gen_reference(WeightsFormat format,
-                                const string& activation_1,
-                                const string& activation_2,
+                                const std::string& activation_1,
+                                const std::string& activation_2,
                                 size_t batch,
                                 size_t hidden_size,
                                 size_t input_size,
@@ -132,15 +132,15 @@ shared_ptr<Model> gen_reference(WeightsFormat format,
 
     auto squeeze_B = make_shared<Squeeze>(Bzrh, axis_0);
     auto cell =
-        make_shared<GRUCell>(X, H, Wzrh, Rzrh, squeeze_B, hidden_size, vector<string>{activation_1, activation_2});
+        make_shared<GRUCell>(X, H, Wzrh, Rzrh, squeeze_B, hidden_size, vector<std::string>{activation_1, activation_2});
     return make_shared<Model>(OutputVector{cell}, params);
 }
 }  // namespace
 
 struct GRUFusionParams {
     WeightsFormat format;
-    string activation_1;
-    string activation_2;
+    std::string activation_1;
+    std::string activation_2;
     size_t batch;
     size_t hidden_size;
     size_t input_size;

src/core/builder/include/ngraph/builder/make_constant.hpp

+3
@@ -102,6 +102,9 @@ std::shared_ptr<Node> make_constant(const element::Type& type, const Shape& shap
     case element::Type_t::nf4:
         unsupported_data_type = "nf4";
         break;
+    case element::Type_t::string:
+        unsupported_data_type = "string";
+        break;
     case element::Type_t::undefined:
         unsupported_data_type = "undefined";
         break;
@@ -0,0 +1,28 @@
+// Copyright (C) 2018-2023 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include "openvino/runtime/aligned_buffer.hpp"
+
+namespace ov {
+
+/// \brief StringAlignedBuffer class to store pointer to pre-allocated buffer with std::string objects
+/// it is responsible for deallocation of std::string objects that will be stored in the buffer
+class StringAlignedBuffer : public ov::AlignedBuffer {
+public:
+    StringAlignedBuffer() = default;
+    StringAlignedBuffer(size_t num_elements, size_t byte_size, size_t alignment, bool initialize);
+
+    virtual ~StringAlignedBuffer();
+
+private:
+    StringAlignedBuffer(const StringAlignedBuffer&) = delete;
+    StringAlignedBuffer& operator=(const StringAlignedBuffer&) = delete;
+
+protected:
+    size_t m_num_elements;
+};
+
+}  // namespace ov
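The header only declares the class, so the ownership idea from its doc comment can be illustrated separately. The following is a rough sketch, not the OpenVINO implementation: `StringBufferSketch` is a hypothetical class that allocates raw storage, constructs a `std::string` in every slot, and runs the destructors before freeing the storage. Unlike the declared constructor, it has no `byte_size`, `alignment`, or `initialize` parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical sketch, not the OpenVINO class: owns raw storage for
// `num_elements` std::string objects and destroys them on destruction.
class StringBufferSketch {
public:
    explicit StringBufferSketch(std::size_t num_elements)
        : m_data(static_cast<std::string*>(::operator new(num_elements * sizeof(std::string)))),
          m_num_elements(num_elements) {
        // Construct an empty string in every slot so later assignment is valid.
        std::uninitialized_fill_n(m_data, m_num_elements, std::string());
    }

    ~StringBufferSketch() {
        // Run each element's destructor before releasing the raw storage.
        for (std::size_t i = 0; i < m_num_elements; ++i) {
            m_data[i].~basic_string();
        }
        ::operator delete(m_data);
    }

    StringBufferSketch(const StringBufferSketch&) = delete;
    StringBufferSketch& operator=(const StringBufferSketch&) = delete;

    std::string* data() { return m_data; }
    std::size_t size() const { return m_num_elements; }

private:
    std::string* m_data;
    std::size_t m_num_elements;
};
```

Copying is deleted for the same reason as in the declared class: two owners of the same storage would destroy the strings twice.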

src/core/include/ngraph/type/element_type.hpp

+1
@@ -41,6 +41,7 @@ using ov::element::i4;
 using ov::element::i64;
 using ov::element::i8;
 using ov::element::nf4;
+using ov::element::string;
 using ov::element::u1;
 using ov::element::u16;
 using ov::element::u32;

src/core/include/openvino/core/type/element_type.hpp

+7 -1
@@ -51,7 +51,8 @@ enum class Type_t {
     u16,  //!< u16 element type
     u32,  //!< u32 element type
     u64,  //!< u64 element type
-    nf4  //!< nf4 element type
+    nf4,  //!< nf4 element type
+    string  //!< string element type
 };
 
 /// \brief Base class to define element type
@@ -181,6 +182,9 @@ constexpr Type u64(Type_t::u64);
 /// \brief nf4 element type
 /// \ingroup ov_element_cpp_api
 constexpr Type nf4(Type_t::nf4);
+/// \brief string element type
+/// \ingroup ov_element_cpp_api
+constexpr Type string(Type_t::string);
 
 template <typename T>
 Type from() {
@@ -214,6 +218,8 @@ template <>
 OPENVINO_API Type from<ov::bfloat16>();
 template <>
 OPENVINO_API Type from<ov::float16>();
+template <>
+OPENVINO_API Type from<std::string>();
 
 OPENVINO_API Type fundamental_type_for(const Type& type);
 
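The new `from<std::string>()` declaration follows the existing pattern of mapping a C++ type to an element type through explicit template specialization. A miniature of that pattern is sketched below under stated assumptions: the `sketch` namespace, the reduced enum, and the fallback to `undefined` in the primary template are all hypothetical and do not reflect how the OpenVINO primary template behaves.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the type-to-enum mapping; names mirror the diff
// but nothing here is OpenVINO code.
namespace sketch {
enum class Type_t { undefined, f32, i64, string };

// Primary template: in this sketch, unmapped types fall back to `undefined`.
template <typename T>
Type_t from() {
    return Type_t::undefined;
}

// One explicit specialization per supported C++ type.
template <>
inline Type_t from<float>() {
    return Type_t::f32;
}

template <>
inline Type_t from<std::string>() {
    return Type_t::string;
}
}  // namespace sketch
```

Generic code can then call `sketch::from<T>()` without knowing in advance whether `T` is numeric or a string.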

src/core/include/openvino/core/type/element_type_traits.hpp

+5
@@ -97,4 +97,9 @@ template <>
 struct element_type_traits<element::Type_t::nf4> {
     using value_type = int8_t;
 };
+
+template <>
+struct element_type_traits<element::Type_t::string> {
+    using value_type = std::string;
+};
 }  // namespace ov
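The traits specialization is the reverse mapping: from an enum value to the C++ storage type. A small sketch of how such a trait can be consumed, assuming a hypothetical `sketch` namespace and a hypothetical `make_storage` helper that are not part of OpenVINO:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {
enum class Type_t { i64, string };

// Primary template left empty: only specialized element types have a value_type.
template <Type_t>
struct element_type_traits {};

template <>
struct element_type_traits<Type_t::i64> {
    using value_type = int64_t;
};

template <>
struct element_type_traits<Type_t::string> {
    using value_type = std::string;
};

// Generic code can name the storage type from the enum value alone.
template <Type_t ET>
std::vector<typename element_type_traits<ET>::value_type> make_storage(std::size_t n) {
    return std::vector<typename element_type_traits<ET>::value_type>(n);
}

static_assert(std::is_same<element_type_traits<Type_t::string>::value_type, std::string>::value,
              "string elements are stored as std::string");
}  // namespace sketch
```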

src/core/include/openvino/op/constant.hpp

+84 -3
@@ -161,6 +161,9 @@ class OPENVINO_API Constant : public Op {
         case Type_t::nf4:
             fill_data<Type_t::nf4>(value);
             break;
+        case Type_t::string:
+            fill_data<Type_t::string>(value);
+            break;
         case Type_t::undefined:
         case Type_t::dynamic:
             OPENVINO_THROW("unsupported type");
@@ -364,6 +367,9 @@ class OPENVINO_API Constant : public Op {
         case Type_t::u64:
             cast_vector<Type_t::u64>(rc, num_elements_to_cast);
             break;
+        case Type_t::string:
+            cast_vector<Type_t::string>(rc, num_elements_to_cast);
+            break;
         default:
             OPENVINO_THROW("unsupported type");
         }
@@ -454,7 +460,7 @@ class OPENVINO_API Constant : public Op {
     template <element::Type_t Type,
               typename OUT_T,
               typename std::enable_if<Type != element::Type_t::u1 && Type != element::Type_t::u4 &&
-                                          Type != element::Type_t::i4,
+                                          Type != element::Type_t::i4 && Type != element::Type_t::string,
                                       bool>::type = true>
     void cast_vector(std::vector<OUT_T>& output_vector, size_t num_elements) const {
         // this function is workaround for waring during windows building
@@ -511,6 +517,29 @@ class OPENVINO_API Constant : public Op {
         });
     }
 
+    template <element::Type_t Type, typename std::enable_if<Type == element::Type_t::string, bool>::type = true>
+    void cast_vector(std::vector<std::string>& output_vector, size_t num_elements) const {
+        auto output_size = std::min(num_elements, shape_size(m_shape));
+        output_vector.reserve(output_size);
+        const auto p = get_data_ptr<Type>();
+        std::copy_n(p, output_size, std::back_inserter(output_vector));
+    }
+
+    template <element::Type_t Type, typename std::enable_if<Type != element::Type_t::string, bool>::type = true>
+    void cast_vector(std::vector<std::string>& output_vector, size_t num_elements) const {
+        OPENVINO_THROW("cast_vector does not support casting ov::Tensor of type " +
+                       ov::element::Type(Type).to_string() + "to std::vector of std::string elements");
+    }
+
+    template <element::Type_t Type,
+              typename OUT_T,
+              typename std::enable_if<Type == element::Type_t::string, bool>::type = true>
+    void cast_vector(std::vector<OUT_T>& output_vector, size_t num_elements) const {
+        auto output_type = std::string(typeid(OUT_T{}).name());
+        OPENVINO_THROW("cast_vector does not support casting string ov::Tensor to std::vector with elements of type " +
+                       output_type);
+    }
+
     template <element::Type_t Type,
               typename OUT_T,
               typename std::enable_if<Type == element::Type_t::u1, bool>::type = true>
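The new `cast_vector` overloads use `std::enable_if` on the element-type template parameter, so every type/output combination compiles and the unsupported ones fail at run time. A reduced sketch of that dispatch, assuming a hypothetical `StringConstant` holder and `std::runtime_error` in place of `OPENVINO_THROW`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {
enum class Type_t { i32, string };

// Hypothetical holder with string storage only; stands in for Constant.
struct StringConstant {
    std::vector<std::string> values;

    // Selected when the element type is string: copy at most num_elements values.
    template <Type_t ET, typename std::enable_if<ET == Type_t::string, bool>::type = true>
    void cast_vector(std::vector<std::string>& out, std::size_t num_elements) const {
        const std::size_t n = std::min(num_elements, values.size());
        out.reserve(n);
        std::copy_n(values.begin(), n, std::back_inserter(out));
    }

    // Selected for every other element type: compiles, but fails at run time.
    template <Type_t ET, typename std::enable_if<ET != Type_t::string, bool>::type = true>
    void cast_vector(std::vector<std::string>&, std::size_t) const {
        throw std::runtime_error("cannot cast a non-string constant to std::vector<std::string>");
    }
};
}  // namespace sketch
```

Because a `switch` over the runtime element type instantiates every case, the throwing overloads are what keep those instantiations well-formed.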
@@ -569,11 +598,19 @@ class OPENVINO_API Constant : public Op {
         output.resize(element_number);
     }
 
+    template <element::Type_t Type,
+              typename StorageDataType = fundamental_type_for<Type>,
+              typename std::enable_if<Type != element::Type_t::string, bool>::type = true>
+    void fill_data(const std::string& value) {
+        OPENVINO_THROW("Called fill_data(std::string) with non-string element_type");
+    }
+
     template <element::Type_t Type,
               typename T,
               typename StorageDataType = fundamental_type_for<Type>,
               typename std::enable_if<Type != element::Type_t::u1 && Type != element::Type_t::u4 &&
-                                          Type != element::Type_t::i4 && Type != element::Type_t::nf4,
+                                          Type != element::Type_t::i4 && Type != element::Type_t::nf4 &&
+                                          Type != element::Type_t::string,
                                       bool>::type = true>
     void fill_data(const T& value) {
 #ifdef __clang__
@@ -614,6 +651,21 @@ class OPENVINO_API Constant : public Op {
         std::fill_n(get_data_ptr_nc<Type>(), size, v);
     }
 
+    template <element::Type_t Type, typename std::enable_if<Type == element::Type_t::string, bool>::type = true>
+    void fill_data(const std::string& value) {
+        auto num_elements = shape_size(m_shape);
+        std::uninitialized_fill_n(get_data_ptr_nc<Type>(), num_elements, value);
+    }
+
+    template <element::Type_t Type,
+              typename T,
+              typename StorageDataType = fundamental_type_for<Type>,
+              typename std::enable_if<Type == element::Type_t::string, bool>::type = true>
+    void fill_data(const T& value) {
+        std::string type_name(typeid(value).name());
+        OPENVINO_THROW("fill_data does not support to fill ov::Tensor of string type with value of " + type_name);
+    }
+
     template <element::Type_t Type,
               typename T,
               typename StorageDataType = fundamental_type_for<Type>,
@@ -658,7 +710,8 @@ class OPENVINO_API Constant : public Op {
               typename T,
               typename StorageDataType = fundamental_type_for<Type>,
               typename std::enable_if<Type != element::Type_t::nf4 && Type != element::Type_t::u1 &&
-                                          Type != element::Type_t::u4 && Type != element::Type_t::i4,
+                                          Type != element::Type_t::u4 && Type != element::Type_t::i4 &&
+                                          Type != element::Type_t::string,
                                       bool>::type = true>
     void write_buffer(const std::vector<T>& source) {
         auto p = get_data_ptr_nc<Type>();
@@ -667,6 +720,31 @@ class OPENVINO_API Constant : public Op {
         }
     }
 
+    template <element::Type_t Type, typename std::enable_if<Type == element::Type_t::string, bool>::type = true>
+    void write_buffer(const std::vector<std::string>& source) {
+        // elements of string ov::Tensor is already pre-initialized in allocate_buffer
+        auto p = get_data_ptr_nc<Type>();
+        auto num_elements = std::min(shape_size(m_shape), source.size());
+        std::uninitialized_copy_n(source.begin(), num_elements, p);
+    }
+
+    template <element::Type_t Type, typename std::enable_if<Type != element::Type_t::string, bool>::type = true>
+    void write_buffer(const std::vector<std::string>& source) {
+        OPENVINO_THROW("write_buffer does not support writing std::string elements into ov::Tensor of type:" +
+                       ov::element::Type(Type).to_string());
+    }
+
+    template <element::Type_t Type,
+              typename T,
+              typename std::enable_if<Type == element::Type_t::string, bool>::type = true>
+    void write_buffer(const std::vector<T>& source) {
+        if (source.size() > 0) {
+            auto source_type = std::string(typeid(source[0]).name());
+            OPENVINO_THROW("write_buffer does not support writing elements of type " + source_type +
+                           " into string ov::Tensor");
+        }
+    }
+
     template <element::Type_t Type,
               typename T,
               typename StorageDataType = fundamental_type_for<Type>,
@@ -801,6 +879,9 @@ class OPENVINO_API Constant : public Op {
         case Type_t::nf4:
             write_buffer<Type_t::nf4>(source);
             break;
+        case Type_t::string:
+            write_buffer<Type_t::string>(source);
+            break;
         case element::Type_t::undefined:
         case element::Type_t::dynamic:
             OPENVINO_THROW("unsupported type");
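The string overloads of `fill_data` and `write_buffer` in this file call `std::uninitialized_fill_n` and `std::uninitialized_copy_n`, which construct objects in place instead of assigning to existing ones. The distinction matters for `std::string` because assignment requires a live object. A standalone sketch of the construct-in-place step on raw storage; the helper `fill_raw_storage` is hypothetical and says nothing about how `Constant` manages its buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <vector>

// Hypothetical helper: construct `count` copies of `value` in raw storage,
// hand back a copy of the contents, then destroy the elements.
std::vector<std::string> fill_raw_storage(std::size_t count, const std::string& value) {
    void* raw = ::operator new(count * sizeof(std::string));
    std::string* first = static_cast<std::string*>(raw);

    // The storage holds no std::string objects yet, so std::fill_n (which
    // assigns) would be undefined behaviour; uninitialized_fill_n constructs.
    std::uninitialized_fill_n(first, count, value);

    std::vector<std::string> result(first, first + count);

    // Destroy every constructed string before freeing the raw storage.
    for (std::size_t i = 0; i < count; ++i) {
        first[i].~basic_string();
    }
    ::operator delete(raw);
    return result;
}
```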

src/core/include/openvino/runtime/tensor.hpp

+2 -2
@@ -113,7 +113,7 @@ class OPENVINO_API Tensor {
      * @note Does not perform memory allocation internally
      * @param type Tensor element type
      * @param shape Tensor shape
-     * @param host_ptr Pointer to pre-allocated host memory
+     * @param host_ptr Pointer to pre-allocated host memory with initialized objects
      * @param strides Optional strides parameters in bytes. Strides are supposed to be computed automatically based
      * on shape and element size
      */
@@ -130,7 +130,7 @@ class OPENVINO_API Tensor {
      * @brief Constructs Tensor using port from node. Wraps allocated host memory.
      * @note Does not perform memory allocation internally
      * @param port port from node
-     * @param host_ptr Pointer to pre-allocated host memory
+     * @param host_ptr Pointer to pre-allocated host memory with initialized objects
      * @param strides Optional strides parameters in bytes. Strides are supposed to be computed automatically based
      * on shape and element size
      */
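The reworded `host_ptr` documentation puts the burden on the caller: memory wrapped by a tensor must already contain constructed objects. For string data, one way to satisfy this is to let a `std::vector<std::string>` own the elements. A sketch using a hypothetical non-owning view in place of `ov::Tensor`, so that it stays self-contained:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical non-owning view, standing in for a tensor that wraps host memory.
struct StringTensorView {
    std::string* host_ptr;
    std::size_t size;

    // Returns a reference to a live element; valid only because the caller
    // supplied memory that already holds constructed std::string objects.
    std::string& at(std::size_t i) {
        return host_ptr[i];
    }
};

// Writes `value` through the view and returns what the owning vector now holds.
std::string write_through_view(std::vector<std::string>& host, std::size_t index, const std::string& value) {
    StringTensorView view{host.data(), host.size()};
    view.at(index) = value;
    return host[index];
}
```

The vector keeps ownership and destroys the strings; the view, like a tensor wrapping `host_ptr`, only reads and assigns.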
