@@ -33,11 +33,56 @@ one, the environment variable can be set to `cuda:*`.

# Supported Primitives

+ General limitations:
+
+ * Blocked formats are currently not supported by any implementation unless
+   explicitly listed
+ * The implementations support at most 5 post-ops
+ * The maximum supported size of any dimension of any input/output tensor of a
+   primitive is `INT32_MAX`
+
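The general limitations above can be expressed as a small pre-flight check. The following is only a minimal sketch under stated assumptions: `within_general_limits`, `kMaxPostOps` and `kMaxDim` are hypothetical names introduced for illustration and are not part of the library API. It encodes just the two numeric limits listed above.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical constants that restate the documented limits.
constexpr std::size_t kMaxPostOps = 5;
constexpr std::int64_t kMaxDim = std::numeric_limits<std::int32_t>::max();

// Returns true when the post-op count is at most 5 and no tensor dimension
// exceeds INT32_MAX. Illustrative helper only, not a library function.
bool within_general_limits(const std::vector<std::int64_t> &dims,
        std::size_t post_op_count) {
    if (post_op_count > kMaxPostOps) return false;
    for (std::int64_t d : dims) {
        if (d > kMaxDim) return false;
    }
    return true;
}
```

A caller could run such a check before creating a primitive and fall back to another implementation when it fails.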
## Batch Normalization

The implementation supports both forward and backward directions.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`
+ * Supported data types
+   * Forward direction: `f32`, `bf16`, `f16`, `s8`
+   * Backward direction: `f32`, `bf16`, `f16`
+
+ ## Binary
+
+ * Supported formats: plain formats, `Ab32a`, `aBc32b`
+ * Supported data types: `f32`, `bf16`, `f16`, `s8`, `u8`, `s32`
+
+ ## Convolution
+
+ The implementation supports forward, backward data and backward weights
+ directions.
+
+ * Supported input/output formats: plain formats
+ * Supported weights formats: `goiw`, `goihw`, `goidhw`, `oiw`, `oihw`, `oidhw`
+ * Supported data types: `f32`, `bf16`, `f16`, `s32`, `s8`, `u8`
+ * Limitations
+   * Some very large problem sizes currently return `unimplemented` due to an
+     issue with long execution times
+
+ ## Concat
+
+ * Supported formats: plain formats
+ * Supported data types: `f32`, `bf16`, `f16`, `s8`, `s32`
+
+ ## Deconvolution
+
+ The implementation supports forward, backward data and backward weights
+ directions.
+
+ * Supported input/output formats: plain formats
+ * Supported weights formats: `goiw`, `goihw`, `goidhw`, `oiw`, `oihw`, `oidhw`
+ * Supported data types: `f32`, `bf16`, `f16`, `s32`, `s8`, `u8`
+ * Limitations
+   * Some very large problem sizes currently return `unimplemented` due to an
+     issue with long execution times

## Eltwise

@@ -47,50 +92,77 @@ The implementation supports both forward and backward directions.
`gelu_tanh`, `hardsigmoid`, `hardswish`, `linear`, `log`, `logistic`, `mish`,
`pow`, `relu`, `round`, `soft_relu`, `sqrt`, `square`, `swish` and `tanh`
* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`, `N`
+ * Supported data types: `f32`, `bf16`, `f16`, `s32`, `s8`, `u8`
+
+ ## Layer Normalization
+
+ The implementation supports both forward and backward directions.
+
+ * Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`
+ * Supported input/output data types for forward direction: `f32`, `bf16`, `f16`,
+   `s8`, `u8`
+ * Supported input/output data types for backward direction: `f32`, `bf16`
+ * Supported scale/shift data types: `f32`, `bf16`, `f16`

## LRN

The implementation supports both forward and backward directions.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`
+ * Supported data types: `f32`, `bf16`, `f16`
+
+ ## Matmul
+
+ * Supported formats: plain formats
+ * Supported input/output data types: `f32`, `bf16`, `f16`, `s8`, `u8`, `s32`
+ * Limitations
+   * Runtime dimensions are not supported
+   * PReLU post-op is not supported
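
The two Matmul limitations can be illustrated with a small guard function. This is a hedged sketch, not library code: `matmul_request_supported`, `PostOpKind` and `kRuntimeDim` are hypothetical names. The sentinel value is an assumption intended to mirror oneDNN's `DNNL_RUNTIME_DIM_VAL` and is defined locally so the example compiles without the oneDNN headers.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: stands in for oneDNN's runtime-dimension sentinel
// (DNNL_RUNTIME_DIM_VAL); defined locally for a self-contained sketch.
constexpr std::int64_t kRuntimeDim = std::numeric_limits<std::int64_t>::min();

// Hypothetical post-op tags used only for this illustration.
enum class PostOpKind { eltwise, binary, sum, prelu };

// Returns false if any dimension is a runtime placeholder or if a PReLU
// post-op is requested, matching the two limitations listed above.
bool matmul_request_supported(const std::vector<std::int64_t> &dims,
        const std::vector<PostOpKind> &post_ops) {
    for (std::int64_t d : dims) {
        if (d == kRuntimeDim) return false;
    }
    for (PostOpKind kind : post_ops) {
        if (kind == PostOpKind::prelu) return false;
    }
    return true;
}
```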

## Pooling

The implementation supports both forward and backward directions.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`
+ * Supported data types for forward direction: `f32`, `bf16`, `f16`, `s8`, `u8`
+ * Supported data types for backward direction: `f32`, `bf16`, `f16`

## PReLU

The implementation supports both forward and backward propagations.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`
-
- * Forward pass supports `f32`, `f16`, `bf16`, `s8` and `u8` data types
- * Backward pass supports `f32` and `bf16` data types
+ * Supported data types: `f32`, `f16`, `bf16`, `s8` and `u8`

## Reorder

- * Format support limitations: blocked formats are not supported
+ * Supported formats: plain formats
* Supported data types: `f32`, `bf16`, `f16`, `s8`, `u8`

## Resampling

The implementation supports both forward and backward directions.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`
+ * Supported data types: `f32`, `bf16`, `f16`, `s32`, `s8`, `u8`

## Softmax/LogSoftmax

The implementation supports both forward and backward directions.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`
+ * Supported data types for forward direction: `f32`, `bf16`, `f16`, `s8`, `u8`
+ * Supported data types for backward direction: `f32`, `bf16`, `f16`

## Shuffle

The implementation supports both forward and backward propagations.

* Supported formats: `NCDHW`, `NDHWC`, `NCHW`, `NHWC`, `NCW`, `NWC`, `NC`
-
* Forward pass supports `f32`, `f16`, `bf16` and `s8` data types.
* Backward pass supports `f32` and `bf16` data types.
+
+ ## Sum
+
+ * Supported formats: plain formats with up to 7 dimensions
+ * Supported data types: `f32`, `bf16`, `f16`, `s8`, `u8`