Commit 8f0d89d

[GPU] Fixed fc tile size for better perf for small N size (4096) kernels (openvinotoolkit#25759)
### Details:
- [GPU] Fixed fc tile size for better perf for small N size (4096) kernels in MTL
- To get 3-5% gain on 2nd token latency on MTL

![image](https://github.com/user-attachments/assets/9da816ab-b104-4dcb-b8d2-e816068589fb)

### Tickets:
- *ticket-id*
1 parent 2ff7bfc commit 8f0d89d

1 file changed: +1 -1 lines changed

src/plugins/intel_gpu/src/kernel_selector/kernels/fully_connected/fully_connected_kernel_bf_tiled.cpp

@@ -310,7 +310,7 @@ FullyConnected_bf_tiled::GetAutoTuneParams(const fully_connected_params& params,
         if (!params.is_shape_agnostic && batch == 1) {
             // Tuning for Meteor Lake
             size_t min_num_threads = params.engineInfo.computeUnitsCount * simd;
-            if (output_f / 2 < min_num_threads && params.weights.GetLayout() == WeightsLayout::os_is_yx_osv32_isv2) {
+            if (output_f / 2 <= min_num_threads && params.weights.GetLayout() == WeightsLayout::os_is_yx_osv32_isv2) {
                 GPU_DEBUG_TRACE_DETAIL << "FC bf tiled: Set ofm_tile 1. (output_f : " << output_f
                                        << ", computeUnitsCount : " << params.engineInfo.computeUnitsCount
                                        << " min_num_threads : " << min_num_threads << ")" << std::endl;
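
The one-character change only matters at the exact boundary `output_f / 2 == min_num_threads`. A minimal sketch of that boundary case follows. It assumes `computeUnitsCount = 128` and `simd = 16`, figures the commit does not state, chosen so that `min_num_threads` is 2048 and N = 4096 lands on the boundary. The function and constant names are hypothetical stand-ins, not OpenVINO code, and the weights-layout half of the condition is left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the thread-count half of the condition in the diff.
// The weights-layout check is omitted; these are not OpenVINO functions.
constexpr bool picks_ofm_tile_1_before(std::size_t output_f, std::size_t min_num_threads) {
    return output_f / 2 < min_num_threads;   // strict comparison (old)
}

constexpr bool picks_ofm_tile_1_after(std::size_t output_f, std::size_t min_num_threads) {
    return output_f / 2 <= min_num_threads;  // inclusive comparison (new)
}

// Assumed device figures, for illustration only (not stated in the commit).
constexpr std::size_t kAssumedComputeUnits = 128;
constexpr std::size_t kAssumedSimd = 16;
constexpr std::size_t kMinNumThreads = kAssumedComputeUnits * kAssumedSimd;  // 2048

void check_boundary_case() {
    // N = 4096: output_f / 2 == 2048 == kMinNumThreads, the exact boundary.
    assert(!picks_ofm_tile_1_before(4096, kMinNumThreads));
    assert(picks_ofm_tile_1_after(4096, kMinNumThreads));
    // Below the boundary both variants agree.
    assert(picks_ofm_tile_1_before(2048, kMinNumThreads));
    assert(picks_ofm_tile_1_after(2048, kMinNumThreads));
    // Above the boundary neither variant selects ofm_tile 1.
    assert(!picks_ofm_tile_1_after(8192, kMinNumThreads));
}
```

Under these assumed figures, the strict `<` skipped the `ofm_tile 1` path for N = 4096 and the inclusive `<=` takes it. Every other output size behaves the same before and after the change.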
