Commit

Update dynamic_quantize_gpu_kv_cache.cl
p-durandin authored Jan 15, 2025
1 parent 317638e commit 1fbd98f
Showing 1 changed file with 1 addition and 1 deletion.
@@ -89,7 +89,7 @@ KERNEL(dynamic_quantize_gpu_kv_cache)(
 max_value = work_group_reduce_max(max_value);

 // If the range of input data is zero, it is adjusted to the minimum value(0.001).
-half diff_value = max_value == min_value ? (grp_max) : (max_value - min_value);
+ACCUMULATOR_TYPE diff_value = max_value == min_value ? (grp_max) : (max_value - min_value);
 ACCUMULATOR_TYPE scale_tmp = (ACCUMULATOR_TYPE)((CHAR_MAX - CHAR_MIN) / diff_value);
 ACCUMULATOR_TYPE zp_tmp = (ACCUMULATOR_TYPE)(-min_value * scale_tmp) + CHAR_MIN;
 OUTPUT1_TYPE scale = (OUTPUT1_TYPE)(scale_tmp);
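
For readers who want to follow the arithmetic around the changed line, here is a minimal host-side C sketch of the same scale and zero-point computation. It is not the kernel itself and makes several assumptions: ACCUMULATOR_TYPE is modeled as float, grp_max is modeled as the constant 0.001 mentioned in the source comment, and SCHAR_MAX/SCHAR_MIN stand in for the kernel's CHAR_MAX/CHAR_MIN because OpenCL C defines char as signed 8-bit while host C does not guarantee that. The names quant_params, compute_quant_params and GRP_MAX_ASSUMED are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: stands in for the kernel's grp_max; the value 0.001 comes
 * from the source comment, not from the kernel's actual definition. */
#define GRP_MAX_ASSUMED 0.001f

/* Hypothetical container for the two per-group outputs. */
typedef struct {
    float scale;      /* multiplies an input value to reach the int8 range */
    float zero_point; /* offset that maps min_value to SCHAR_MIN */
} quant_params;

/* Asymmetric int8 parameters for a group whose values span
 * [min_value, max_value]. float models ACCUMULATOR_TYPE (an assumption). */
static quant_params compute_quant_params(float min_value, float max_value) {
    assert(max_value >= min_value);

    /* A zero-width range would divide by zero, so fall back to a small
     * positive width, mirroring the ternary in the diff. */
    float diff_value = (max_value == min_value) ? GRP_MAX_ASSUMED
                                                : (max_value - min_value);

    /* 255 quantization steps spread across the input range. */
    float scale_tmp = (float)(SCHAR_MAX - SCHAR_MIN) / diff_value;

    /* Shift so that min_value * scale_tmp + zp_tmp == SCHAR_MIN. */
    float zp_tmp = (-min_value * scale_tmp) + (float)SCHAR_MIN;

    quant_params params = { scale_tmp, zp_tmp };
    return params;
}
```

Calling compute_quant_params(-1.0f, 1.0f) in this sketch gives a scale of 127.5 and a zero point of -0.5, so -1.0 maps to -128 and 1.0 maps to 127.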
