hadd, rhadd

Returns (x + y) >> 1, or (x + y + 1) >> 1

gentype hadd(gentype x,
             gentype y)

gentype rhadd(gentype x,
              gentype y)

Description

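hadd returns (x + y) >> 1. The intermediate sum does not overflow modulo the width of gentype, so the carry out of the top bit is kept before the shift.

rhadd returns (x + y + 1) >> 1, which rounds the halved sum up instead of down. As with hadd, the intermediate sum does not modulo overflow.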
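To make the "does not modulo overflow" wording concrete, here is a minimal C sketch for 8-bit unsigned values. The helper names (`hadd_ref_u8`, `rhadd_ref_u8`, `naive_hadd_u8`, `check_reference_examples`) are hypothetical and are not part of OpenCL; the reference versions simply widen to 16 bits, which is one way to model the specified result, not necessarily how any implementation computes it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model: widen to 16 bits so the sum cannot wrap. */
uint8_t hadd_ref_u8(uint8_t x, uint8_t y)
{
    return (uint8_t)(((uint16_t)x + (uint16_t)y) >> 1);
}

/* Same idea with the +1 rounding term used by rhadd. */
uint8_t rhadd_ref_u8(uint8_t x, uint8_t y)
{
    return (uint8_t)(((uint16_t)x + (uint16_t)y + 1u) >> 1);
}

/* What you get if the sum is truncated to 8 bits before shifting. */
uint8_t naive_hadd_u8(uint8_t x, uint8_t y)
{
    uint8_t wrapped = (uint8_t)(x + y); /* wraps modulo 256 */
    return (uint8_t)(wrapped >> 1);
}

/* Small usage example: 200 + 101 = 301 does not fit in 8 bits. */
void check_reference_examples(void)
{
    assert(hadd_ref_u8(200, 101) == 150);  /* 301 >> 1 */
    assert(rhadd_ref_u8(200, 101) == 151); /* 302 >> 1 */
    assert(naive_hadd_u8(200, 101) == 22); /* (301 mod 256) >> 1 = 45 >> 1 */
}
```

The naive version loses the carry bit, which is exactly the failure the built-ins are specified to avoid.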

Vector operations frequently need n + 1 bits temporarily to calculate a result. The rhadd instruction gives you that extra bit without needing to upsample to a wider type and downsample afterward. This can be a significant performance win.
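As an illustration of how the extra bit can be had without widening, the following C sketch uses well-known bitwise identities for unsigned values: x + y = 2(x & y) + (x ^ y) and x + y = 2(x | y) - (x ^ y). The function names are hypothetical, and this is only a scalar sketch for 8-bit unsigned inputs; it does not claim to be how any OpenCL device implements hadd or rhadd. C promotes the operands to int internally, but no intermediate value here ever exceeds 8 bits, so the same formulas fit in an 8-bit vector lane.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: floor((x + y) / 2) without forming the 9-bit sum. */
uint8_t hadd_u8_nowiden(uint8_t x, uint8_t y)
{
    /* Shared bits count once; differing bits contribute half each. */
    return (uint8_t)((x & y) + ((x ^ y) >> 1));
}

/* Hypothetical helper: ceil((x + y) / 2) without forming the 9-bit sum. */
uint8_t rhadd_u8_nowiden(uint8_t x, uint8_t y)
{
    /* Start from the union of bits and subtract half of the differing bits. */
    return (uint8_t)((x | y) - ((x ^ y) >> 1));
}

/* Small usage example with inputs whose sum overflows 8 bits. */
void check_nowiden_examples(void)
{
    assert(hadd_u8_nowiden(200, 100) == 150);  /* 300 >> 1 */
    assert(rhadd_u8_nowiden(255, 254) == 255); /* 510 >> 1 */
}
```

Signed inputs would need an arithmetic right shift, which C leaves implementation-defined for negative values, so the sketch stays with unsigned types.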