encoding-free in-memory row representation #20017

fuyufjh · 2025-01-03T08:56:08Z

CPU profiling of a Join test case shows that a large portion of CPU time is spent decoding and encoding datums in CompactedRow.

As we know, CompactedRow is closer to a storage format than to an in-memory format. The idea is to design a new encoding-free in-memory row representation that is close to StreamChunk or Datum, so that we can save the cost of decoding and encoding and instead operate on datums directly by reference.
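
To make the contrast concrete, here is a minimal sketch of the two access patterns. All types below (`Scalar`, `EncodedRow`, `InMemoryRow`) and the tag-plus-payload byte layout are hypothetical simplifications for illustration; they are not RisingWave's actual `CompactedRow`, `Datum`, or value encoding.

```rust
// Hypothetical, simplified scalar type (not RisingWave's actual scalar type).
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int64(i64),
    Utf8(String),
}

// A nullable value; `None` stands for NULL.
type Datum = Option<Scalar>;

/// Storage-like row: values live in a byte buffer and must be decoded on access.
/// Assumed layout per column: 1 tag byte, then a payload
/// (0 = NULL, 1 = i64 little-endian, 2 = u32 length + UTF-8 bytes).
struct EncodedRow {
    bytes: Vec<u8>,
}

impl EncodedRow {
    fn encode(datums: &[Datum]) -> Self {
        let mut bytes = Vec::new();
        for datum in datums {
            match datum {
                None => bytes.push(0),
                Some(Scalar::Int64(v)) => {
                    bytes.push(1);
                    bytes.extend_from_slice(&v.to_le_bytes());
                }
                Some(Scalar::Utf8(s)) => {
                    bytes.push(2);
                    bytes.extend_from_slice(&(s.len() as u32).to_le_bytes());
                    bytes.extend_from_slice(s.as_bytes());
                }
            }
        }
        Self { bytes }
    }

    /// Decodes columns sequentially until `idx` is reached.
    /// Every call pays for decoding and allocates for strings.
    /// Panics if `idx` is out of range.
    fn datum_at(&self, idx: usize) -> Datum {
        let mut pos = 0;
        let mut col = 0;
        loop {
            let tag = self.bytes[pos];
            pos += 1;
            let datum = match tag {
                0 => None,
                1 => {
                    let mut buf = [0u8; 8];
                    buf.copy_from_slice(&self.bytes[pos..pos + 8]);
                    pos += 8;
                    Some(Scalar::Int64(i64::from_le_bytes(buf)))
                }
                2 => {
                    let mut buf = [0u8; 4];
                    buf.copy_from_slice(&self.bytes[pos..pos + 4]);
                    pos += 4;
                    let len = u32::from_le_bytes(buf) as usize;
                    let s = String::from_utf8(self.bytes[pos..pos + len].to_vec()).unwrap();
                    pos += len;
                    Some(Scalar::Utf8(s))
                }
                _ => unreachable!("unknown tag"),
            };
            if col == idx {
                return datum;
            }
            col += 1;
        }
    }
}

/// Encoding-free row: datums are stored as-is, so access is a plain borrow.
struct InMemoryRow {
    datums: Vec<Datum>,
}

impl InMemoryRow {
    /// Returns a reference to the value; no decoding and no allocation.
    /// Panics if `idx` is out of range.
    fn datum_at(&self, idx: usize) -> Option<&Scalar> {
        self.datums[idx].as_ref()
    }
}
```

In this sketch, `EncodedRow::datum_at` redoes the decoding work on every call, while `InMemoryRow::datum_at` only hands out a reference. The trade-off is that the encoding-free form generally uses more memory per row than a compact byte buffer.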

Context in Slack thread: https://risingwave-labs.slack.com/archives/C034TRPKN1F/p1733731711877629?thread_ts=1733464753.437989&cid=C034TRPKN1F
https://risingwave-labs.slack.com/archives/C034TRPKN1F/p1736135668504179

BugenZhao · 2025-01-06T06:02:29Z

I'd like to help investigate.

fuyufjh added the type/feature label Jan 3, 2025

github-actions bot added this to the release-2.3 milestone Jan 3, 2025

BugenZhao self-assigned this Jan 6, 2025

lmatz mentioned this issue Jan 6, 2025

more efficient array builder #20028

Open

BugenZhao mentioned this issue Jan 6, 2025

refactor(streaming): remove get_compacted_row from StateTable #20034

Merged

8 tasks

fuyufjh commented Jan 3, 2025 • edited by lmatz