feat: support streaming read on append-only iceberg source #20074

chenzl25 · 2025-01-08T07:30:24Z

Is your feature request related to a problem? Please describe.

Currently, RisingWave supports Iceberg as a source only in batch queries. In contrast, both Spark and Flink allow streaming reads from append-only Iceberg sources. RisingWave should support streaming reads on Iceberg as well, to match the functionality offered by those engines. Implementing append-only Iceberg streaming reads is relatively straightforward, because it only needs to handle append commits and can ignore delete commits. Read progress can be kept in the streaming state table, either by tracking snapshots (coarse-grained) or by tracking the files within each snapshot (fine-grained).
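To make the two tracking granularities concrete, here is a minimal sketch in Python. It is illustrative only and is not RisingWave's or Iceberg's API: `Snapshot`, `StreamState`, `snapshots_since`, and `poll_new_files` are hypothetical names, and the snapshot metadata is reduced to an ID, a parent ID, an operation string, and the list of data files added. It also assumes the last consumed snapshot is still present in the table's ancestry (that is, it has not been expired).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified view of an Iceberg snapshot."""
    snapshot_id: int
    parent_id: Optional[int]
    operation: str                      # e.g. "append", "delete", "overwrite"
    added_files: Tuple[str, ...] = ()   # data files added by this commit


@dataclass
class StreamState:
    """Hypothetical stand-in for what the streaming state table would persist."""
    last_consumed_id: Optional[int] = None              # coarse-grained progress
    seen_files: Set[str] = field(default_factory=set)   # fine-grained progress


def snapshots_since(
    snapshots: Dict[int, Snapshot],
    current_id: Optional[int],
    last_consumed_id: Optional[int],
) -> List[Snapshot]:
    """Return snapshots after last_consumed_id up to current_id, oldest first."""
    chain: List[Snapshot] = []
    cursor = current_id
    # Walk the parent chain backwards until we reach the consumed snapshot
    # (exclusive) or the root of the history.
    while cursor is not None and cursor != last_consumed_id:
        snap = snapshots[cursor]
        chain.append(snap)
        cursor = snap.parent_id
    chain.reverse()
    return chain


def poll_new_files(
    snapshots: Dict[int, Snapshot],
    current_id: Optional[int],
    state: StreamState,
) -> List[str]:
    """Collect data files from unconsumed append commits and advance the state."""
    new_files: List[str] = []
    for snap in snapshots_since(snapshots, current_id, state.last_consumed_id):
        if snap.operation != "append":
            continue  # append-only read: skip delete and other non-append commits
        for path in snap.added_files:
            if path not in state.seen_files:  # file-level deduplication
                state.seen_files.add(path)
                new_files.append(path)
    state.last_consumed_id = current_id
    return new_files


# Usage: one append, one delete, then another append.
history = {
    1: Snapshot(1, None, "append", ("a.parquet",)),
    2: Snapshot(2, 1, "delete"),
    3: Snapshot(3, 2, "append", ("b.parquet", "c.parquet")),
}
state = StreamState()
first_batch = poll_new_files(history, 1, state)
second_batch = poll_new_files(history, 3, state)
```

In this sketch, snapshot-level tracking alone (`last_consumed_id`) is enough to resume between commits. The `seen_files` set shows what file-level tracking would add, for example resuming in the middle of a large snapshot, at the cost of a larger state.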

The Iceberg engine table would also benefit from this enhancement. At present, we maintain two copies of the data for an Iceberg engine table. For append-only tables, we can simplify this by storing only the Iceberg table data and eliminating the Hummock copy. Once the Hummock copy is removed, materialized views on top of the Iceberg engine table would have to be built from Iceberg streaming reads.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

chenzl25 added the type/feature label Jan 8, 2025

github-actions bot added this to the release-2.3 milestone Jan 8, 2025

chenzl25 assigned xxchan Jan 8, 2025
