Is your feature request related to a problem? Please describe.
Currently, RisingWave supports Iceberg as a source, but only for batch queries. In contrast, both Spark and Flink support streaming reads from append-only Iceberg tables. RisingWave should support streaming reads on Iceberg as well, to match the functionality of these engines. Implementing streaming reads for append-only Iceberg tables is relatively straightforward, because only append commits need to be handled and delete commits can be ignored. Progress can be tracked in the streaming state table, either coarsely, by recording the snapshots already consumed, or at a finer granularity, by recording the individual files within each snapshot.
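To make the idea concrete, here is a minimal sketch of the tracking logic in Python. It is not RisingWave code and does not use any Iceberg client library: `Snapshot`, `StreamState`, and `poll_new_files` are hypothetical names, and the snapshot history is assumed to be an ordered, in-memory list. The only Iceberg detail it relies on is that each snapshot records an operation such as `append`, `overwrite`, `delete`, or `replace`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for one Iceberg snapshot (commit)."""
    snapshot_id: int
    operation: str                      # "append", "overwrite", "delete", "replace"
    added_files: Tuple[str, ...] = ()   # data files added by this commit


@dataclass
class StreamState:
    """Hypothetical stand-in for what the streaming state table would persist."""
    last_snapshot_id: Optional[int] = None               # coarse-grained progress
    emitted_files: Set[str] = field(default_factory=set)  # fine-grained progress


def poll_new_files(history: List[Snapshot], state: StreamState) -> List[str]:
    """Return data files not yet emitted, reading only append commits.

    `history` is assumed to be ordered oldest to newest. If the last consumed
    snapshot is no longer in `history` (for example, it was expired), nothing
    is returned; a real implementation would need to handle that case.
    """
    new_files: List[str] = []
    # With no progress recorded yet, start from the first snapshot.
    past_checkpoint = state.last_snapshot_id is None
    for snap in history:
        if not past_checkpoint:
            # Skip everything up to and including the last consumed snapshot.
            past_checkpoint = snap.snapshot_id == state.last_snapshot_id
            continue
        if snap.operation == "append":
            for path in snap.added_files:
                if path not in state.emitted_files:
                    state.emitted_files.add(path)
                    new_files.append(path)
        # Non-append commits are skipped, but progress still advances.
        state.last_snapshot_id = snap.snapshot_id
    return new_files
```

Tracking only `last_snapshot_id` corresponds to the coarse-grained option; `emitted_files` illustrates the finer, per-file option, which would make it possible to resume in the middle of a large snapshot at the cost of a larger state table.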
The Iceberg engine table would also benefit from this. At present, we keep two copies of the data for an Iceberg engine table: one in Iceberg and one in Hummock. For append-only tables, we could store only the Iceberg table data and drop the Hummock copy. Once the Hummock copy is removed, materialized views on top of the Iceberg engine table would have to be built using Iceberg streaming reads.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response