FAQ: Postgres Direct Source #7281

morsapaes (Maintainer) · 2021-07-01T12:09:43Z

🛑 This is a list of Frequently Asked Questions (FAQ) about how the Postgres Direct source works in the now unsupported binary version of Materialize (<v0.27.0). For an overview of the current behaviour of the source, check out the documentation for the latest version of Materialize.

SQL

Does Materialize handle all Postgres data types?

The current source implementation can handle all the data types that Materialize supports. If you happen to need a currently unsupported (and untracked) type, please open an issue here, prefixed with pg-cdc.

Operations

Can Postgres table sources be larger-than-memory?

Currently, no. All synced tables are materialized by default, so all data must fit into RAM. This is something we are working on improving!

Do deleted records in Postgres still take space in memory?

No.

On restart, does Materialize replicate all Postgres tables from scratch?

Currently, yes. Materialize doesn't provide persistent storage, so if you shut it down and restart it, the source will start from the beginning. This limitation will be lifted across the board for all sources once we have persistence!

Are upstream schema changes handled automatically?

The schema remains static for the lifetime of the source: once the source is created, upstream schema changes are not replicated. If a change is detected, Materialize puts the source into an error state until it is recreated; for now, there isn't a better way to handle schema changes and avoid temporary disruptions. To lift this limitation, we first need to introduce support for DDL statements (e.g. ALTER TABLE).
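
If you want to notice upstream changes before they surface as a source error, one option is to snapshot the table's columns when you create the source and compare against them later. The sketch below is only an illustration of that idea: the function name, the query, and the choice of which differences to report are assumptions, and it does not reflect how Materialize itself detects schema changes.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column_name, data_type)

# Query to run against the upstream Postgres database with a client of your
# choice; the two placeholders are the table's schema and name.
COLUMNS_QUERY = """
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
ORDER BY ordinal_position;
"""


def schema_changes(at_creation: List[Column], current: List[Column]) -> List[str]:
    """Describe how `current` differs from the snapshot taken at source creation.

    Hypothetical helper: it reports dropped, retyped, and added columns, which
    may be stricter or looser than what Materialize treats as a breaking change.
    """
    old = dict(at_creation)
    new = dict(current)
    changes = []
    for name in old:
        if name not in new:
            changes.append(f"dropped column {name}")
        elif old[name] != new[name]:
            changes.append(f"column {name} changed type: {old[name]} -> {new[name]}")
    for name in new:
        if name not in old:
            changes.append(f"added column {name}")
    return changes
```

An empty list means the snapshot still matches; anything else is a hint that the source may need to be recreated.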

For scenarios where the processing of the replication stream lags in Materialize, is there a way to guarantee the stability of the upstream Postgres database?

This is a known pain point in Postgres (a lagging replication slot makes the upstream database retain WAL files), and Materialize itself doesn't provide any mechanism to prevent it. In Postgres v13 and later, though, you can set max_slot_wal_keep_size to a reasonable value to limit the amount of WAL storage retained by replication slots; this is recommended for production setups. If the replication slot falls behind and hits the defined limit, the upstream database will start recycling WAL files and Materialize will stop replicating.
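
To catch a slot that is approaching the limit, you can watch the wal_status and safe_wal_size columns that Postgres v13+ exposes in pg_replication_slots. The sketch below shows one way to turn a row of that view into a simple health label. The function name, the 1 GiB warning threshold, and the labels are assumptions for illustration; they are not part of Materialize or Postgres.

```python
from typing import Optional

# Query to run against the upstream Postgres database (v13+) with a client of
# your choice. safe_wal_size is NULL when max_slot_wal_keep_size is -1
# (unlimited) or when the slot is already lost.
SLOT_STATUS_QUERY = """
SELECT slot_name, wal_status, safe_wal_size
FROM pg_replication_slots
WHERE slot_type = 'logical';
"""

# Hypothetical threshold: warn when less than 1 GiB of WAL can still be
# written before the slot is in danger of being lost.
WARN_BYTES = 1024 ** 3


def classify_slot(wal_status: Optional[str],
                  safe_wal_size: Optional[int],
                  warn_bytes: int = WARN_BYTES) -> str:
    """Return 'ok', 'warning', or 'lost' for one row of pg_replication_slots."""
    if wal_status == "lost":
        # Required WAL files were removed; the slot is no longer usable.
        return "lost"
    if wal_status == "unreserved":
        # Required WAL files are due for removal at the next checkpoint.
        return "warning"
    if safe_wal_size is not None and safe_wal_size < warn_bytes:
        return "warning"
    return "ok"
```

A slot reported as 'lost' corresponds to the situation described above: the upstream database has recycled WAL that the slot still needed, so replication cannot continue from it.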

Other

Will Materialize itself be supported as a source?

This should be possible in the future, though it won’t be through the same mechanism. There are no plans to support logical replication as an output, but we'd like to support a CDC stream that can be ingested to connect Materialize instances to one another.
