Skip to content

Conversation

scottsand-db
Copy link
Collaborator

@scottsand-db scottsand-db commented Aug 18, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Block setting catalogManaged=supported during REPLACE operation. Also block a few other invalid operations, like disabling ICT on a catalogManaged table, whether it is during an UPDATE or a REPLACE.

How was this patch tested?

New UT.

Does this PR introduce any user-facing changes?

No.

@scottsand-db scottsand-db changed the title done first pass [Kernel] Block setting catalogManaged=supported during REPLACE operation Aug 18, 2025
@scottsand-db scottsand-db self-assigned this Aug 18, 2025
Copy link
Collaborator

@xzhseh xzhseh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left some nits + clarification quesitons.

{
add(TableConfig.COLUMN_MAPPING_MAX_COLUMN_ID.getKey());
add(TableConfig.IN_COMMIT_TIMESTAMPS_ENABLED.getKey());
// TODO are there any other table properties we should preserve?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering do we have any available docs that track the interactions between different table features <-> DDL commands? If not, should we start consolidating one to align on the intended semantics?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @xzhseh -- can you give some examples of such interactions? Here is a concrete example of two table features (ccv2, ICT) and REPLACE. Any others? Agreed we should track this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @xzhseh -- can you give some examples of such interactions? Here is a concrete example of two table features (ccv2, ICT) and REPLACE. Any others? Agreed we should track this.

E.g., CCv2's specific semantics with CLONE / REPLACE / CREATE, etc.

  • For CLONE we won't preserve CCv2 unless explicitly specified.
  • For REPLACE we retain the existing CCv2 / non-CCv2 properties.
  • Also there are other cases when handling dependent features such as ICT...

And for ICT and other table features they have their own specific treatments w.r.t. these DDL commands that should be explicitly documented / tracked.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for elaborating! I think each component of the code should track this, at the very least. For example in this PR, for ICT + REPLACE, we document this here: https://github.com/delta-io/delta/pull/5156/files#diff-d3436360ac15655035f35b01d8e7c2bc27db976db99b04b1709ed90d64be372aR54-R61

We should at the very least do that. Also, if you happen to have a list of these such semantics in your head, perhaps we should work to document them in some README.md somewhere?

cc @nicklan and @raveeram-db for some context

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if you happen to have a list of these such semantics in your head, perhaps we should work to document them in some README.md somewhere?

IMO there would be two ways to do it:

  • Maintain a delta kernel table features <> DDL commands track list somewhere as the source of truth for all table feature implementations in kernel.
  • Or, in a more general (and breaking) way, update the delta protocol to include the specific DDL-related semantics of each table feature in its corresponding section.

@scottsand-db scottsand-db force-pushed the stack/kernel_block_replace_while_enabling_catalogManaged branch from 7e57e6e to dd5e30d Compare September 2, 2025 15:54
@scottsand-db scottsand-db force-pushed the stack/kernel_block_replace_while_enabling_catalogManaged branch from dd5e30d to cc725a8 Compare September 2, 2025 16:14
@scottsand-db
Copy link
Collaborator Author

Pausing this. Want to merge #5156 first.

@scottsand-db scottsand-db changed the title [Kernel] Block setting catalogManaged=supported during REPLACE operation [Merge after #5156] [Kernel] Block setting catalogManaged=supported during REPLACE operation Sep 4, 2025
@scottsand-db scottsand-db marked this pull request as draft September 4, 2025 21:59
@scottsand-db scottsand-db force-pushed the stack/kernel_block_replace_while_enabling_catalogManaged branch from cc725a8 to 4b900a0 Compare September 5, 2025 18:49
@scottsand-db scottsand-db changed the title [Merge after #5156] [Kernel] Block setting catalogManaged=supported during REPLACE operation [Kernel] Block setting catalogManaged=supported during REPLACE operation Sep 5, 2025
@scottsand-db scottsand-db marked this pull request as ready for review September 5, 2025 18:50
operationType = "UPDATE",
initialTableProperties = Map("delta.enableInCommitTimestamps" -> "false"),
transactionProperties = Map("delta.feature.catalogOwned-preview" -> "supported"),
expectedSuccess = true,
expectedIctEnabled = true,
expectedCatalogManagedSupported = true),
CatalogManagedEnablementTestCase(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved this up. Also, I clean up some of the test case names to be more succinct and clearer

expectedCatalogManagedSupported = false),

// ===== REPLACE cases =====
CatalogManagedTestCase(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New test cases

@scottsand-db scottsand-db changed the title [Kernel] Block setting catalogManaged=supported during REPLACE operation [Kernel] Block unsupported catalogManaged enablement / ICT disablement on catalogManaged table scenarios Sep 5, 2025
@scottsand-db scottsand-db force-pushed the stack/kernel_block_replace_while_enabling_catalogManaged branch from 4b900a0 to f161792 Compare September 5, 2025 18:56
@scottsand-db scottsand-db requested a review from xzhseh September 5, 2025 18:56
Copy link
Collaborator

@allisonport-db allisonport-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if (!txnExplicitlyDisablesICT) {
return;
}

// Case 2: Txn is explicitly disabling ICT on a catalogManaged table. Throw.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For kernel - is it true that we don't currently have a generic interface that automatically prevents dependent features being turned off? The current implementation LGTM but we may want to generalize it a bit more..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have that yet, yes.

We are able to say which table features require other table features --

return Collections.singleton(IN_COMMIT_TIMESTAMP_W_FEATURE);

But this has to do with the protocol, which is different from the enablement flag.

It would be nice to be able to additionally specify that not only does this protocol feature X require another protocol Y to be supported, but Y must also be enabled, and then you cannot disable Y, too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this has to do with the protocol, which is different from the enablement flag.

Yeah unless we also introduce the metadata enablement flag <> table feature mappings.. E.g., we know table feature X is automatically supported/disabled when the X-enablement-flag is turned on/off.

@@ -19,25 +19,21 @@ package io.delta.kernel.defaults.catalogManaged
import scala.collection.JavaConverters._
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker for this PR: Since we are also including DDL commands <> CCv2 semantics, I feel like we need a separate CatalogManagedPropertySuite to test all the interactions. Also we need to check for ucTableId as we currently store it in delta log.

Copy link
Collaborator Author

@scottsand-db scottsand-db Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on having a CatalogManagedPropertySuite! Will add that to the tracker.

Also we need to check for ucTableId as we currently store it in delta log.

Great question / callout! That is UC specific -- and is completely outside the scope of Delta Kernel! That would be done in the delta-unity JAR (in this repo) -- but do remember that Kernel never cares what type of catalogManaged table/catalog (UC, glue, etc.) it is reading/talking to.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, we should add some checks + tests in delta-unity that prevent us from removing the uc table ID property perhaps?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is UC specific -- and is completely outside the scope of Delta Kernel!

Ah yes, let's add those UTs in delta-unity then.

For example, we should add some checks + tests in delta-unity that prevent us from removing the uc table ID property perhaps?

Exactly. We should throw exception(s) whenever someone tries to (explicitly) modify ucTableId via any DDL commands.

@scottsand-db scottsand-db merged commit 337456f into delta-io:master Sep 9, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants