-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[Kernel] Block unsupported catalogManaged enablement / ICT disablement on catalogManaged table scenarios #5100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
catalogManaged=supported
during REPLACE operation
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TransactionMetadataFactory.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left some nits + clarification quesitons.
kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
Outdated
Show resolved
Hide resolved
.../kernel-api/src/main/java/io/delta/kernel/internal/ReplaceTableTransactionBuilderV2Impl.java
Outdated
Show resolved
Hide resolved
{ | ||
add(TableConfig.COLUMN_MAPPING_MAX_COLUMN_ID.getKey()); | ||
add(TableConfig.IN_COMMIT_TIMESTAMPS_ENABLED.getKey()); | ||
// TODO are there any other table properties we should preserve? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering do we have any available docs that track the interactions between different table features <-> DDL commands? If not, should we start consolidating one to align on the intended semantics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @xzhseh -- can you give some examples of such interactions? Here is a concrete example of two table features (ccv2, ICT) and REPLACE. Any others? Agreed we should track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @xzhseh -- can you give some examples of such interactions? Here is a concrete example of two table features (ccv2, ICT) and REPLACE. Any others? Agreed we should track this.
E.g., CCv2's specific semantics with CLONE / REPLACE / CREATE, etc.
- For CLONE we won't preserve CCv2 unless explicitly specified.
- For REPLACE we retain the existing CCv2 / non-CCv2 properties.
- Also there are other cases when handling dependent features such as ICT...
And for ICT and other table features they have their own specific treatments w.r.t. these DDL commands that should be explicitly documented / tracked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for elaborating! I think each component of the code should track this, at the very least. For example in this PR, for ICT + REPLACE, we document this here: https://github.com/delta-io/delta/pull/5156/files#diff-d3436360ac15655035f35b01d8e7c2bc27db976db99b04b1709ed90d64be372aR54-R61
We should at the very least do that. Also, if you happen to have a list of these such semantics in your head, perhaps we should work to document them in some README.md somewhere?
cc @nicklan and @raveeram-db for some context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if you happen to have a list of these such semantics in your head, perhaps we should work to document them in some README.md somewhere?
IMO there would be two ways to do it:
- Maintain a delta kernel table features <> DDL commands track list somewhere as the source of truth for all table feature implementations in kernel.
- Or, in a more general (and breaking) way, update the delta protocol to include the specific DDL-related semantics of each table feature in its corresponding section.
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TransactionMetadataFactory.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TransactionMetadataFactory.java
Show resolved
Hide resolved
...s/src/test/scala/io/delta/kernel/defaults/catalogManaged/CatalogManagedEnablementSuite.scala
Show resolved
Hide resolved
...s/src/test/scala/io/delta/kernel/defaults/catalogManaged/CatalogManagedEnablementSuite.scala
Show resolved
Hide resolved
7e57e6e
to
dd5e30d
Compare
dd5e30d
to
cc725a8
Compare
.../kernel-api/src/main/java/io/delta/kernel/internal/ReplaceTableTransactionBuilderV2Impl.java
Outdated
Show resolved
Hide resolved
...s/src/test/scala/io/delta/kernel/defaults/catalogManaged/CatalogManagedEnablementSuite.scala
Show resolved
Hide resolved
...s/src/test/scala/io/delta/kernel/defaults/catalogManaged/CatalogManagedEnablementSuite.scala
Show resolved
Hide resolved
Pausing this. Want to merge #5156 first. |
catalogManaged=supported
during REPLACE operationcatalogManaged=supported
during REPLACE operation
cc725a8
to
4b900a0
Compare
catalogManaged=supported
during REPLACE operationcatalogManaged=supported
during REPLACE operation
operationType = "UPDATE", | ||
initialTableProperties = Map("delta.enableInCommitTimestamps" -> "false"), | ||
transactionProperties = Map("delta.feature.catalogOwned-preview" -> "supported"), | ||
expectedSuccess = true, | ||
expectedIctEnabled = true, | ||
expectedCatalogManagedSupported = true), | ||
CatalogManagedEnablementTestCase( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just moved this up. Also, I clean up some of the test case names to be more succinct and clearer
expectedCatalogManagedSupported = false), | ||
|
||
// ===== REPLACE cases ===== | ||
CatalogManagedTestCase( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test cases
catalogManaged=supported
during REPLACE operation4b900a0
to
f161792
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if (!txnExplicitlyDisablesICT) { | ||
return; | ||
} | ||
|
||
// Case 2: Txn is explicitly disabling ICT on a catalogManaged table. Throw. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For kernel - is it true that we don't currently have a generic interface that automatically prevents dependent features being turned off? The current implementation LGTM but we may want to generalize it a bit more..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have that yet, yes.
We are able to say which table features require other table features --
delta/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
Line 92 in a0826e3
return Collections.singleton(IN_COMMIT_TIMESTAMP_W_FEATURE); |
But this has to do with the protocol, which is different from the enablement flag.
It would be nice to be able to additionally specify that not only does this protocol feature X require another protocol Y to be supported, but Y must also be enabled, and then you cannot disable Y, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this has to do with the protocol, which is different from the enablement flag.
Yeah unless we also introduce the metadata enablement flag <> table feature mappings.. E.g., we know table feature X is automatically supported/disabled when the X-enablement-flag is turned on/off.
@@ -19,25 +19,21 @@ package io.delta.kernel.defaults.catalogManaged | |||
import scala.collection.JavaConverters._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker for this PR: Since we are also including DDL commands <> CCv2 semantics, I feel like we need a separate CatalogManagedPropertySuite
to test all the interactions. Also we need to check for ucTableId
as we currently store it in delta log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on having a CatalogManagedPropertySuite! Will add that to the tracker.
Also we need to check for ucTableId as we currently store it in delta log.
Great question / callout! That is UC specific -- and is completely outside the scope of Delta Kernel! That would be done in the delta-unity
JAR (in this repo) -- but do remember that Kernel never cares what type of catalogManaged table/catalog (UC, glue, etc.) it is reading/talking to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, we should add some checks + tests in delta-unity that prevent us from removing the uc table ID property perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is UC specific -- and is completely outside the scope of Delta Kernel!
Ah yes, let's add those UTs in delta-unity
then.
For example, we should add some checks + tests in delta-unity that prevent us from removing the uc table ID property perhaps?
Exactly. We should throw exception(s) whenever someone tries to (explicitly) modify ucTableId
via any DDL commands.
🥞 Stacked PR
Use this link to review incremental changes.
Which Delta project/connector is this regarding?
Description
Block setting catalogManaged=supported during REPLACE operation. Also block a few other invalid operations, like disabling ICT on a catalogManaged table, whether it is during an UPDATE or a REPLACE.
How was this patch tested?
New UT.
Does this PR introduce any user-facing changes?
No.