Skip to content

Conversation

kevintang2022
Copy link

No description provided.

@prestodb-ci prestodb-ci added the from:Meta PRs from Meta label Aug 5, 2025
@pdabre12
Copy link
Contributor

pdabre12 commented Aug 5, 2025

@kevintang2022
Just a very basic question, so with this change all the fns from the sidecar would be under presto.default namespace? Where does this new BuiltInNativeFunctionNamespaceManager fit with respect to the NativeFunctionNamespaceManager? (the function namespace manager which currently holds the sidecar functions)

@kevintang2022
Copy link
Author

@kevintang2022 Just a very basic question, so with this change all the fns from the sidecar would be under presto.default namespace? Where does this new BuiltInNativeFunctionNamespaceManager fit with respect to the NativeFunctionNamespaceManager? (the function namespace manager which currently holds the sidecar functions)

Adding this should have no effect on the NativeFunctionNamespaceManager. NativeFunctionNamespaceManager is still the serving namespace manager for native.default.

The reason it is needed is because we want users to be able to call native functions without specifying the full qualified name. It seems in the current implementation of sidecar, users need to fully type native.default.function_name to invoke a C++ function if they are still using presto.default. as default namespace

@pdabre12
Copy link
Contributor

pdabre12 commented Aug 5, 2025

Thanks, I see.
However, switching of the namespaces from presto.default to native.default is something that is needed for sidecar-enabled native clusters. Why would we still be using presto.default namespace in a sidecar enabled native cluster?

@kevintang2022
Copy link
Author

kevintang2022 commented Aug 5, 2025

Thanks, I see. However, switching of the namespaces from presto.default to native.default is something that is needed for sidecar-enabled native clusters. Why would we still be using presto.default namespace in a sidecar enabled native cluster?

Actually, it's not required to use native.default with sidecar enabled. You can have presto.default and have sidecar enabled, and you can call the native functions by using fully qualified name.

The reason this is needed is because constant folding is not quite ready to be used with the sidecar functions. This means that we still need to use coordinator implementation in some cases, but use worker implementation in some other cases.

You're right that if we just want to always use the worker implementation, then NativeFunctionNamespaceManager will work as is. However, we need to have some selective logic for the use case of sometimes using Java implementation and sometimes using C++ implementation. Also, we don't want users to have to specify the full qualified function name

@pdabre12
Copy link
Contributor

pdabre12 commented Aug 5, 2025

@kevintang2022 Got it, the use case is clear to me now. Thanks for clarifying.

Actually, it's not required to use native.default with sidecar enabled. You can have presto.default and have sidecar enabled, and you can call the native functions by using fully qualified name.

Have you tested the scenario where this exact setup works? Prestissimo doesn't support multiple namespaces. The functions that we report from Prestissimo need to be under a namespace, in the case where sidecar is enabled it is native.default. This namespace is configurable by the worker configuration property presto.default-namespace. Note that this property exists on the coordinator as well, which we won't be setting (default is presto.default).

Additionally another question is, when you run queries let's say abs() without the fully qualified namespace, it will resolve it to the presto.default namespace and ideally we think this should be constant folded, but what if it wasn't constant folded? If it wasn't constant folded , we would send over that function with name as presto.default.abs over to the C++ worker side. Now, we set the worker configuration property presto.default-namespace to native.default and hence we would run into a function not registered error.

* Function with only C++ implementation: no constant fold
* Function with java and c++ implementation: constant fold, existing behavior
* Function with only SQL implementation: use sql implementation, existing behavior
* Function with SQL and C++ implementation: use c++ implementation, new behavior
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function with SQL and C++ implementation: use c++ implementation, new behavior.

Do we have an example of this implementation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map_remove_null_values has a SQL implementation in coordinator and C++ implementation from worker

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map_remove_null_values has a SQL implementation in coordinator and C++ implementation from worker

@kevintang2022
Copy link
Author

Have you tested the scenario where this exact setup works? Prestissimo doesn't support multiple namespaces. The functions that we report from Prestissimo need to be under a namespace, in the case where sidecar is enabled it is native.default. This namespace is configurable by the worker configuration property presto.default-namespace. Note that this property exists on the coordinator as well, which we won't be setting (default is presto.default).

Yes, I tested this with presto.default namespace as default config for both coordinator and worker. In this scenario, it's not possible to run native.defauolt functions, but everything else will be the same as if the sidecar was not enabled. This is because for most cases, we just want existing behavior. It's only for very few cases where we would want to change existing behavior (like C++ implementation overriding a SQL implementation)

Additionally another question is, when you run queries let's say abs() without the fully qualified namespace, it will resolve it to the presto.default namespace and ideally we think this should be constant folded, but what if it wasn't constant folded? If it wasn't constant folded , we would send over that function with name as presto.default.abs over to the C++ worker side. Now, we set the worker configuration property presto.default-namespace to native.default and hence we would run into a function not registered error.

No, this would not throw error because we never use native.default. It will be presto.default for both coordinator and worker.

@pdabre12
Copy link
Contributor

pdabre12 commented Aug 6, 2025

Yes, I tested this with presto.default namespace as default config for both coordinator and worker. In this scenario, it's not possible to run native.defauolt functions, but everything else will be the same as if the sidecar was not enabled. This is because for most cases, we just want existing behavior. It's only for very few cases where we would want to change existing behavior (like C++ implementation overriding a SQL implementation)

I see this is for a specific use case, where in you don't want to use the native.default functions at all, but still pull in functions from the sidecar under the presto.default namespace. What this will enable you to do is use the native overridden implementations for the sql invoked fns and also give you access to the underlying native functions which don't have implementations in Java. Is my understanding correct?

Instead of adding a BuiltInPluginFunctionNamespaceManager, add a BuiltInNativeFunctionNamespaceManager. Register the functions from native sidecar under the presto.default namespace. During function evaluation inside of FunctionAndTypeManager#resolveFunctionInternal and getMatchingFunctionHandle, add some logic to determine whether the function in BuiltInTypeAndFunctionNamespaceManager or the function in BuiltInNativeFunctionNamespaceManager is used. This is similar to the PR, except the logic in PR will throw an error if it sees that there is a conflict. If there is a conflict, follow this logic to determine which functions to use:


* If the one already inside the BuiltInTypeAndFunctionNamespaceManager is a Java implementation, then use the one in BuiltInTypeAndFunctionNamespaceManager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you don't have access to the FunctionMetadata at this point to determine its implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within functionAndTypeManager, we should be able to access the functionMetadata to determine the implementation type.

https://github.com/prestodb/presto/pull/25597/files
Add something like this in the getMatchingFunctions

            FunctionHandle functionHandle1 = functionNamespaceManager.getFunctionHandle(transactionHandle, matchingDefaultFunctionSignature.get());
            FunctionHandle functionHandle2 = builtInPluginFunctionNamespaceManager.getFunctionHandle(matchingPluginFunctionSignature.get());

// can read the function implementation type like this
functionNamespaceManager.getFunctionMetadata(functionHandle1).getImplementationType()

@kevintang2022
Copy link
Author

Yes, I tested this with presto.default namespace as default config for both coordinator and worker. In this scenario, it's not possible to run native.defauolt functions, but everything else will be the same as if the sidecar was not enabled. This is because for most cases, we just want existing behavior. It's only for very few cases where we would want to change existing behavior (like C++ implementation overriding a SQL implementation)

I see this is for a specific use case, where in you don't want to use the native.default functions at all, but still pull in functions from the sidecar under the presto.default namespace. What this will enable you to do is use the native overridden implementations for the sql invoked fns and also give you access to the underlying native functions which don't have implementations in Java. Is my understanding correct?

Yes that describes our use case

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RFC doesn't have an adoption plan, which is a part of the template. Can you add details on the eventual deprecation and removal of it (once constant folding fully delegates to C++ evaluation)?

@kevintang2022 kevintang2022 force-pushed the native-built-in-functions branch from efed2f4 to 0245089 Compare August 7, 2025 15:58
@kevintang2022 kevintang2022 requested a review from tdcmeehan August 7, 2025 15:59
Copy link

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this writeup.

@kevintang2022 kevintang2022 force-pushed the native-built-in-functions branch from 0245089 to 9af6017 Compare August 8, 2025 14:57
kevintang2022 added a commit to kevintang2022/presto that referenced this pull request Aug 11, 2025
Summary:
This PR includes changes for Built in functions for different use cases
- Register SQL invoked function in built in namespace when they come from plugins (prestodb#25597)
- Register Native functions in built in namespace when they come from sidecar (prestodb/rfcs#41)

Since these use cases have a lot of overlap, we can introduce an abstract class in order to reduce duplicate logic.

## Description

## Motivation and Context

## Impact

Pull Request resolved: prestodb#25661

Test Plan:
<!---Please fill in how you tested your change-->

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely.  If the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax, functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* ...
* ...

Hive Connector Changes
* ...
* ...
```

If release note is NOT required, use:

```
== NO RELEASE NOTE ==
```

Rollback Plan:

Differential Revision: D79301760

Pulled By: kevintang2022
kevintang2022 added a commit to kevintang2022/presto that referenced this pull request Aug 11, 2025
Summary:
This PR includes changes for Built in functions for different use cases
- Register SQL invoked function in built in namespace when they come from plugins (prestodb#25597)
- Register Native functions in built in namespace when they come from sidecar (prestodb/rfcs#41)

Since these use cases have a lot of overlap, we can introduce an abstract class in order to reduce duplicate logic.

## Description

## Motivation and Context

## Impact

Pull Request resolved: prestodb#25661

Test Plan:
<!---Please fill in how you tested your change-->

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely.  If the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax, functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* ...
* ...

Hive Connector Changes
* ...
* ...
```

If release note is NOT required, use:

```
== NO RELEASE NOTE ==
```

Rollback Plan:

Differential Revision: D79301760

Pulled By: kevintang2022
kevintang2022 added a commit to kevintang2022/presto that referenced this pull request Aug 12, 2025
Summary:
This PR includes changes for Built in functions for different use cases
- Register SQL invoked function in built in namespace when they come from plugins (prestodb#25597)
- Register Native functions in built in namespace when they come from sidecar (prestodb/rfcs#41)

Since these use cases have a lot of overlap, we can introduce an abstract class in order to reduce duplicate logic.

## Description

## Motivation and Context

## Impact

Pull Request resolved: prestodb#25661

Test Plan:
<!---Please fill in how you tested your change-->

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely.  If the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax, functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* ...
* ...

Hive Connector Changes
* ...
* ...
```

If release note is NOT required, use:

```
== NO RELEASE NOTE ==
```

Rollback Plan:

Differential Revision: D79301760

Pulled By: kevintang2022
* Function with only java implementation: constant fold, existing behavior
* Function with only C++ implementation: no constant fold
* Function with java and c++ implementation: constant fold, existing behavior
* Function with only SQL implementation: use sql implementation, existing behavior

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would presume SQL inlining happens before constant folding. They are orthogonal concerns as far as I understand. SQL inlining issues are handled by https://github.com/prestodb/rfcs/pull/40/files only.
Can you confirm ?

But there is a correlation in that :

  • If there is a C++ function for a SQL function, then we want to pick up the C++ function. So if the function was originally constant folded it will not anymore.
  • If the SQL function expands to only java functions then it will get constant folded.
  • For new SQL functions that expand to new C++ functions, these are not constant folded.

That's about the impact to the current RFC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RFC is about picking which FunctionHandle to use. If there is a SQL implementation and a C++ implementation, then the C++ function handle will be used. If there is a Java implementation, then the java function handle will take priority. By doing this, it creates the side effect of which functions are constant folded or not, and this is determined by the implementation type.

This code ponter shows that Java implemenetation are constant folded, but C++ implementations are not.
https://github.com/prestodb/presto/blob/8041eb5e3ae0dc6e6a656d3e68d90ed8e1fda099/presto-spi/src/main/java/com/facebook/presto/spi/function/FunctionImplementationType.java#L16


* If we choose to have separate namespaces for built in functions from coordinator and functions from sidecar, then this would require extra changes on the worker code. Currently the worker code is using one prefix for every function, and this prefix is configurable. However, you cannot set a different prefix for different functions coming from the sidecar; they would all be the same prefix.

* Since adding more worker changes adds extra complexity and confusion, we decided to use presto.default. prefix for every function from sidecar that is registered.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then how are you handling conflicts then ? I do see it depends on the code you have suggested below. But can you describe what the desired outcome is in a functional manner here ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functional outcome is that

  • it keeps constant folding as an existing behavior for functions that are currently constant folded
  • uses C++ implementation if a function is not currently being constant folded
  • allows users to take advantage of this optimization without having to specifiy fully qualified function name

@kevintang2022 kevintang2022 force-pushed the native-built-in-functions branch from 9af6017 to 0af3a74 Compare August 22, 2025 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
from:Meta PRs from Meta
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants