[SPARK-53367][PYTHON][SQL] add int to decimal coercion for Arrow UDFs #52117

benrobby · 2025-08-25T15:11:57Z

What changes were proposed in this pull request?

This is a follow-up to [SPARK-52821][PYTHON] add int->DecimalType pyspark udf return type coercion #51538. It adds support for integer-to-decimal type coercion to UDFs with useArrow=True when spark.sql.legacy.execution.pythonUDF.pandas.conversion.enabled=False.
To do this, the existing Spark conf spark.sql.execution.pythonUDF.pandas.intToDecimalCoercionEnabled is now forwarded from the worker to the ArrowBatchUDFSerializer and then to the Python->Arrow converter.
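The forwarding described above can be pictured with a small, self-contained sketch. This is not the PySpark implementation: `SketchSerializer`, `SketchConverter`, `build_serializer`, and the `runner_conf` mapping are hypothetical names introduced only for illustration. Only the conf key itself comes from this PR description.

```python
from dataclasses import dataclass
from typing import Mapping

# Conf key taken from the PR description; everything else here is hypothetical.
CONF_KEY = "spark.sql.execution.pythonUDF.pandas.intToDecimalCoercionEnabled"


@dataclass
class SketchConverter:
    """Stand-in for the Python->Arrow converter that finally consumes the flag."""
    int_to_decimal_coercion_enabled: bool = False


@dataclass
class SketchSerializer:
    """Stand-in for a serializer that only carries the flag along."""
    int_to_decimal_coercion_enabled: bool = False

    def make_converter(self) -> SketchConverter:
        # The serializer does not interpret the flag; it hands it on unchanged.
        return SketchConverter(self.int_to_decimal_coercion_enabled)


def build_serializer(runner_conf: Mapping[str, str]) -> SketchSerializer:
    # Assumption: the worker sees confs as strings; a missing key means "off".
    enabled = runner_conf.get(CONF_KEY, "false").lower() == "true"
    return SketchSerializer(int_to_decimal_coercion_enabled=enabled)
```

With this shape, `build_serializer({CONF_KEY: "true"}).make_converter()` yields a converter with coercion enabled, while an empty conf mapping leaves it off, which matches the "off by default" behavior stated in this description.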

Why are the changes needed?

Python UDFs with useArrow=True and spark.sql.legacy.execution.pythonUDF.pandas.conversion.enabled=False do not support type coercion from int to DecimalType if the target precision of the DecimalType is too low:

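from pyspark.sql.functions import col, udf
from pyspark.sql.types import DecimalType

@udf(returnType=DecimalType(2, 1), useArrow=True)
def test(x):  # takes one argument because the UDF is called with one column
    return 1  # returns a Python int, not a decimal.Decimal
spark.range(1,2,1,1).select(test(col('id'))).display()
# expected: (Decimal) 1.0
# actual: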
File "/deps/pyspark/sql/conversion.py", line 314, in convert_decimal
    assert isinstance(value, decimal.Decimal)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
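The traceback above ends in an `isinstance` assertion inside `convert_decimal`. A minimal sketch of how a flag could gate the coercion in front of such an assertion follows. The function name `convert_decimal_sketch` and its parameter are hypothetical, and this is an illustration of the idea, not the code merged in this PR.

```python
import decimal
from typing import Optional, Union


def convert_decimal_sketch(
    value: Optional[Union[int, decimal.Decimal]],
    int_to_decimal_coercion_enabled: bool = False,
) -> Optional[decimal.Decimal]:
    """Hypothetical converter for a DecimalType return value."""
    if value is None:
        return None
    # With the flag on, a Python int is widened to Decimal before the type check.
    if int_to_decimal_coercion_enabled and isinstance(value, int):
        value = decimal.Decimal(value)
    # Same kind of check that fails in the traceback when an int slips through.
    assert isinstance(value, decimal.Decimal)
    return value
```

With the flag off, `convert_decimal_sketch(1)` fails the assertion in the same way as the traceback. With the flag on, the int is converted to a `decimal.Decimal` first and the check passes.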

Does this PR introduce any user-facing change?

No, spark.sql.execution.pythonUDF.pandas.intToDecimalCoercionEnabled is still off by default, so this is not a behavior change.

How was this patch tested?

Added unit tests.

Was this patch authored or co-authored using generative AI tooling?

No

benrobby · 2025-08-25T15:13:33Z

@HyukjinKwon @asl3 @zhengruifeng pls take a look

[SPARK-53367][PYTHON] add int to decimal coercion to Arrow UDFs

48f0fa1

github-actions bot added SQL CORE PYTHON labels Aug 25, 2025

benrobby changed the title ~~[SPARK-53367][PYTHON] add int to decimal coercion for Arrow UDFs~~ [SPARK-53367][PYTHON][SQL] add int to decimal coercion for Arrow UDFs Aug 25, 2025

HyukjinKwon approved these changes Aug 25, 2025


zhengruifeng approved these changes Aug 26, 2025


asl3 approved these changes Aug 26, 2025

