Any BQT class that uses OverrideTypeProvider to wrap a non-String type will fail when the GenericRecord format is used to write records to BigQuery. The write fails with a class cast exception like:
[info] at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:326)
[info] at org.apache.beam.sdk.io.gcp.bigquery.AvroRowWriter.write(AvroRowWriter.java:58)
[info] at org.apache.beam.sdk.io.gcp.bigquery.WriteBundlesToFiles.processElement(WriteBundlesToFiles.java:247)
[info] ...
[info] Cause: java.lang.ClassCastException: value 31 (a com.spotify.scio.example.NonNegativeInt) cannot be cast to expected type long at MyRecord.i
[info] at org.apache.avro.path.TracingClassCastException.summarize(TracingClassCastException.java:79)
[info] at org.apache.avro.path.TracingClassCastException.summarize(TracingClassCastException.java:30)
[info] at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:84)
[info] at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:323)
[info] at org.apache.beam.sdk.io.gcp.bigquery.AvroRowWriter.write(AvroRowWriter.java:58)
[info] at org.apache.beam.sdk.io.gcp.bigquery.WriteBundlesToFiles.processElement(WriteBundlesToFiles.java:247)
[info] at org.apache.beam.sdk.io.gcp.bigquery.WriteBundlesToFiles$DoFnInvoker.invokeProcessElement(Unknown Source)
[info] at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:212)
[info] at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:186)
[info] at org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.processElementInReadyWindows(SimplePushbackSideInputDoFnRunner.java:88)
I think this is theoretically easy to fix (remove the .toString), but I think the fix would break existing implementations of OverrideTypeProvider, e.g. Elitzur. It may be simpler for these users to just fall back to TableRow format.
The failure happens because toAvroInternal converts all overridden types to String: https://github.com/spotify/scio/blob/v0.14.14/scio-google-cloud-platform/src/main/scala/com/spotify/scio/bigquery/types/ConverterProvider.scala#L174 . I think this was just copied from the toTableRow behavior, where it works fine because the JSON format supports stringified everything, but Avro is more strict; the converted avroSchema correctly expects an Integer value.

The workaround is to fall back to TableRow format.
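To illustrate the strictness described above, here is a minimal sketch that uses only the plain Avro API, not Scio's converter. It assumes org.apache.avro is on the classpath. NonNegativeInt here is a hypothetical stand-in for an overridden type, the object and helper names are made up for illustration, and the schema mirrors the long field MyRecord.i from the trace. GenericData.validate is used as a cheap conformance check; the actual failure in the trace is raised later, inside GenericDatumWriter.write.

```scala
import org.apache.avro.{Schema, SchemaBuilder}
import org.apache.avro.generic.GenericData

object AvroStrictnessSketch {
  // Hypothetical stand-in for a type wrapped via OverrideTypeProvider.
  final case class NonNegativeInt(data: Int)

  // One required long field, like MyRecord.i in the stack trace.
  val schema: Schema =
    SchemaBuilder.record("MyRecord").fields().requiredLong("i").endRecord()

  // Builds a one-field record and asks Avro whether it conforms to the schema.
  def conforms(value: AnyRef): Boolean = {
    val record = new GenericData.Record(schema)
    record.put("i", value)
    GenericData.get().validate(schema, record)
  }

  def main(args: Array[String]): Unit = {
    val wrapped = NonNegativeInt(31)
    // A stringified value is not a Long, so it does not conform.
    println(conforms(wrapped.data.toString))
    // The wrapper object itself does not conform either.
    println(conforms(wrapped))
    // Only the unwrapped, boxed long matches the schema.
    println(conforms(java.lang.Long.valueOf(wrapped.data.toLong)))
  }
}
```

Under these assumptions, only the last call conforms. This suggests a fix would need to hand Avro the underlying primitive rather than a String or the wrapper object.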