Issue in the cell for the glueing #227

burnout87 · 2025-01-15T14:04:16Z

As noticed by @ferrigno , there is probably an issue with the decode function of the ImageDataProduct and the NumpyDataProduct class

https://gitlab.renkulab.io/astronomy/mmoda/jemx-expert/-/blob/main/test_products.ipynb?ref_type=heads

The text was updated successfully, but these errors were encountered:

burnout87 · 2025-01-15T15:27:11Z

After some investigation, I noticed that the error:

UnpicklingError: invalid load key, '\xfc'.

appears when executing the code:

ImageDataProduct.decode(result_image)

where result_image is read from the outputs of the notebook execution.Could it be related to the processing performed by the operations of glueing in the last cell (added by the backend nb2workflow)?

@dsavchenko maybe you have an idea of this

dsavchenko · 2025-01-15T23:11:25Z

I changed the test a bit (see PR), to demonstrate what's going on.
While it works in the first case, it fails in the second.

The difference is that in the second case, the encoded object is a string (kind of double-encoded in a sense). So it ends up being processed through literal_to_json. And by simple coincidence, there is 'nan' substring in the encoded representation of the binary data. It gets modified, then unsurprisingly the data is corrupted.

I don't see a simple and clear solution, the overall encoding-decoding is convoluted, and overall, I don't really understand why it is made like this, with all this pickling etc.
Changing it may have unpredicted consequences and compatibility issues.
So we really need to think well...

ferrigno · 2025-01-16T08:11:38Z

It seems to affect only products retrieved through the nb2workflow run function, as files downloaded from the front end are correct. Maybe @volodymyrss has a historical overall view.

Please notice that also the rmf file in the spectral extraction has an issue related to encoding and decoding. Most probably due to the peculiar nature of the compression used in the matrix or maybe just for the presence of NaN.

volodymyrss · 2025-01-16T08:21:25Z

Is it really for this repository?

The complexity with encoding in odaapi is old, but the complexity is already with astropy fits serialization, there were astropy bugs with variable size fits tables.

I think as long as there are tests, refactoring is safe.
Maybe we can add a bit more tests to be safer. And refactor a bit as needed.

burnout87 self-assigned this Jan 15, 2025

burnout87 transferred this issue from oda-hub/oda_api Jan 15, 2025

burnout87 linked a pull request Jan 15, 2025 that will close this issue

dedicated test and file #226

Open

burnout87 changed the title ~~Issue in decoding~~ Issue in the cell for the glueing Jan 15, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue in the cell for the glueing #227

Issue in the cell for the glueing #227

burnout87 commented Jan 15, 2025 •

edited

Loading

burnout87 commented Jan 15, 2025 •

edited

Loading

dsavchenko commented Jan 15, 2025

ferrigno commented Jan 16, 2025

volodymyrss commented Jan 16, 2025

Issue in the cell for the glueing #227

Issue in the cell for the glueing #227

Comments

burnout87 commented Jan 15, 2025 • edited Loading

burnout87 commented Jan 15, 2025 • edited Loading

dsavchenko commented Jan 15, 2025

ferrigno commented Jan 16, 2025

volodymyrss commented Jan 16, 2025

burnout87 commented Jan 15, 2025 •

edited

Loading

burnout87 commented Jan 15, 2025 •

edited

Loading