-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor image/stamp workflow #424
base: main
Are you sure you want to change the base?
Conversation
e58012d
to
c145f0c
Compare
c145f0c
to
1d7fefc
Compare
acbac83
to
9f49df7
Compare
The latest commit aligns photon objects in the photon pooling pipeline with their positions in the original pipeline. |
Almost there, but blocked by GalSim-developers/GalSim#1284. |
daae77c
to
7290a11
Compare
bdaadae
to
b786cb0
Compare
This is finally open for review -- all comments welcome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks William. I have a bunch of comments, but this is definitely a good start. If it would be helpful to talk through any of this on a zoom call, let me know.
config/imsim-config.yaml
Outdated
# If using photon pooling, FFT and photon objects are batched separately. | ||
# There will probably be few enough FFT objects to do in a single batch. | ||
# Bright photon objects are treated in all photon batches, but at 1/nbatch_photon of | ||
# their flux. Faint photonobjects are placed at their full flux in random batches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# their flux. Faint photonobjects are placed at their full flux in random batches. | |
# their flux. Faint photon objects are placed at their full flux in random batches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting this. Fixing in the next commit.
config/imsim-config.yaml
Outdated
# These parameters only affect LSST_PhotonPoolingImage images. | ||
# The batch numbers can be changed if desired. | ||
nbatch_fft: 1 | ||
nbatch_photon: 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be the same parameter as the simple nbatch. We can just document that with photon pooling, nbatch refers only to the photon-shooting objects. Then nbatch_fft can be an optional parameter that defaults to 1. (Maybe not even worth mentioning here if most users wouldn't change it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this. Nice to keep the number of parameters down. nbatch_photon
will be no more, and the photon pooling will be controlled directly with nbatch
.
imsim/lsst_image.py
Outdated
|
||
class LSST_ImageBuilder(LSST_ImageBuilderBase): | ||
"""This is mostly the same as the GalSim "Scattered" image type. | ||
So far the only change is in the sky background image.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc is probably obsolete. There are quite a few differences now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to change this. Can you suggest something more suitable?
imsim/lsst_image.py
Outdated
'x' : { 'type' : 'Random' , 'min' : xmin , 'max' : xmax }, | ||
'y' : { 'type' : 'Random' , 'min' : ymin , 'max' : ymax } | ||
} | ||
set_config_image_pos(config, base) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than making this a free function in a different file, this should probably be a private method of the base class, which both subclasses can call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, doing this.
imsim/lsst_image.py
Outdated
base["stamp"]["photon_ops"] = [] | ||
photon_ops = base["stamp"]["photon_ops"] | ||
shift_op = {'type': 'Shift'} | ||
shift_index = next((index for (index, d) in enumerate(photon_ops) if d["type"] == "Shift"), None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this implementation. I get why something like this is now required. But I think I'd rather have RubinOptics and RubinDiffractionOptics just have an optional parameter to apply this shift or not. Then when running this class, base[image_pos
] is defined and they can apply the shift based on that.
In the photon pooling class, once we've pooled all the photons, we should set the object-specific values (image_pos, sky_pos, maybe others) that apply to a particular object to None
. Then the RubinOptics and RubinDiffractionOptics photon ops would see that the image_pos isn't defined, so not apply any shift.
This feels to me more elegant and possibly more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should now be done -- see RubinOptics and RubinDIffractionOptics. If an optional shift_photons = True
is passed, the shift is performed.
tests/test_image.py
Outdated
}, | ||
"psf": {"type": "Convolve", "items": [{"type": "Gaussian", "fwhm": 0.3}]}, | ||
"stamp": { | ||
"type": "LSST_Silicon", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the new stamp type when image_type is LSST_PhotonPoolingImage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, photon pooling now needs to use the new stamp type rather messing with base
. The test sets the correct stamp type.
tests/test_image.py
Outdated
|
||
def test_lsst_image_photon_pooling_pipeline(): | ||
"""Check that LSST_PhotonPoolingImage batches objects as expected and renders objects at the correct positions.""" | ||
run_lsst_image("LSST_PhotonPoolingImage") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we probably want a few additional tests in this pipeline. E.g. that the FFT and faint objects only got drawn once. And the others got drawn nbatch times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've written a set of tests for the partitioning and batching process. As mentioned above, it also confirms that total flux is conserved for photon objects,
tests/test_image.py
Outdated
help="Similar to -k of pytest / unittest: restrict tests to tests starting with the specified prefix.", | ||
default="test_", | ||
) | ||
args = parser.parse_args() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, when working on stuff, I usually just put the name of the test function I'm working on at the start of this name == '__main__'
block, followed by quit(). Then remove it before committing. Seems easier than rolling all of this argparse stuff, but 🤷.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also done in test_diffraction_fft.py, but I'm happy to change it if you'd like me to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if you find it useful, go ahead and keep it. But if you were just copying from that file, I don't think this is necessary to keep.
tests/test_photon_ops.py
Outdated
expected_x_pic_center = 564.5 | ||
expected_y_pic_center = -1431.4 | ||
expected_x_pic_center = -989.5971378245167 | ||
expected_y_pic_center = -3840.3512012842157 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these (and subsequent tests) changing so much? This isn't close.
Regression tests are supposed to make sure we don't change the functionality of existing code. I know the implementation of these photon ops changed, but I think we want the tests to remain the same or at least very very close. What happened here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still a big issue. I'm 99% convinced the error is in the original test itself. My current best guess is that with the old code it may not have been internally self-consistent around the WCS used, while the new XyToV
should enforce self-consistency. But it still needs to be looked at.
@@ -52,7 +53,8 @@ def create_test_config(): | |||
}], | |||
}, | |||
"bandpass": galsim.Bandpass('LSST_r.dat', wave_type='nm'), | |||
"wcs": galsim.PixelScale(0.2), | |||
"wcs": wcs, | |||
"current_image": galsim.Image(1024, 1024, wcs=wcs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this change required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to the wcs
field isn't actually necessary, it looks. The addition of current_image
is needed for deserialize_rubin_optics
, deserialize_rubin_diffraction_optics
and deserialize_rubin_diffraction
which uses it to get img_wcs
, required by the new version of XyToV
.
The latest set of commits should address most of the comments above. I'll respond to those individually. A couple do still need to be looked at. |
Are you waiting on me for anything here? I thought you still had some more comments to address, so I was waiting for that to be done. But if it would be helpful for me to review this again at this point, let me know. |
Not just yet -- I have a couple more commits on the way with fixes and tests for the batching. I'll push these later today or tomorrow, and then I think that should be everything ready for review again. |
46461fb
to
eeb67dc
Compare
I think this is ready for re-review now, though CI is failing with an error in |
I think there is actually one more commit incoming. I've realised that in edge cases where nominally bright photon objects' |
OK, that should be it -- code open for review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks William. I think we're close.
I'd still like to have a separate PR that just changes the tests of the expected centers. I think the way to do that is to change the test to mostly your new code, but when calling (the old) XyToV, use sky_pos = img_wcs.toWorld(image_pos)
and local_wcs = img_wcs.local(image_pos)
. I'm not sure what image_pos is exactly, but hopefully you can follow Geri's code to see what the implicit image_pos is there.
imsim/photon_pooling.py
Outdated
offset_adjustment: a PositionD by which to offset each object | ||
""" | ||
for stamp in stamps: | ||
offset = offset_adjustment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should just call this parameter offset
from the start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Doing this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And now obsolete with the big simplification in offsetting the photons by stamp_center. I've removed the two static methods in photon_pooling.py dealing with the offset and it's now dealt with in stamp.py just using stamp_center.
imsim/photon_pooling.py
Outdated
Returns: | ||
images: Tuple of the output stamps. In normal PhotonPooling usage these will actually be | ||
PhotonArrays to be processed into images after they have been pooled following this step. | ||
current_vars: Tuple of variables for each stamp (noise etc). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be the current noise variance in each image. var=variance, not variable.
It's not really relevant for most imsim usage, but if you draw a RealGalaxy, and especially if you then whiten it, the image gets some noise from that, so this mechanism helps GalSim from adding more noise than requested when adding the rest of the image noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. Is the actual usage of current_vars
ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Just the doc string is wrong.
imsim/photon_pooling.py
Outdated
batches = [ | ||
[dataclasses.replace(obj, phot_flux=np.floor(obj.phot_flux / nbatch)) for obj in phot_objects] | ||
for _ in range(nbatch)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nifty (simpler?) way to get the total flux to come out right:
batches = [ | |
[dataclasses.replace(obj, phot_flux=np.floor(obj.phot_flux / nbatch)) for obj in phot_objects] | |
for _ in range(nbatch)] | |
batches = [ | |
[dataclasses.replace(obj, | |
phot_flux=(obj.phot_flux*(i+1))//nbatch - (obj.phot_flux*i)//nbatch) | |
) for obj in phot_objects] | |
for i in range(nbatch)] |
Then you don't need the block below this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice -- thanks! Putting this in now.
imsim/stamp.py
Outdated
def build_obj(stamp_config, base, logger): | ||
"""Precompute all data needed to determine the rendering mode of an | ||
object.""" | ||
stamp_type = stamp_config.get('type','LSST_Photons') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to have a default here. Rather we should make sure the user has correctly specified this type. So if stamp_config.get('type', None) != 'LSST_Photons':
... then raise some kind of explanatory errror.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks -- doing this now.
imsim/stamp.py
Outdated
|
||
max_flux_simple = config.get('max_flux_simple', 100) | ||
faint = self.nominal_flux < max_flux_simple | ||
bandpass = base['bandpass'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh! Thanks for spotting -- fixed.
imsim/stamp.py
Outdated
img_pos = base["image_pos"] | ||
obj_offset = base["stamp_offset"] | ||
image.photons.x = image.photons.x + img_pos.x - obj_offset.x | ||
image.photons.y = image.photons.y + img_pos.y - obj_offset.y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all this offset stuff could be simpler. Here you add img_pois - stamp_offset. Then later you add 0.5 for even sized images. So the net adjustment is
img_pos - stamp_offset + (0.5 if size%2==0)
I'm pretty sure this combination is equivalent to base["stamp_center"] based on the calculation of stamp_offset here. So I think you can just add stamp_center here and skip the later adjustments. Which honestly makes sense. These photon arrays are meant to be relative to the image center, which here is just the stamp center.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice call. I've just tested and this gives identical output. Making this change now.
imsim/photon_ops.py
Outdated
# positions the photons before they are pooled to be processed together. | ||
if self.shift_photons and self.image_pos is not None: | ||
photon_array.x += self.image_pos.x | ||
photon_array.y += self.image_pos.y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per my other comment, these should presumably also shift by stamp_center, not image_pos. Likely not a big difference, but might as well use the correct shift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also be done at the end of applyTo? e.g.
Lines 111 to 112 in a4c2d97
photon_array.x -= self.image_pos.x | |
photon_array.y -= self.image_pos.y |
main
as it is now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the last thing on my list. It means changing the API to RubinOptics and RubinDiffractionOptics. I think normal use is always through photon op registrations via the photon_op_type
decorator and deserialize_*
functions, which wouldn't need to change, but at least the tests in test_photon_ops.py do directly init the photon operators themselves. Is that ok to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we preserve the user-facing code (i.e. how the config file specifies things), then it's ok to make changes to the implementation, even if these are technically API changes. But perhaps @jchiang87 or @cwwalter might want to weigh in about how strictly we want to preserve backwards-compatibility of the Python classes/functions.
tests/test_image.py
Outdated
help="Similar to -k of pytest / unittest: restrict tests to tests starting with the specified prefix.", | ||
default="test_", | ||
) | ||
args = parser.parse_args() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if you find it useful, go ahead and keep it. But if you were just copying from that file, I don't think this is necessary to keep.
imsim/photon_ops.py
Outdated
|
||
The transform takes 2 vectors of shape (n,) and returns | ||
a column major vector of shape (n,3). | ||
""" | ||
|
||
def __init__(self, local_wcs, icrf_to_field, sky_pos): | ||
def __init__(self, local_wcs, icrf_to_field, img_wcs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this doesn't need local_wcs anymore. If we're breaking the API (with the change in the last parameter), might as well clean it up and remove local_wcs as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. This also leaves some of the photon_velocity
methods in RubinOptics
, RubinDiffraction
and RubinDiffractionOptics
with unnecessary local_wcs
and, now I look at them, also rng
. Would you prefer those left alone or removed as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well clean up all the obsolete API. If any of them seem like user-facing code, we should make this a deprecation rather than completely break old code. But I think most of these functions are internal, not user facing, so it should be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention this in the meeting! Yes, everything looks to be internal to the photon ops themselves. rng
can't be removed though, as it turns out, or it breaks RubinDiffractionOptics. Otherwise, tidied up as requested.
Assuming tests pass, that should be everything above. I've included changes to I'll move on to seeing if I can make the old photon ops tests match the new ones as suggested. |
I've just created PR #484 which corrects the photon op tests to be self-consistent. There were actually a couple of inconsistencies in the tests here too, but working through them now means everything should be in alignment pre- and post-photon pooling. I hope this means that once #484 is merged in, this can be, too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi William,
In comparing how this branch works wrt the code in main, I saw a few items that may need to be addressed.
# If using LSST_Image (non-photon pooling), nbatch is the overal number of batches in | ||
# which to draw the objects. | ||
# The default number of batches is 100, but you can change it if desired. | ||
nbatch: 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to clarify that nbatch
is also used by the photon pooling code, not just by LSST_Image
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks -- doing this now.
imsim/photon_pooling.py
Outdated
imview._shift(-center) # equiv. to setCenter(), but faster | ||
imview.wcs = PixelScale(1.0) | ||
if imview.dtype in (np.float32, np.float64): | ||
sensor.accumulate(photons, imview, imview.center) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there ever a situation where we would want to set resume=True
here, e.g., similar to this case in galsim?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure -- it looks like it can boost performance somewhat when using a SiliconSensor
by skipping an initial calculation in the accumulate
method. I don't think there's any disadvantage to using resume=True
, but it might be worthwhile discussing this further in tomorrow's call to be certain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making this change now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, though the tests pass it looks like my implementation has problems in practice as it's a different image view being used each time. I'm trying a fix now by creating the image view outside the loop over photon pools and reusing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed with the latest commit.
imsim/stamp.py
Outdated
image=image, | ||
wcs=base['wcs'], | ||
sensor=NullSensor(), # Prevent premature photon accumulation | ||
photon_ops=psfs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be
photon_ops=photon_ops,
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it should be photon_ops
, but the code above which creates the initial photon_ops
list from the config is incorrect -- thanks for noticing something's not right here. It should be contain the PSF and the BandpassRatio photon op (only if the SED needs to be reweighted), which need to be dealt with here. All the other photon ops are dealt with later in the pipeline and applied to the whole photon batch. I'm fixing this now.
save_photons=True) | ||
stamp_center = base['stamp_center'] | ||
image.photons.x = image.photons.x + stamp_center.x | ||
image.photons.y = image.photons.y + stamp_center.y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-photon pooling code sets
base['realized_flux'] = image.added_flux
at the corresponding point in the code so that those values appear in the output.truth
files. What should it be set to here so that the equivalent info is written out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a tricky one. Per object quantities are diluted with the use of photon pools. If we did the same thing here in the stamp, we'd firstly only be dealing with an object flux which is a fraction ~1/nbatch of the total flux, complicated by different pools having different integer values of the flux in order to conserve the total. Furthermore it might have been possible to track and sum the realised flux per object across the pools, but the accumulation now happens at the pool level. We can tell how much of the pool's flux is added to the image, but the information about which object contributed which photon is not retained at that point. I'm not sure if someone else has a suggestion or is aware of something that would help. I'll have a think about this but I think we might need to chat about this in tomorrow's call.
…mp is being used.
…ine. Photon pooling translates relative to full_image.center just before accumulation instead.
…he functions there its static methods.
…beginning. Previously missed. Ensure conservation of photon object fluxes during batching by adding the modulo to random batches.
…ST_ImageBuilderBase.
…than the (0,0) position.
…ther photon pooling tests.
Co-authored-by: Mike Jarvis <michael@jarvis.net>
…thods in photon_ops.py.
… and match the new results.
bf6cb43
to
72cc687
Compare
…r performance with SiliconSensors.
…ton objects across pools for an incident_flux column.
…ormance in photon accumulation.
No description provided.