Ground Truth and Annotation Score tweaks #1
Conversation
PaulHax commented Jun 12, 2024 (edited)
if image.size != transformed_img.size:
    # Resize so pixel-wise annotation similarity score works
    transformed_img = transformed_img.resize(image.size)
What impact does a downsample, with whatever filtering method PIL picks, have on the object detection model output? Can we ask the perturber to match our input image size?
It depends on the object detection model; it can be significant due to the change in the embeddings.
One thing to note is that we need to group the images by image size for batching to work.

> Can we ask the perturber to match our input image size?

In principle we want to test the perturbation's effect on the model, meaning we should not resize, so we can see how the downsampling affects the output.
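A minimal sketch of that batching constraint (the helper name and data shapes are hypothetical, not from the PR): batches only work when every image in a batch shares one tensor shape, so images get bucketed by size first.

```python
from collections import defaultdict

def group_by_size(images):
    """Group (image_id, PIL.Image) pairs by (width, height) so each
    batch fed to the model has a uniform tensor shape."""
    batches = defaultdict(list)
    for image_id, image in images:
        batches[image.size].append((image_id, image))  # PIL size is (w, h)
    return list(batches.values())
```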
In the case where the transformed image is a different resolution, should we resize to the original or keep the transformed image size? We could resize just the annotations before passing them to the scorer.
(Aside: I see we are resizing all images passed to the embedding plot, not the object detection model. https://github.com/Kitware/nrtk-explorer/blob/main/src/nrtk_explorer/app/embeddings.py#L130 )
Force-pushed from e54cdf8 to 573b474.
@vicentebolea This is worth a review now 🙌
Many changes look good! A few comments, questions, and requests. Before moving forward let's bring this to Seb|Brian to see if we want to follow this direction.
A few things:
- Computing all the annotations at once will decrease perf; maybe not a big deal, but something to keep in mind.
- This workflow of a PR to my fork may be complicated: what if I make a change to my branch? The rebasing can be difficult. I propose merging what I had in the original PR and then continuing this conversation in a new PR in the main repo.
SourceImageId = str
TransformedImageId = str
I'd prefer to use primitive types when possible (like a str); it simplifies the code. If we need to parametrize types we can do it later when the time comes.
This is a Domain Modeling trick: https://www.thoughtworks.com/en-us/insights/blog/microservices/domain-modeling-algebraic-data-types-pt2
It's useful because passing a DatasetId to image_id_to_result_id would be a semantic problem. In TypeScript you can enforce this with "branded" types, but I don't know how to do that with MyPy. But we don't have to do this.
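For what it's worth, Python's typing.NewType gets close to TypeScript's branded types: mypy treats each NewType as distinct even though it's a plain str at runtime. A minimal sketch with hypothetical names:

```python
from typing import NewType

SourceImageId = NewType("SourceImageId", str)
DatasetId = NewType("DatasetId", str)

def image_id_to_result_id(image_id: SourceImageId) -> str:
    return "result-" + image_id  # a NewType is just its base type at runtime

image_id = SourceImageId("img-001")
dataset_id = DatasetId("ds-001")

image_id_to_result_id(image_id)    # OK
image_id_to_result_id(dataset_id)  # mypy error: incompatible type "DatasetId"
```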
Annotation = dict  # in COCO format
Annotations = list[Annotation]
ImageId = str
AnnotatedImage = tuple[ImageId, Annotations]
AnnotatedImages = list[AnnotatedImage]
same here
This is where it is even more important to document data structures with domain types. Consider one of the nrtk toolkit's score function arguments: Sequence[Sequence[Tuple[AxisAlignedBoundingBox, Dict[Hashable, float]]]]
What should I put in the Hashable part of the Dict? Category ID, category name, my own random ID?
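To illustrate the point, hypothetical aliases that would give each position in that signature a domain name (assuming AxisAlignedBoundingBox is the toolkit's bounding-box type; none of these names are in nrtk itself):

```python
from typing import Dict, Hashable, Sequence, Tuple

# Hypothetical domain aliases documenting the nrtk score() argument above.
CategoryId = Hashable                         # the mystery Dict key, now named
ClassScores = Dict[CategoryId, float]         # per-category confidence
Detection = Tuple["AxisAlignedBoundingBox", ClassScores]
ImageDetections = Sequence[Detection]         # all detections in one image
ScoreInput = Sequence[ImageDetections]        # one entry per image
```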
)
output = [find_prediction(id) for id in image_ids]
# mypy wrongly thinks output's type is list[list[tuple[str, dict]]]
return output  # type: ignore
This defeats the point of defining the AnnotatedImages type.
I think outside the function its return signature is still AnnotatedImages; the ignore just gets around what I believe is a MyPy bug.
# order output by paths order
find_prediction = lambda id: next(
    prediction for prediction in predictions if prediction[0] == id
)
output = [find_prediction(id) for id in image_ids]
This change is a refactor; however, I do not see that it simplifies anything or is a clear perf improvement. In fact it slightly obfuscates by adding a level of indirection.
Looks like standard functional programming to me. If we don't like that, then we should be flattening our 2D lists with two for loops rather than reduce. The old code read like a compare-all-to-all algorithm with its double for loop.
I'll change this to make an ID to Prediction map, then a reorder pass.
Decided to just return the ID-to-predictions dict and tweak the downstream code.
predictions_by_image_id = {
    image_id: predictions
    for batch in predictions_in_batches
    for image_id, predictions in batch
}
return predictions_by_image_id
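A hypothetical sketch of the downstream tweak: callers that still need list order can rebuild it from the dict (the helper name is illustrative, not from the PR):

```python
def order_predictions(predictions_by_image_id, image_ids):
    """Reorder the id -> predictions dict back into the ids' order."""
    return [predictions_by_image_id[image_id] for image_id in image_ids]

# e.g. order_predictions({"b": [2], "a": [1]}, ["a", "b"]) -> [[1], [2]]
```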
> Looks like standard functional programming to me. If we don't like that, then we should be flattening our 2D lists with two for loops rather than reduce.
My comment was about readability, which is of course a subjective topic. I find that a lambda defined lines above takes a couple of seconds more to make sense of than a conditional inside the for loop.
> Decided to just return the ID-to-predictions dict and tweak the downstream code.
@PaulHax The idea is that we pass a list of ids and are returned a list of the annotations for each of those ids, in the order of the ids.
Let's take this refactor to another PR where we can discuss it.
src/nrtk_explorer/library/dataset.py (outdated)
@@ -24,3 +25,20 @@ class Dataset(TypedDict):
    categories: List[DatasetCategory]
    images: List[DatasetImage]
    annotations: List[DatasetAnnotation]

def loadDataset(path: str) -> Dataset:
PEP 8 naming for functions (e.g. load_dataset rather than loadDataset).
fixed 🧹
    return scores

actual, predicted, dataset_ids = zip(*has_predictions)
score_output = ClassAgnosticPixelwiseIoUScorer().score(actual, predicted)
Why move to ClassAgnosticPixelwiseIoUScorer()?
Because the COCO scorer errored when the model predicts a category that is not in the JSON.
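For context, a class-agnostic pixel-wise IoU ignores category labels entirely, which is why that failure mode goes away. A rough sketch of the idea (not the actual nrtk scorer; boxes are assumed to be integer (x0, y0, x1, y1) tuples):

```python
import numpy as np

def pixelwise_iou(actual_boxes, predicted_boxes, shape):
    """Class-agnostic pixel-wise IoU: rasterize all boxes into binary
    masks (category labels ignored), then intersection over union."""
    def rasterize(boxes):
        mask = np.zeros(shape, dtype=bool)
        for x0, y0, x1, y1 in boxes:
            mask[y0:y1, x0:x1] = True
        return mask

    actual = rasterize(actual_boxes)
    predicted = rasterize(predicted_boxes)
    union = np.logical_or(actual, predicted).sum()
    if union == 0:
        return 1.0  # both empty: treat as perfect agreement
    return np.logical_and(actual, predicted).sum() / union
```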
src/nrtk_explorer/library/dataset.py (outdated)
datasetJson: Dataset = {"categories": [], "images": [], "annotations": []}
_datasetPath: str = ""
PEP 8 naming here too, and I would remove the leading underscore.
fixed 🧹
src/nrtk_explorer/library/dataset.py (outdated)
_datasetPath: str = ""

def getDataset(path: str) -> Dataset:
add a force arg here
What do you mean? string_path = str(path)?
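Reading the request as an option to bypass the module-level cache, a hypothetical sketch (reusing Dataset and the PEP 8 names from the diffs above; this is not the merged code):

```python
dataset_json: Dataset = {"categories": [], "images": [], "annotations": []}
dataset_path: str = ""

def get_dataset(path: str, force: bool = False) -> Dataset:
    """Return the cached dataset, reloading when forced or the path changes."""
    global dataset_json, dataset_path
    string_path = str(path)
    if force or string_path != dataset_path:
        dataset_json = load_dataset(string_path)  # load_dataset per the PEP 8 rename
        dataset_path = string_path
    return dataset_json
```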
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.2 to 4.5.3.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v4.5.3/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v4.5.3/packages/vite)

updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development

Signed-off-by: dependabot[bot] <support@github.com>
- Toggle ground truth/predictions annotations for source and transformation images.
- Compute COCO score (NRTK) instead of embeddings cartesian distance.
- Add unit tests for the newly added coco_utils.py module.
Avoids an error when the object detection model outputs a category that is not in the COCO JSON.
Annotation similarity scoring requires the transformed image to be the same size as the original image.
Was showing a repetitive category ID for object detection model annotations.
Shifting the tooltip when out of the window was not enough when the table size was small, and the tooltip would clip under the table footer.
and trame_utils
Always compute score against ground truth.
Superseded by Kitware#76