Ground Truth and Annotation Score tweaks #1
Conversation
PaulHax commented Jun 12, 2024 (edited)
if image.size != transformed_img.size:
    # Resize so pixel-wise annotation similarity score works
    transformed_img = transformed_img.resize(image.size)
What impact does a downsample, with whatever filtering method PIL picks, have on the object detection model output? Can we ask the perturber to match our input image size?
It depends on the object detection model; it can be significant due to the change in the embeddings.
One thing to note is that we need to group the images by image size for batching to work.

> Can we ask the perturber to match our input image size?

In principle we want to test the perturbation's effect on the model, meaning we should not resize, so we can see how the downsampling affects the output.
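A minimal sketch of that batching constraint (the helper name and data shapes are hypothetical, not from the PR): batches only work when every image in a batch shares one tensor shape, so images get bucketed by size first.

```python
from collections import defaultdict

def group_by_size(images):
    """Group (image_id, PIL.Image) pairs by (width, height) so each
    batch fed to the model has a uniform tensor shape."""
    batches = defaultdict(list)
    for image_id, image in images:
        batches[image.size].append((image_id, image))  # PIL size is (w, h)
    return list(batches.values())
```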
In the case where the transformed image is a different resolution, should we resize to the original or keep the transformed image size? We could resize just the annotations before passing them to the scorer.
(Aside: I see we are resizing all images passed to the embedding plot, not the object detection model. https://github.com/Kitware/nrtk-explorer/blob/main/src/nrtk_explorer/app/embeddings.py#L130 )
Force-pushed from e54cdf8 to 573b474.
@vicentebolea This is worth a review now 🙌
Many changes look good! A few comments, questions, and requests. Before moving forward let's bring this to Seb|Brian to see if we want to follow this direction.
A few things:
- Computing all the annotations at once will decrease perf; maybe not a big deal, but something to keep in mind.
- This workflow of a PR to my fork may be complicated: what if I make a change to my branch? The rebasing can be difficult. I propose merging what I had in the original PR and then continuing this conversation in a new PR in the main repo.
SourceImageId = str
TransformedImageId = str
I'd prefer to use primitive types when possible (like a str); it simplifies the code. If we need to parametrize types we can do it later when the time comes.
This is a Domain Modeling trick: https://www.thoughtworks.com/en-us/insights/blog/microservices/domain-modeling-algebraic-data-types-pt2
It's useful because passing a DatasetId to image_id_to_result_id would be a semantic problem. In TypeScript you can enforce this with "branded" types, but I don't know how to do that with MyPy. But we don't have to do this.
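For what it's worth, Python's typing.NewType gets close to TypeScript's branded types: mypy treats each NewType as distinct even though it's a plain str at runtime. A minimal sketch with hypothetical names:

```python
from typing import NewType

SourceImageId = NewType("SourceImageId", str)
DatasetId = NewType("DatasetId", str)

def image_id_to_result_id(image_id: SourceImageId) -> str:
    return "result-" + image_id  # a NewType is just its base type at runtime

image_id = SourceImageId("img-001")
dataset_id = DatasetId("ds-001")

image_id_to_result_id(image_id)    # OK
image_id_to_result_id(dataset_id)  # mypy error: incompatible type "DatasetId"
```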
Annotation = dict  # in COCO format
Annotations = list[Annotation]
ImageId = str
AnnotatedImage = tuple[ImageId, Annotations]
AnnotatedImages = list[AnnotatedImage]
same here
This is where it is even more important to document data structures with domain types. Consider one of the nrtk toolkit's score function arguments: Sequence[Sequence[Tuple[AxisAlignedBoundingBox, Dict[Hashable, float]]]]
What should I put in the Hashable part of the Dict? Category ID, category name, my own random ID?
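To illustrate the point, hypothetical aliases that would give each position in that signature a domain name (assuming AxisAlignedBoundingBox is the toolkit's bounding-box type; none of these names are in nrtk itself):

```python
from typing import Dict, Hashable, Sequence, Tuple

# Hypothetical domain aliases documenting the nrtk score() argument above.
CategoryId = Hashable                         # the mystery Dict key, now named
ClassScores = Dict[CategoryId, float]         # per-category confidence
Detection = Tuple["AxisAlignedBoundingBox", ClassScores]
ImageDetections = Sequence[Detection]         # all detections in one image
ScoreInput = Sequence[ImageDetections]        # one entry per image
```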
)
output = [find_prediction(id) for id in image_ids]
# mypy wrongly thinks output's type is list[list[tuple[str, dict]]]
return output  # type: ignore
This defeats the point of defining the AnnotatedImages type.
I think outside the function its return signature is still AnnotatedImages; the ignore just gets around what I believe is a MyPy bug.
# order output by paths order
find_prediction = lambda id: next(
    prediction for prediction in predictions if prediction[0] == id
)
output = [find_prediction(id) for id in image_ids]
This change is a refactor; however, I do not see that it simplifies anything or is a clear perf improvement. In fact it slightly obfuscates by adding a level of indirection.
Looks like standard functional programming to me. If we don't like that, then we should be flattening our 2D lists with two for loops rather than reduce. The old code read like a compare-all-to-all algorithm with its double for loop.
I'll change this to make an ID to Prediction map, then a reorder pass.
Decided to just return the ID-to-predictions dict and tweak the downstream code.
predictions_by_image_id = {
    image_id: predictions
    for batch in predictions_in_batches
    for image_id, predictions in batch
}
return predictions_by_image_id
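A hypothetical sketch of the downstream tweak: callers that still need list order can rebuild it from the dict (the helper name is illustrative, not from the PR):

```python
def order_predictions(predictions_by_image_id, image_ids):
    """Reorder the id -> predictions dict back into the ids' order."""
    return [predictions_by_image_id[image_id] for image_id in image_ids]

# e.g. order_predictions({"b": [2], "a": [1]}, ["a", "b"]) -> [[1], [2]]
```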
> Looks like standard functional programming to me. If we don't like that, then we should be flattening our 2D lists with two for loops rather than reduce.
My comment was about readability, which is of course a subjective topic. I find that a lambda defined lines above takes a couple of seconds more to make sense of than a conditional inside the for loop.
> Decided to just return the ID-to-predictions dict and tweak the downstream code.
@PaulHax The idea is that we pass a list of ids and are returned a list of the annotations for each of those ids, in the order of the ids.
Let's take this refactor to another PR where we can discuss it.
src/nrtk_explorer/library/dataset.py (outdated)
@@ -24,3 +25,20 @@ class Dataset(TypedDict):
    categories: List[DatasetCategory]
    images: List[DatasetImage]
    annotations: List[DatasetAnnotation]

def loadDataset(path: str) -> Dataset:
PEP 8 naming for functions (e.g. load_dataset rather than loadDataset).
fixed 🧹
    return scores

actual, predicted, dataset_ids = zip(*has_predictions)
score_output = ClassAgnosticPixelwiseIoUScorer().score(actual, predicted)
Why move to ClassAgnosticPixelwiseIoUScorer()?
Because the COCO scorer errored when the model predicts a category that is not in the JSON.
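For context, a class-agnostic pixel-wise IoU ignores category labels entirely, which is why that failure mode goes away. A rough sketch of the idea (not the actual nrtk scorer; boxes are assumed to be integer (x0, y0, x1, y1) tuples):

```python
import numpy as np

def pixelwise_iou(actual_boxes, predicted_boxes, shape):
    """Class-agnostic pixel-wise IoU: rasterize all boxes into binary
    masks (category labels ignored), then intersection over union."""
    def rasterize(boxes):
        mask = np.zeros(shape, dtype=bool)
        for x0, y0, x1, y1 in boxes:
            mask[y0:y1, x0:x1] = True
        return mask

    actual = rasterize(actual_boxes)
    predicted = rasterize(predicted_boxes)
    union = np.logical_or(actual, predicted).sum()
    if union == 0:
        return 1.0  # both empty: treat as perfect agreement
    return np.logical_and(actual, predicted).sum() / union
```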
src/nrtk_explorer/library/dataset.py (outdated)
datasetJson: Dataset = {"categories": [], "images": [], "annotations": []}
_datasetPath: str = ""
PEP 8 naming here too, and I would remove the leading underscore.
fixed 🧹
src/nrtk_explorer/library/dataset.py (outdated)
_datasetPath: str = ""

def getDataset(path: str) -> Dataset:
add a force arg here
What do you mean? string_path = str(path)?
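Reading the request as an option to bypass the module-level cache, a hypothetical sketch (reusing Dataset and the PEP 8 names from the diffs above; this is not the merged code):

```python
dataset_json: Dataset = {"categories": [], "images": [], "annotations": []}
dataset_path: str = ""

def get_dataset(path: str, force: bool = False) -> Dataset:
    """Return the cached dataset, reloading when forced or the path changes."""
    global dataset_json, dataset_path
    string_path = str(path)
    if force or string_path != dataset_path:
        dataset_json = load_dataset(string_path)  # load_dataset per the PEP 8 rename
        dataset_path = string_path
    return dataset_json
```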
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.2 to 4.5.3.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v4.5.3/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v4.5.3/packages/vite)

updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development

Signed-off-by: dependabot[bot] <support@github.com>
- Toggle ground truth/predictions annotations for source and transformation images.
- Compute COCO score (NRTK) instead of embeddings cartesian distance.
- Add unit tests for the newly added coco_utils.py module.
Avoids an error when the object detection model outputs a category that is not in the COCO JSON.
Annotation similarity scoring requires the transformed image to be the same size as the original image.
Was showing a repetitive category ID for object detection model annotations.
Shifting the tooltip when out of the window was not enough when the table size was small, and the tooltip would clip under the table footer.
and trame_utils
Always compute score against ground truth.
Superseded by Kitware#76