@shaneahmed shaneahmed commented Sep 20, 2024

Summary of Changes

Major Additions

  • Dask Integration:

    • Added dask as a dependency and integrated Dask arrays and lazy computation throughout the engine and patch predictor code.
    • Added Dask-based merging, chunking, and memory-aware processing for large images and WSIs.
  • Zarr Output Support:

    • Added support for saving model predictions and intermediate results directly to Zarr format.
    • New CLI options and internal logic for Zarr output, including memory thresholding and chunked writes.
  • SemanticSegmentor Engine:

    • Added a new SemanticSegmentor engine with Dask/Zarr support and new test coverage (test_semantic_segmentor.py).
    • Added CLI entrypoint for semantic_segmentor and removed the old semantic_segment CLI.
  • Enhanced CLI and Config:

    • Added CLI options for memory threshold, unified worker options, and improved mask handling.
    • Updated YAML configs and sample data for new models and test images.
  • Utilities and Validation:

    • Added utility functions for minimal dtype casting, patch/stride validation, and improved error handling (e.g., DimensionMismatchError).
    • Improved annotation store conversion for Dask arrays and Zarr-backed outputs.
  • Changes to kwargs:

    • Added memory-threshold.
    • Unified num-loader-workers and num-postproc-workers into num-workers.
    • Removed cache_mode, because caching is now handled automatically.
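
For anyone updating existing calls, the keyword changes listed above can be sketched as a small translation step. This is only an illustration: `migrate_engine_kwargs` is a hypothetical helper, not part of TIAToolbox, the underscore spellings of the option names are assumed, and taking the larger of the two old worker counts is an arbitrary choice made for the example.

```python
def migrate_engine_kwargs(old_kwargs):
    """Translate pre-refactor keyword arguments to the unified names.

    Hypothetical helper for illustration only; not a TIAToolbox function.
    """
    new_kwargs = dict(old_kwargs)  # copy so the caller's dict is left untouched

    # cache_mode was removed because caching is handled automatically.
    new_kwargs.pop("cache_mode", None)

    # The two separate worker counts collapse into a single num_workers.
    loader = new_kwargs.pop("num_loader_workers", None)
    postproc = new_kwargs.pop("num_postproc_workers", None)
    old_counts = [n for n in (loader, postproc) if n is not None]
    if old_counts and "num_workers" not in new_kwargs:
        # Assumption: keep the larger of the two old values.
        new_kwargs["num_workers"] = max(old_counts)

    return new_kwargs


legacy = {
    "num_loader_workers": 4,
    "num_postproc_workers": 2,
    "cache_mode": True,
    "batch_size": 8,
}
migrated = migrate_engine_kwargs(legacy)
print(migrated)
```

Unrelated options such as `batch_size` pass through unchanged, and the original dictionary is not modified.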

Major Removals/Refactors

  • Removed Old CLI and Redundant Code:

    • Deleted the old semantic_segment.py CLI and replaced it with semantic_segmentor.py.
    • Removed legacy cache mode and patch prediction Zarr store tests.
  • Refactored Model and Dataset APIs:

    • Unified and simplified model inference APIs to always return arrays (not dicts) for batch outputs.
    • Refactored dataset classes to enforce patch shape validation and remove legacy “mode” logic.
  • Test Cleanup:

    • Removed or updated tests that relied on old APIs or cache mode.
    • Refactored test assertions for new output types and Dask array handling.
  • API Consistency:

    • Standardized function and argument names across engines, CLI, and utility modules.
    • Updated docstrings and type hints for clarity and consistency.
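
The patch shape validation mentioned above might look roughly like the following. It is a minimal sketch, not the code in this PR: `validate_patch_shape` is a hypothetical name, and `DimensionMismatchError` is redefined locally as a stand-in, so its constructor may differ from the class in tiatoolbox/utils/exceptions.py.

```python
class DimensionMismatchError(Exception):
    """Local stand-in for the exception named in this PR (not imported)."""


def validate_patch_shape(patch_shape, expected_shape):
    """Return the patch shape as a tuple, or raise if it is not the expected one.

    Hypothetical helper for illustration only.
    """
    patch_shape = tuple(int(v) for v in patch_shape)
    expected_shape = tuple(int(v) for v in expected_shape)
    if patch_shape != expected_shape:
        msg = f"Expected patch shape {expected_shape}, got {patch_shape}."
        raise DimensionMismatchError(msg)
    return patch_shape


# A matching shape passes through unchanged.
print(validate_patch_shape([224, 224, 3], (224, 224, 3)))

# A mismatched shape fails early instead of surfacing later inside the model.
try:
    validate_patch_shape((256, 224, 3), (224, 224, 3))
except DimensionMismatchError as err:
    print(f"Rejected: {err}")
```

Failing at dataset construction keeps shape errors close to their cause, which is the usual motivation for enforcing this kind of check.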

Notable File Changes

  • New:

    • tiatoolbox/cli/semantic_segmentor.py
    • tests/engines/test_semantic_segmentor.py
  • Removed:

    • tiatoolbox/cli/semantic_segment.py
    • Old cache mode and patch Zarr store tests
  • Heavily Modified:

    • engine_abc.py, patch_predictor.py, semantic_segmentor.py
    • CLI modules and test suites
    • Dataset and utility modules for Dask/Zarr compatibility

Impact

  • Enables scalable, parallel, and memory-efficient inference and output saving for large images.
  • Simplifies downstream analysis by supporting Zarr as a native output format.
  • Lays the groundwork for further Dask-based optimizations in TIAToolbox.
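
One way to picture the memory-aware behaviour is as a size check before results are materialised. The sketch below is an assumption about the general idea, not the engine's actual logic: `estimate_nbytes` and `should_write_to_disk` are hypothetical names, and reading the memory threshold as a percentage of available memory is assumed for the example.

```python
import math


def estimate_nbytes(shape, itemsize):
    """Bytes needed to hold a dense array of the given shape and element size."""
    return math.prod(shape) * itemsize


def should_write_to_disk(shape, itemsize, available_bytes, memory_threshold=80):
    """Decide whether an output is too large to keep in memory.

    memory_threshold is treated here as a percentage of available memory
    (an assumption for this sketch).
    """
    limit = available_bytes * memory_threshold / 100
    return estimate_nbytes(shape, itemsize) > limit


available = 16 * 1024**3  # pretend 16 GiB of memory is free

# A two-class float32 probability map for a 100,000 x 100,000 pixel WSI.
wsi_shape = (100_000, 100_000, 2)
print(should_write_to_disk(wsi_shape, 4, available))

# The same map for a single 1024 x 1024 tile.
tile_shape = (1024, 1024, 2)
print(should_write_to_disk(tile_shape, 4, available))
```

In this sketch the whole-slide output needs about 80 GB and would be written to a chunked store such as Zarr, while the tile output fits comfortably in memory.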

@shaneahmed shaneahmed self-assigned this Sep 20, 2024
@shaneahmed shaneahmed added the enhancement New feature or request label Sep 20, 2024
@shaneahmed shaneahmed added this to the Release v2.0.0 milestone Sep 20, 2024

codecov bot commented Sep 20, 2024

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.08%. Comparing base (283bd22) to head (7d363e8).

Additional details and impacted files
@@                    Coverage Diff                     @@
##           dev-define-engines-abc     #866      +/-   ##
==========================================================
+ Coverage                   91.77%   95.08%   +3.30%     
==========================================================
  Files                          73       73              
  Lines                        9354     9215     -139     
  Branches                     1224     1206      -18     
==========================================================
+ Hits                         8585     8762     +177     
+ Misses                        756      437     -319     
- Partials                       13       16       +3     


@shaneahmed shaneahmed marked this pull request as draft February 5, 2025 16:27
- Use `input_resolutions` instead of resolution to make engine outputs compatible with ioconfig.
- Uses input resolutions as a list of dictionaries, each holding units and resolution.
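
As an illustration of the list-of-dictionaries form described in the commit message above, a single units/resolution pair could be wrapped as follows. `to_input_resolutions` is a hypothetical helper written for this example; only the `units` and `resolution` keys come from the commit message.

```python
def to_input_resolutions(resolution, units):
    """Wrap one or more resolutions into a list of {"units", "resolution"} dicts.

    Hypothetical helper for illustration only.
    """
    if isinstance(resolution, (list, tuple)):
        resolutions = list(resolution)
    else:
        resolutions = [resolution]
    return [{"units": units, "resolution": float(r)} for r in resolutions]


print(to_input_resolutions(0.5, "mpp"))
# [{'units': 'mpp', 'resolution': 0.5}]
```

Keeping one dictionary per input makes the same structure usable for models that read several inputs at different resolutions.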
…mentor

# Conflicts:
#	tests/engines/test_engine_abc.py
#	tests/engines/test_patch_predictor.py
#	tiatoolbox/models/engine/engine_abc.py
#	tiatoolbox/models/engine/io_config.py
#	tiatoolbox/models/engine/patch_predictor.py
@Jiaqi-Lv Jiaqi-Lv requested a review from Copilot August 28, 2025 15:25

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request implements a comprehensive refactor of the TIAToolbox engine system, introducing a new abstract base class EngineABC and implementing SemanticSegmentor as an extension of the PatchPredictor. The refactor modernizes the codebase with improved memory management, Dask array integration, and better separation of concerns.

Key changes include:

  • New EngineABC base class providing a unified interface for deep learning engines
  • Complete rewrite of SemanticSegmentor, which extends PatchPredictor with WSI-specific functionality
  • Integration of Dask arrays for memory-efficient processing and caching
  • Enhanced error handling and validation with new exception types
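
The class relationship in the overview can be sketched with Python's `abc` module. This is a toy under stated assumptions, not the classes in this PR: every class and method name below (`EngineSketchABC`, `infer_batch`, `post_process`, `run`) is hypothetical, and the "model" is any callable applied to one patch.

```python
from abc import ABC, abstractmethod


class EngineSketchABC(ABC):
    """Toy base class: shared setup and a run loop; subclasses fill in the steps."""

    def __init__(self, model, batch_size=8, num_workers=0):
        self.model = model
        self.batch_size = batch_size
        self.num_workers = num_workers

    @abstractmethod
    def infer_batch(self, batch):
        """Run the model on one batch of patches."""

    @abstractmethod
    def post_process(self, raw_output):
        """Turn raw model output into the engine's result type."""

    def run(self, batches):
        return [self.post_process(self.infer_batch(b)) for b in batches]


class PatchPredictorSketch(EngineSketchABC):
    def infer_batch(self, batch):
        return [self.model(patch) for patch in batch]

    def post_process(self, raw_output):
        return raw_output


class SemanticSegmentorSketch(PatchPredictorSketch):
    def run(self, batches):
        per_batch = super().run(batches)
        # Extra step for the segmentor: merge batch outputs into one ordered list.
        return [out for batch_out in per_batch for out in batch_out]


batches = [[[1, 2], [3, 4]], [[5, 6]]]
predictor = PatchPredictorSketch(model=max)
segmentor = SemanticSegmentorSketch(model=max)
print(predictor.run(batches))
print(segmentor.run(batches))
```

The segmentor reuses the predictor's inference step and only adds a merge stage, which mirrors the "extension of PatchPredictor" wording in the overview.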

Reviewed Changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 4 comments.

File | Description
--- | ---
tiatoolbox/utils/transforms.py | Added int type annotation to interpolation parameter
tiatoolbox/utils/misc.py | Enhanced utility functions with Dask integration, memory optimization, and new helper functions
tiatoolbox/utils/exceptions.py | Added DimensionMismatchError exception class
tiatoolbox/models/models_abc.py | Updated abstract method signatures for improved type safety
tiatoolbox/models/engine/semantic_segmentor.py | Complete rewrite implementing new EngineABC architecture
tiatoolbox/models/engine/patch_predictor.py | Refactored to extend EngineABC with simplified interface
tiatoolbox/models/engine/engine_abc.py | New abstract base class for all TIAToolbox engines
tiatoolbox/models/dataset/dataset_abc.py | Enhanced dataset classes with output location tracking and validation
Comments suppressed due to low confidence (1)

tiatoolbox/models/engine/semantic_segmentor.py:535

  • Similar to the previous issue, self.dataloader should be dataloader in the else clause on the following line.
                canvas, count, canvas_zarr, count_zarr
