Releases: NannyML/nannyml

v0.13.0

14 Jan 22:41

Fixed

  • Fixed incorrect default thresholds in the docstrings for the univariate drift calculator. Thanks for the eagle-eyed reading @josecaloca! (#425)

Changed

  • Thorough revamp of our dependency version specifications. Dependencies are now less strict, making it easier to use NannyML as a dependency. Big credits to @davisthomas-aily and @canoadri for their contributions, thoughts and patience on this one. Much appreciated! (#433)
  • Added support for Python 3.12
  • Dropped support for Python 3.8

v0.12.1

06 Sep 21:35

Fixed

  • Fixed component filtering misbehaving for CBPE results. (#423)

v0.12.0

06 Sep 14:33
c03518b

Fixed

  • Fixed broken links in usage logging docs. Cheers once more to @NeoKish! (#417)
  • Fixed issues with runner type validation due to changes in Pydantic 2 behavior. (#421)
  • Fixed a typo in one of the plotting blueprint modules. Eagle eyes @nikml! (#418)

Added

  • Added multiclass support for estimated and realized performance metrics average_precision and business_value. (#409)
  • Added threshold value limits for multiclass metrics. (#411)

Changed

  • Made the dependencies required for database access optional. Big thanks to @Duncan-Hunter!
  • Improved denominator checks in CBPE base estimation functions. (#416)
  • Relaxed constraints for the rich dependency. (#422)

Removed

  • Dropped support for Python 3.7 as it was causing major issues with dependencies. (#410)

v0.11.0

19 Jul 21:12
2f47f84

Changed

  • Updated Pydantic to ^2.7.4, SQLModel to ^0.0.19. (#401)
  • Removed the drop_duplicates step from the DomainClassifier for a further speedup. (#402)
  • Reverted to previous working dependency configuration for matplotlib as the current one causes issues in conda. (#403)

Fixed

  • Added the DomainClassifier method for drift detection to the CLI, so it can now be run from there.
  • Fixed NaN handling for multiclass confusion matrix estimation in CBPE. (#400)
  • Fixed incorrect handling of columns marked as categorical in Wasserstein and Hellinger drift detection methods.
    The treat_as_categorical value was ignored. We've also added a treat_as_continuous option to explicitly mark columns as continuous.
    (#404)
  • Fixed an issue with multiclass AUROC calculation and estimation when not all classes are available in a
    reference chunk during fitting. (#405)

Added

  • Added a new data quality calculator to check if continuous values in analysis data are within the ranges
    encountered in the reference data. Big thanks to @jnesfield! Still needs some documentation...
    (#408)
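
The range check described above can be illustrated with a minimal sketch in plain Python. This is not the NannyML calculator or its API: the function names `fit_ranges` and `count_out_of_range` and the sample rows are hypothetical, and the real implementation may treat missing values and chunking differently.

```python
def fit_ranges(reference_rows, columns):
    """Record the (min, max) seen in reference data for each continuous column."""
    ranges = {}
    for col in columns:
        values = [row[col] for row in reference_rows if row[col] is not None]
        ranges[col] = (min(values), max(values))
    return ranges


def count_out_of_range(analysis_rows, ranges):
    """Count analysis values that fall outside the reference range, per column."""
    counts = {}
    for col, (low, high) in ranges.items():
        counts[col] = sum(
            1
            for row in analysis_rows
            if row[col] is not None and not (low <= row[col] <= high)
        )
    return counts


# Hypothetical data, for illustration only.
reference = [{"age": 21, "income": 1800.0}, {"age": 64, "income": 5200.0}]
analysis = [
    {"age": 19, "income": 2500.0},
    {"age": 40, "income": 6100.0},
    {"age": 70, "income": 3000.0},
]

ranges = fit_ranges(reference, ["age", "income"])
out_of_range = count_out_of_range(analysis, ranges)
```

Here two `age` values and one `income` value in the analysis rows lie outside the ranges learned from the reference rows.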

v0.10.7

07 Jun 12:28
a24ab81

Changed

  • Optimized summary stats and overall performance by avoiding unnecessary copy operations and index resets during chunking.
    (#390)
  • Optimized performance of nannyml.base.PerMetricPerColumnResult filter operations by adding a short-circuit path
    when only filtering on period. (#391)
  • Optimized performance of all data quality calculators by avoiding unnecessary evaluations and avoiding copy and index reset operations
    (#392)

Fixed

  • Fixed an issue in the Wasserstein "big data heuristic" where outliers caused the binning to trigger out-of-memory errors. Thanks @nikml!
    (#393)
  • Fixed a typo in the salary_range values of the synthetic car loan example dataset. 20K - 20K € is now 20K - 40K €.
    (#395)

v0.10.6

16 May 14:48

Changed

  • Make predictions optional for performance calculation. When not provided, only AUROC and average precision will be calculated. (#380)
  • Small DLE docs updates
  • Combed through and optimized the reconstruction error calculation with PCA resulting in a nice speedup. Cheers @nikml! (#385)
  • Updated summary stats value limits to be in line with the rest of the library. Changed from np.nan to None. (#387)

Fixed

  • Fixed a breaking issue in the sampling error calculation for the median summary statistic when there is only a single value for a column. (#377)
  • Drop identifier column from the documentation example for reconstruction error calculation with PCA. (#382)
  • Fix an issue where default threshold configurations would get changed upon setting custom thresholds. Bad mutables! (#386)
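
The "bad mutables" problem in the last item is a common Python pitfall, and a generic sketch of it follows. This is not NannyML's code: the `DEFAULTS` mapping, its values, and both function names are hypothetical, chosen only to show how updating a shared default object leaks custom settings into the defaults, and how copying first avoids it.

```python
import copy

# Hypothetical default configuration; names and values are illustrative only.
DEFAULTS = {"metric_a": 0.1, "metric_b": 0.2}


def thresholds_shared(custom, defaults):
    """Buggy variant: updates the defaults object itself."""
    thresholds = defaults  # same object, not a copy
    thresholds.update(custom)
    return thresholds


def thresholds_copied(custom, defaults):
    """Safe variant: works on a deep copy, leaving the defaults untouched."""
    thresholds = copy.deepcopy(defaults)
    thresholds.update(custom)
    return thresholds


buggy_defaults = dict(DEFAULTS)
thresholds_shared({"metric_a": 0.5}, buggy_defaults)
buggy_defaults_changed = buggy_defaults != DEFAULTS

safe_defaults = dict(DEFAULTS)
result = thresholds_copied({"metric_a": 0.5}, safe_defaults)
safe_defaults_changed = safe_defaults != DEFAULTS
```

With the shared variant the defaults end up carrying the custom value, while the copying variant returns the customized thresholds and leaves the defaults as they were.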

v0.10.5

08 Mar 13:17
7da83e7

Changed

  • Updated dependencies for Python 3.8 and up. (#375)

Added

  • Support for the average precision metric for binary classification in realized and estimated performance. (#374)

v0.10.4

04 Mar 15:44
17430fc

Changed

  • We've changed the default for the incomplete parameter in the SizeBasedChunker and CountBasedChunker
    from append to keep. This means that from now on, by default, you might have an additional
    "incomplete" final chunk. Previously these records would have been appended to the last "complete" chunk.
    This change was required for some internal developments, and we also felt it made more sense when looking at
    continuous monitoring (as the incomplete chunk will be filled up later as more data is appended). (#367)
  • We've renamed the Classifier for Drift Detection (CDD) to the more appropriate Domain Classifier. (#368)
  • Bumped the version of the pyarrow dependency to ^14.0.0 if you're running on Python 3.8 or up.
    Congrats on your first contribution here @amrit110, much appreciated!
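
The difference between the keep and append behaviors for the incomplete parameter can be sketched in plain Python. This is an illustration under stated assumptions, not the SizeBasedChunker itself: the function `size_based_chunks` and its signature are hypothetical, and only the two behaviors named in these notes are modeled.

```python
def size_based_chunks(records, chunk_size, incomplete="keep"):
    """Split records into chunks of chunk_size, handling leftover records.

    incomplete="keep":   leftovers form an additional, smaller final chunk.
    incomplete="append": leftovers are appended to the last complete chunk.
    """
    full_count = len(records) // chunk_size
    chunks = [
        records[i * chunk_size:(i + 1) * chunk_size] for i in range(full_count)
    ]
    leftover = records[full_count * chunk_size:]
    if not leftover:
        return chunks
    if incomplete == "keep":
        chunks.append(leftover)
    elif incomplete == "append":
        if chunks:
            chunks[-1] = chunks[-1] + leftover
        else:
            # No complete chunk exists, so the leftovers are the only chunk.
            chunks.append(leftover)
    else:
        raise ValueError(f"unsupported value for incomplete: {incomplete!r}")
    return chunks


records = list(range(10))
kept = size_based_chunks(records, chunk_size=4, incomplete="keep")
appended = size_based_chunks(records, chunk_size=4, incomplete="append")
```

With ten records and a chunk size of four, keep yields three chunks (the last holding two records), while append yields two chunks (the last holding six).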

Fixed

  • Continuous distribution plots will now be scaled per chunk, as opposed to globally. (#369)

v0.10.3

17 Feb 00:46

Fixed

  • Handle median summary stat calculation failing due to NaN values
  • Fix standard deviation summary stat sampling error calculation occasionally returning infinity (#363)
  • Fix plotting confidence bands when value gaps occur (#364)

Added

  • New multivariate drift detection method using a classifier and density ratio estimation.

v0.10.2

13 Feb 00:35
926b0a5

Changed

  • Removed p-value based thresholds for Chi2 univariate drift detection (#349)
  • Change default thresholds for univariate drift methods to standard deviation based thresholds.
  • Add summary stats support to the Runner and CLI (#353)
  • Add unique identifier columns to included datasets for better joining (#348)
  • Remove unused confidence_deviation properties in CBPE metrics (#357)
  • Improved error handling: failing metric calculation for a single chunk will no longer stop an entire calculator.
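
The improved error handling in the last item follows a familiar pattern: isolate each chunk's calculation so one failure does not abort the rest. The sketch below shows that pattern generically. It is not NannyML's implementation; `calculate_per_chunk`, the `mean` metric, and the choice to record NaN for a failed chunk are assumptions made for illustration.

```python
import logging
import math

logger = logging.getLogger(__name__)


def calculate_per_chunk(chunks, metric):
    """Apply metric to every chunk; a failing chunk yields NaN instead of raising."""
    results = []
    for index, chunk in enumerate(chunks):
        try:
            value = metric(chunk)
        except Exception as exc:
            # Log the failure and continue with the remaining chunks.
            logger.warning("metric failed for chunk %d: %s", index, exc)
            value = math.nan
        results.append(value)
    return results


def mean(chunk):
    # Raises ZeroDivisionError for an empty chunk.
    return sum(chunk) / len(chunk)


values = calculate_per_chunk([[1, 2, 3], [], [4, 6]], mean)
```

The empty middle chunk fails, but the first and last chunks still produce their results.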

Added

  • Add feature distribution calculators (#352)

Fixed

  • Fix join column settings for CLI (#356)
  • Fix crashes in UnseenValuesCalculator