Optimize model fit to avoid allocations in hot loops #28

juliohm · 2025-03-06T18:47:12Z

Introduce fit! to update model's state in place.
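As a rough illustration of the idea, here is a minimal sketch of an allocating fit next to an in-place variant. The names `ToyState`, `toyfit` and `toyfit!` are hypothetical and are not the GeoStatsModels types or functions; the sketch only assumes that the fitted state owns buffers that can be overwritten.

```julia
using LinearAlgebra

# Hypothetical fitted state holding preallocated buffers
# (illustrative only; not the GeoStatsModels types).
struct ToyState
  lhs::Matrix{Float64}
  rhs::Vector{Float64}
end

# Allocating version: builds fresh arrays on every call.
toyfit(X, z) = ToyState(X' * X, X' * z)

# In-place version: overwrites the buffers of an existing state.
function toyfit!(state::ToyState, X, z)
  mul!(state.lhs, X', X)
  mul!(state.rhs, X', z)
  state
end

X = [1.0 2.0; 3.0 4.0; 5.0 6.0]
z = [1.0, 2.0, 3.0]

state = toyfit(X, z)     # allocate once, outside the hot loop
for _ in 1:1000
  toyfit!(state, X, z)   # reuse the same buffers on every iteration
end
```

Whether this pattern pays off in practice depends on how much of the per-iteration cost comes from allocation rather than from the computation itself, so it is worth benchmarking both variants.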

codecov-commenter · 2025-03-06T18:52:35Z

Codecov Report

Attention: Patch coverage is 96.73913% with 6 lines in your changes missing coverage. Please review.

Project coverage is 95.49%. Comparing base (164fb0c) to head (cecc6df).

Files with missing lines	Patch %	Lines
src/krig.jl	94.66%	4 Missing ⚠️
src/lwr.jl	96.66%	1 Missing ⚠️
src/poly.jl	97.72%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #28      +/-   ##
==========================================
+ Coverage   95.10%   95.49%   +0.39%     
==========================================
  Files          10       10              
  Lines         449      555     +106     
==========================================
+ Hits          427      530     +103     
- Misses         22       25       +3


juliohm · 2025-03-10T18:28:49Z

For some reason, using fit! in fitpredict leads to worse performance:

using GeoStatsModels
using GeoStatsFunctions
using GeoTables
using Meshes

using BenchmarkTools

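# three measurements of variable z at 2D locations
data = georef((; z=[1.0, 0.0, 1.0]), [(25.0, 25.0), (50.0, 75.0), (75.0, 50.0)])
# 100x100 grid of prediction locations
grid = CartesianGrid(100, 100)

# uncomment one model at a time
model = NN()
# model = IDW()
# model = LWR()
# model = Polynomial()
# model = Kriging(SphericalVariogram())

# interpolate with $ so that global-variable access is not part of the timing
@benchmark GeoStatsModels.fitpredict($model, $data, $grid, maxneighbors=3)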

The benchmark results are shown below:

## NN (main branch)

BenchmarkTools.Trial: 836 samples with 1 evaluation per sample.
 Range (min … max):  5.430 ms … 20.788 ms  ┊ GC (min … max): 0.00% … 71.01%
 Time  (median):     5.532 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.978 ms ±  1.326 ms  ┊ GC (mean ± σ):  6.46% ± 12.28%

  ██▆▂     ▁  ▁                                     ▁▁        
  ████▁▄▅▁███▅██▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁████▆▇▆▅ ▇
  5.43 ms      Histogram: log(frequency) by time     9.72 ms <

 Memory estimate: 5.19 MiB, allocs estimate: 170054.

## NN (pull request)

BenchmarkTools.Trial: 690 samples with 1 evaluation per sample.
 Range (min … max):  6.516 ms … 24.152 ms  ┊ GC (min … max): 0.00% … 69.98%
 Time  (median):     6.652 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.244 ms ±  1.821 ms  ┊ GC (mean ± σ):  7.23% ± 13.20%

  ▇█▄                                                   ▁     
  ███▇▄▄▁▆▆▄▄▇██▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅█▇██▇▅ ▇
  6.52 ms      Histogram: log(frequency) by time     12.5 ms <

 Memory estimate: 6.87 MiB, allocs estimate: 210061.

## IDW (main branch)

BenchmarkTools.Trial: 38 samples with 1 evaluation per sample.
 Range (min … max):  128.401 ms … 156.452 ms  ┊ GC (min … max): 0.00% … 10.89%
 Time  (median):     133.736 ms               ┊ GC (median):    3.30%
 Time  (mean ± σ):   134.215 ms ±   5.016 ms  ┊ GC (mean ± σ):  2.75% ±  2.14%

  ▂         ▂█▆                                                  
  ██▁▁▄▁▁▁▄▆███▆▄▄▁▄▁▁▁▁▁▄▁▁▄▁▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  128 ms           Histogram: frequency by time          156 ms <

 Memory estimate: 48.68 MiB, allocs estimate: 1190053.

## IDW (pull request)

BenchmarkTools.Trial: 38 samples with 1 evaluation per sample.
 Range (min … max):  128.775 ms … 158.051 ms  ┊ GC (min … max): 0.00% … 11.11%
 Time  (median):     134.404 ms               ┊ GC (median):    3.83%
 Time  (mean ± σ):   134.929 ms ±   5.238 ms  ┊ GC (mean ± σ):  2.73% ±  2.48%

  ▁▁         █    ▁                                              
  ██▁▄▁▁▆▁▁▁▆█▄▆▇▁█▁▄▁▁▁▄▆▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  129 ms           Histogram: frequency by time          158 ms <

 Memory estimate: 49.90 MiB, allocs estimate: 1230060.

## LWR (main branch)

BenchmarkTools.Trial: 198 samples with 1 evaluation per sample.
 Range (min … max):  23.039 ms … 45.601 ms  ┊ GC (min … max): 0.00% … 47.44%
 Time  (median):     23.465 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.259 ms ±  2.917 ms  ┊ GC (mean ± σ):  6.21% ±  8.34%

  ▅█▄                  ▄▅▂                                     
  ███▆▁▁▁▁▁▄▁▄▁▁▁▁▁▁▁▁████▇▁▄▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▄▄▁▁▁▁▁▁▁▄ ▄
  23 ms        Histogram: log(frequency) by time      35.2 ms <

 Memory estimate: 20.91 MiB, allocs estimate: 500053.

## LWR (pull request)

BenchmarkTools.Trial: 222 samples with 1 evaluation per sample.
 Range (min … max):  20.649 ms … 46.601 ms  ┊ GC (min … max): 0.00% … 47.50%
 Time  (median):     21.127 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   22.538 ms ±  2.912 ms  ┊ GC (mean ± σ):  5.50% ±  8.75%

  ▃█▆▁                   ▂▄▁                                   
  ████▁▁▁▁▁▁▅▁▅▁▄▇▆▁▁▇▁▁▄███▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▄ ▅
  20.6 ms      Histogram: log(frequency) by time      33.3 ms <

 Memory estimate: 16.94 MiB, allocs estimate: 410062.

## Polynomial (main branch)

BenchmarkTools.Trial: 147 samples with 1 evaluation per sample.
 Range (min … max):  30.486 ms … 53.098 ms  ┊ GC (min … max): 0.00% … 29.93%
 Time  (median):     35.736 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   34.071 ms ±  3.865 ms  ┊ GC (mean ± σ):  8.44% ±  8.47%

  █▁           ▁▇▄                                             
  ██▁▁▁▁▁▁▁▁▆▁▁███▁▁▄▁▆▆▄▁▁▄▄▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▄ ▄
  30.5 ms      Histogram: log(frequency) by time      52.8 ms <

 Memory estimate: 40.13 MiB, allocs estimate: 970053.

## Polynomial (pull request)

BenchmarkTools.Trial: 154 samples with 1 evaluation per sample.
 Range (min … max):  29.052 ms … 56.375 ms  ┊ GC (min … max): 0.00% … 38.39%
 Time  (median):     29.749 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   32.578 ms ±  4.006 ms  ┊ GC (mean ± σ):  8.23% ±  9.21%

  ▃▆█                   ▄▆▄                                    
  ███▁▁▁▁▁▁▅▁▁▅▁▅▁▅▁▁▁▁▅███▁▁▁▅▇▅▁▁▁▁▁▁▁▁▁▅▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▅
  29.1 ms      Histogram: log(frequency) by time      45.4 ms <

 Memory estimate: 37.54 MiB, allocs estimate: 970106.

## Kriging (main branch)

BenchmarkTools.Trial: 215 samples with 1 evaluation per sample.
 Range (min … max):  17.567 ms … 47.309 ms  ┊ GC (min … max):  0.00% … 54.07%
 Time  (median):     23.503 ms              ┊ GC (median):    23.45%
 Time  (mean ± σ):   23.304 ms ±  3.688 ms  ┊ GC (mean ± σ):  22.51% ± 10.79%

  ▂▃              ▆█▃                                          
  ███▁▁▁▄▁▁▁▁▁▁▁▁▁███▆▄▁▄▁▁▁▁▄▁▄▁▁▄█▅▄▄▁▁▁▄▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▄ ▅
  17.6 ms      Histogram: log(frequency) by time      37.8 ms <

 Memory estimate: 37.08 MiB, allocs estimate: 470053.

## Kriging (pull request)

BenchmarkTools.Trial: 198 samples with 1 evaluation per sample.
 Range (min … max):  19.321 ms … 55.111 ms  ┊ GC (min … max):  0.00% … 55.36%
 Time  (median):     26.136 ms              ┊ GC (median):    24.92%
 Time  (mean ± σ):   25.267 ms ±  4.427 ms  ┊ GC (mean ± σ):  21.97% ± 11.91%

  ▅▂             ▄█▄                                           
  ██▄▁▁▁▁▁▁▁▁▁▁▁▁███▅▄▁▄▄▁▁▁▁▁▄▁▁▁▁▁▁▄▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▄▁▁▁▁▁▁▄ ▄
  19.3 ms      Histogram: log(frequency) by time        44 ms <

 Memory estimate: 36.78 MiB, allocs estimate: 480087.
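To make the comparison easier to read, the median times reported above can be turned into relative changes. The snippet below is a small sketch; the helper `relchange` is a hypothetical name introduced here, and the numbers are copied from the benchmark output above. Positive values mean the pull request is slower than main.

```julia
# Median times in ms, copied from the benchmark output above: (main, pull request)
medians = [
  "NN" => (5.532, 6.652),
  "IDW" => (133.736, 134.404),
  "LWR" => (23.465, 21.127),
  "Polynomial" => (35.736, 29.749),
  "Kriging" => (23.503, 26.136),
]

# Relative change of the pull request with respect to main, in percent
relchange(base, pr) = 100 * (pr - base) / base

for (name, (base, pr)) in medians
  println(rpad(name, 12), round(relchange(base, pr), digits=1), "%")
end
```

By this measure the picture is mixed: NN and Kriging have higher medians in the pull request, LWR and Polynomial have lower medians, and IDW is roughly unchanged.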

@eliascarv do you know what might be causing this? Perhaps the change to mutable model states is the root of the issue?

EDIT: the change to mutable is not the cause. KrigingState, for instance, was already mutable before this PR.

eliascarv · 2025-03-11T12:39:27Z

@juliohm, can you check the type stability in main compared to this PR?

juliohm · 2025-03-11T12:59:24Z

@eliascarv the output of @code_warntype is all green for GeoStatsModels.fit, GeoStatsModels.fit! and GeoStatsModels.predict. The output for GeoStatsModels.fitpredict is not all green, because it returns a GeoTable.

Even the benchmarks for GeoStatsModels.fit are slower in this PR compared to the main branch.
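A scriptable complement to reading @code_warntype output is `Test.@inferred`, which throws when the actual return type of a call differs from the inferred one. The sketch below uses two hypothetical toy functions, `stable` and `unstable`, rather than the GeoStatsModels functions; the same macro could in principle wrap calls to fit, fit! and predict in a test suite.

```julia
using Test

# Hypothetical functions for illustration only.
stable(x) = x > 0 ? 1.0 : -1.0   # always returns Float64
unstable(x) = x > 0 ? 1 : 1.0    # returns Int or Float64 depending on the value

@inferred stable(2.0)            # passes silently and returns the result

# @inferred throws when the actual return type differs from the inferred one
err = try
  @inferred unstable(2.0)
  nothing
catch e
  e
end

println(err === nothing ? "unstable: no error" : "unstable: @inferred raised an error")
```

A check like this only covers return-type inference of the call it wraps, so it does not by itself explain an allocation regression.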

Store LHS and FHS in Kriging state

713c39f

juliohm added 20 commits March 6, 2025 16:36

Refactor krig.jl

829c055

More refactoring

40ad603

Finish initial fit! implementation

4feccfc

Fix type instabilities

d568c82

Refactor lwr.jl

f8f5205

Use fit! in fitpredict

179a3ce

Implement fit! for IDW

cab2732

Implement fit! for NN

c3c470f

Implement fit! for LWR

e1a7f62

Refactor poly.jl

e3ea4ac

Refactor poly.jl

94203e1

Fix lwr.jl

6187bdc

Refactor lwr.jl

ac24dc5

Refactor poly.jl

0131cad

Implement fit! for Polynomial

1567d06

More fixes to Polynomial

fe5ec3a

More fixes to Polynomial

2de12ec

Add more tests for fit!

efcd2b9

Add more tests for fit!

61087b1

Add more tests for fitpredict

25e44b6

juliohm marked this pull request as ready for review March 10, 2025 13:27

Refactor krig.jl

cecc6df
