Generalise package structure for a different underlying BP model #68

sophiemeakin · 2024-01-10T15:15:54Z

The current version of the package is for a very specific branching process model. The package could be generalised to work for any (reasonable) branching process - but this needs some thought about how to define outbreak_setup() and outbreak_step() for a new process, what properties of the process to track over time, as well as how to pass parameters of the new process in a more general way (discussed already in #65).

The text was updated successfully, but these errors were encountered:

sophiemeakin · 2024-01-11T11:15:20Z

For example, pepbp package

pearsonca · 2024-01-11T14:40:29Z

Hmm - so is the idea here to have

res <- scenario_sim(
  n = 10, # n is the stand name for number of samples
  parms = ringbp::parameters(), # the deSolve mispelling is widely used for inexplicable reasons 
  control = ringbp::control_opts() # gives defaults when empty
)

...become

res <- scenario_sim(
  n = 10, # n is the stand name for number of samples
  parms = ringbp::parameters(), # the deSolve mispelling is widely used for inexplicable reasons 
  control = ringbp::control_opts(), # gives defaults when empty
  model = ringbp::model() # gives standard model by default, but allows inserting a new step / draw function
)

where the problems you highlight with defining outbreak_step e.g. live inside that model function somewhere.

That seems potentially workable to me - I suspect the main other challenges here will be:

how to have the options / arbitrary model work together
how to ensure that the new measures associated with the model (e.g. pep doses administered?) get returned
possibly adapting some of the other tools to tolerate the extra data in results

On balance, seems like a good idea. Should be lots of shared infrastructure among BP-based models.

sophiemeakin · 2024-01-11T16:20:39Z

That's what I was thinking, yes.

Re your last two points (returning new measures associated with the model, and adapting other tools to tolerate the extra data), there could be two outputs from each scenario/iteration: one core output that is shared across all models (generation, infector, number of secondary cases), and a second with the model-specific outputs (e.g. for pepbp number of high- and low-risk contacts, number of PEP doses).

pearsonca · 2024-01-11T17:14:32Z

I'd guess the most-likely-to-make-this-happen next step is to provide (here?) focused snippets showing where pepbp had to diverge from ringbp to do what you needed. I imagine that development went roughly "copy-paste-modify" - highlight the modifications?

Then folks can suggest options for implementing a generic approach that would accommodate this specific example.

n.b. if you elect to put some snippets here, there should be useful formatting syntax ala https://stackoverflow.com/questions/36634252/how-to-diff-in-a-comment-on-github - doesn't seem to support side-by-side, alas.

somefunction <- function() {
...
same-same
- subtracted line
+ added line
same-same
...
}

adamkucharski · 2024-01-13T12:05:08Z

Agree would be nice to think about generalisations, especially if there are modular components that could be reused. In case helpful to inform plans, a couple of related efforts:

We did a couple of covid studies by developing a network model with ringbp processes as the 'engine' for transmission and delays. It was coded as a single model rather than being nicely modular, but there may be some links with aims of packages like pepbp and how to structure these.
The Ebola ring vaccination modelling that ringbp was partly based on had a multi-type branching process, with different offspring distributions for detected and missed cases (to represent superspreading outside known transmission clusters). This was hard-coded in quite an inflexible way, but sounds like there are some parallels with other desirable extensions to the basic BP approach.
We've had some discussion of tools for quick simulated linelist and contact datasets in a {simulist} issue here, so some of these approximations/limitation may also be relevant to any efforts to expand ringbp to simulate contacts as well as transmission

sbfnk · 2024-01-17T10:38:40Z

I think supporting custom models here is a really good idea - although it wouldn't add anything in terms of making wider functionality (say, PEP or network models) available to others if they would still live in separate repos. Potentially providing a custom interface here would even discourage people from sharing their models vs. forking something that already is a package and can be re-badged as another (see {pepbp} and {covidhm}) that is then ready to use by others.

The alternative option would be to provide some documentation on "how to spin off your own model by using a fork", bringing in any lessons learned in previous attempts / streamlining the package to make that as easy as possible.

Bisaloo · 2024-01-17T10:57:54Z

It's a tough problem.

The alternative option would be to provide some documentation on "how to spin off your own model by using a fork", bringing in any lessons learned in previous attempts / streamlining the package to make that as easy as possible.

In general and from an open-source ecosystem point of view, this is not great. Any general improvements or bug fixes that could be made in a specific fork may (=will likely) not be shared with the upstream repo, and will not benefit others. We end up with an extremely fragmented ecosystem where anyone that want to build upon existing work has to browse through a potentially large number of repos to maybe eventually find the info they need.

On the other hand, creating a general framework that can accommodate arbitrary models is very hard and I feel we may end up with either something unwieldy or something unflexible that doesn't address the need reported here.

🤔

pearsonca · 2024-01-17T11:57:39Z

The alternative option would be to provide some documentation on "how to spin off your own model by using a fork", bringing in any lessons learned in previous attempts / streamlining the package to make that as easy as possible.

In general and from an open-source ecosystem point of view, this is not great. Any general improvements or bug fixes that could be made in a specific fork may (=will likely) not be shared with the upstream repo, and will not benefit others. We end up with an extremely fragmented ecosystem where anyone that want to build upon existing work has to browse through a potentially large number of repos to maybe eventually find the info they need.

Agree - one solution is to make it easy to add new model plugin implementations to the repo, with a clear process for adding contributor credit. That's an update to the contributor guide, naming plan, and probably an approach that sends things to inst/. Upside being that people are then encouraged to promote citation of their paper + the package (which now contains their model).

On the other hand, creating a general framework that can accommodate arbitrary models is very hard and I feel we may end up with either something unwieldy or something unflexible that doesn't address the need reported here.

Possibly, but: seems worth trying and y'all have a concrete example to develop accommodations against.

adamkucharski · 2024-01-17T12:06:55Z

A version of this problem came up in a recent discussion with simulist (epiverse-trace/simulist#58), when we wanted to incorporate simulation of contacts as well as secondary cases into epichains. This is a relatively simple change to make to a branching process model (i.e. add a line of code to generate contacts, then another to draw the secondary cases from secondary attack rate, as well as book keeping for contacts and SAR as input). So it is a relatively small change that would generate an output (contacts) potentially useful to several people and could easily be turned off without slowing down code or disrupting existing implementations.

So perhaps a rough calculation for whether we add more complexity to base models in package vs encourage separate adaptions to specific use cases (and provide documentation on how this could be done with book keeping etc.) could be something like:

Decision weighting = (Wider usefulness of new functionality) / (Disruption to existing code/arguments to include this functionality)

pearsonca · 2024-01-17T12:51:47Z

A version of this problem came up in a recent discussion with simulist (epiverse-trace/simulist#58), when we wanted to incorporate simulation of contacts as well as secondary cases into epichains. This is a relatively simple change to make to a branching process model (i.e. add a line of code to generate contacts, then another to draw the secondary cases from secondary attack rate, as well as book keeping for contacts and SAR as input). So it is a relatively small change that would generate an output (contacts) potentially useful to several people and could easily be turned off without slowing down code or disrupting existing implementations.

So two examples to work against. Probaby three: https://github.com/SACEMA/COVID10k/ is bpmodels + offspring events - could have just written the contingent event generation as a model extension.

Decision weighting = (Wider usefulness of new functionality) / (Disruption to existing code/arguments to include this functionality)

Dunno about the denominator there (though my impression is that y'all view this package as needing more general interface overall, so seems mostly like "disruption to code" only), but a easy-to-use-and-extend package for a standard modeling concept seems obviously widely useful. BP are (or should be) standard introductory models, which remain useful for low-resolution estimates, but feels like mostly people re-invent the wheel every time.

jamesmbaazam · 2024-07-10T15:30:31Z

Josh and I had a chat with Dillon (@dcadam) today after his seminar based his paper on time-varying superspreading, which used ringbp to simulate a contact network for COVID-19 and SARS. It'd be nice to hear his perspective on this issue on ringbp customisation: how easy it was to use it out of the box, what would have made it more easily customized for his use case, etc.

dcadam · 2024-07-23T12:18:02Z

Hi James. For the network simulation in my paper, I actually used bcgov/epi.branch.sim which itself references ringbp. But this wasn't so easy to customise, but it may fit others' use cases. I can look more into the underlying code structure of ringbp compared to epi.branch.sim and where improvements could be found to make it more customisable as you ask.

adamkucharski mentioned this issue Mar 12, 2024

Document how to make simple, developer-side model edits epiverse-trace/epidemics#193

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generalise package structure for a different underlying BP model #68

Generalise package structure for a different underlying BP model #68

sophiemeakin commented Jan 10, 2024

sophiemeakin commented Jan 11, 2024

pearsonca commented Jan 11, 2024 •

edited

Loading

sophiemeakin commented Jan 11, 2024

pearsonca commented Jan 11, 2024 •

edited

Loading

adamkucharski commented Jan 13, 2024

sbfnk commented Jan 17, 2024

Bisaloo commented Jan 17, 2024

pearsonca commented Jan 17, 2024

adamkucharski commented Jan 17, 2024

pearsonca commented Jan 17, 2024

jamesmbaazam commented Jul 10, 2024

dcadam commented Jul 23, 2024

Generalise package structure for a different underlying BP model #68

Generalise package structure for a different underlying BP model #68

Comments

sophiemeakin commented Jan 10, 2024

sophiemeakin commented Jan 11, 2024

pearsonca commented Jan 11, 2024 • edited Loading

sophiemeakin commented Jan 11, 2024

pearsonca commented Jan 11, 2024 • edited Loading

adamkucharski commented Jan 13, 2024

sbfnk commented Jan 17, 2024

Bisaloo commented Jan 17, 2024

pearsonca commented Jan 17, 2024

adamkucharski commented Jan 17, 2024

pearsonca commented Jan 17, 2024

jamesmbaazam commented Jul 10, 2024

dcadam commented Jul 23, 2024

pearsonca commented Jan 11, 2024 •

edited

Loading

pearsonca commented Jan 11, 2024 •

edited

Loading