- All parallel backends now prevent nested parallelization, unless explicitly allowed, e.g. settings recognized by `parallelly::availableCores()` or set by the future `plan()`. Previously, this had to be implemented by each backend, but now it's handled automatically by the future framework.
- Added support for backend maintainers to specify "cleanup" hook functions on future strategies, which are called when switching future plan. These hook functions are specified via the optional `cleanup` attribute, cf. `attr(cluster, "cleanup")`.
- Size calculation of globals is now done using the much faster `parallelly::serializedSize()`.
- `resolved()` for `ClusterFuture`:s would produce `Error: 'inherits(future, "Future")' is not TRUE` instead of an intended, informative error message that the connection to the parallel worker is broken.
- Decreased the overhead of launching futures that occurred for future strategies that used a complex `workers` argument. For example, `plan(cluster, workers = cl)`, where `cl` is a `cluster` object, would come with an extra overhead, because the `workers` object was unnecessarily transferred to the cluster nodes.
- Now `plan(multisession, workers = I(n))`, and same for `cluster`, preserves the "AsIs" class attribute on the `workers` argument so that it is propagated to `parallelly::makeClusterPSOCK()`.
- Clarify that packages must not change any of the `future.*` options.
- `getExpression()` on 'cluster' future could under some circumstances call `local()` on the global search path rather than `base::local()` as intended. For example, if a package that exports its own `local()` function was attached, then that would be called instead, often leading to a hard-to-troubleshoot error.
- When a 'cluster' future fails to communicate with the parallel worker, it does a post-mortem analysis to figure out why, including inspecting whether the worker process is still alive or not. In previous versions, this only worked for workers running on the current machine. Starting with this version, it also attempts to check this for remote workers.
- If a 'multicore' future failed, because the parallel process crashed, the corresponding parallel-worker slot was never released. Now it is removed if it can confirm that the forked worker process is no longer alive.
- The 'multiprocess' strategy has now been fully removed. Please use 'multisession' (recommended) or 'multicore' instead.
- Add prototype of an internal event-logging framework for the purpose of profiling futures and their backends.
- Add option `future.globalenv.onMisuse` for optionally asserting that a future expression does not result in variables being added to the global environment.
- Add option `future.onFutureCondition.keepFuture` for controlling whether `FutureCondition` objects should keep a copy of the `Future` object or not. The default is to keep a copy, but if the future carries large global objects, then the `FutureCondition` will also be large, which can result in memory issues and slowdowns.
- Fix a future.tests check that occurred only on MS Windows.
- The 'multiprocess' strategy, which has been deprecated since future 1.20.0 [2020-10-30], is now defunct. Please use 'multisession' (recommended) or 'multicore' instead.
- Add optional assertion of the internal Future `state` field.
- Remove function `remote()`. Note that `plan(remote, ...)` has been deprecated since future 1.24.0 [2022-02-19] and defunct since future 1.30.0 [2022-12-15].
- Add example to the 'Common Issues with Solutions' vignette on how magrittr pipes can result in an error when used with the future assignment operator and how to fix it.
- Error messages that contain a deparsed version of the future expression could become very large in cases where the expression comprises expanded, large objects. Now only the first 100 lines of the expression are deparsed.
- Deprecated `plan(multiprocess, ...)` now equals `plan(sequential)`, while still producing one warning each time a future is created.
- Argument `local` is defunct and has been removed. Previously only `local = FALSE` was defunct.
- Remove defunct argument `value` from all `resolve()` methods.
- Remove defunct functions `transparent()` and `TransparentFuture()`.
- `futureOf()` used `listenv::map()`, which is deprecated in listenv (>= 0.9.0) in favor of `listenv::mapping()`.
- Starting with R (>= 4.2.0), the internal function `myInternalIP()` no longer detected when an attempted system call failed, resulting in an obscure error instead of falling back to alternatives. This was because errors produced by `system2()` no longer inherit from class `simpleError`.
- Strategy 'remote' was deprecated in future 1.24.0 and is now defunct. Use `plan(cluster, ..., persistent = TRUE)` instead. Note that `persistent = TRUE` will eventually also become deprecated and defunct, but by then we will have an alternative solution available.
- Add section 'Making sure to stop parallel workers' to the 'Best Practices for Package Developers' vignette, which explains why `R CMD check` may produce "checking for detritus in the temp directory ... NOTE" and how to avoid it.
- The evaluation of a sequential future would reset any warnings collected by R prior to creating the future. This only happened with `plan(sequential)` and when `getOption("warn") == 0`. This bug was introduced in future 1.26.0 [2022-05-27].
- Using the deprecated `plan(multiprocess)` will now trigger a deprecation warning each time a `multiprocess` future is created. This means that there could be a lot of warnings produced. Note that `multiprocess` has been deprecated since future 1.20.0 [2020-10-30]. Please use `multisession` (recommended) or `multicore` instead.
- Removing `values()`, which has been defunct since future 1.23.0. Use `value()` instead.
- Mention how `source(..., local = TRUE)` is preferred over `source()` when used inside futures.
- `do.call(plan, args = list(multisession, workers = 2))` would ignore the `workers` argument, and any other arguments.
- Previously deprecated use of `local = FALSE` with futures is now defunct.
- The R option to temporarily allow `plan(transparent)` although it was declared defunct has now been removed; `plan(transparent)`, together with functions `transparent()` and `TransparentFuture()`, is now formally defunct.
- Using argument `persistent` with multisession futures is now defunct. Previously only `persistent = TRUE` was defunct.
- Use CSS style to align image to the right instead of non-HTML5 attribute `align="right"`.
- Avoid nested `<em>` tags in HTML-generated help pages.
- The fallback to sequential processing done by 'multicore' and 'multisession' when `workers = 1` can now be overridden by specifying `workers = I(1)`.
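
The `workers = I(1)` override can be sketched as follows. This is a minimal illustration, assuming the future package is installed and that a background R session can be launched on the current machine; the variable names are only for illustration.

```r
library(future)

# Wrap the worker count in I() so that 'multisession' does not fall back
# to sequential processing when only one worker is requested
plan(multisession, workers = I(1))

# The future is expected to be evaluated in a single background R session
f <- future(Sys.getpid())
worker_pid <- value(f)

# Compare with the process ID of the main R session
main_pid <- Sys.getpid()
message("main: ", main_pid, ", worker: ", worker_pid)

# Shut down the background worker by switching back to sequential processing
plan(sequential)
```

Without `I()`, the same call with `workers = 1` would use the sequential fallback described above.
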
- Some warnings and errors showed the wrong call.
- `print()` for `FutureResult` would report captured conditions all with class `list`, instead of their condition classes.
- TESTS: `R CMD check --as-cran` on R-devel and MS Windows would trigger a NOTE on "Check: for detritus in the temp directory" and "Found the following files/directories: 'Rscript1349cb8aeeba0' ...". There were two package tests that explicitly created PSOCK clusters without stopping them. A third test launched a multisession future without resolving it, which prevented the PSOCK worker from terminating. This was not detected in R 4.2.0. It is not a problem on macOS and Linux, because there background workers are automatically terminated when the main R session terminates.
- R options and environment variables are now reset on the workers after a future is resolved to what they were after any packages required by the future had been loaded and attached. Previously, they were reset to what they were before these were loaded and attached. In addition, only pre-existing R options and environment variables are reset. Any new ones added are not removed for now, because we do not know which of them might have been added by loading a package and are essential for that package to work.
- If it was changed while evaluating the future expression, the current working directory is now reset when the future has been resolved.
- `futureSessionInfo()` gained argument `anonymize`. If TRUE (default), host and user names are anonymized.
- `futureSessionInfo()` now also reports on the main R session details.
- The bug fix in future 1.22.0 that addressed the issue where object `a` in `future(fcn(), globals = list(a = 42, fcn = function() a))` would not be found has been redesigned in a more robust way.
- Use of packages such as data.table and ff in cluster and multisession futures broke in future 1.25.0. For data.table, we saw "Error in setalloccol(ans) : verbose must be TRUE or FALSE". For ff, we saw "Error in splitted$path[nopath] <- getOption("fftempdir") : replacement has length zero". See 'Significant Changes' for why and how this was fixed.
- The deprecation warning for using `local = FALSE` was silenced for sequential futures since future 1.25.0.
- `futureCall()` ignored arguments `stdout`, `conditions`, `earlySignal`, `label`, and `gc`.
- Strategy 'transparent' was deprecated in future 1.24.0 and is now defunct. Use `plan(sequential, split = TRUE)` instead.
- Strategy 'multiprocess' was deprecated in future 1.20.0, and 'remote' was deprecated in future 1.24.0. Since then, attempts to use them in `plan()` would produce a deprecation warning, which was limited to one per R session. Starting with this release, this warning is now produced whenever using `plan()` with these deprecated future strategies.
- R options and environment variables are now reset on the workers after future is resolved so that any changes to them by the future expression have no effect on following futures.
- Now `f <- future(..., stdout = structure(TRUE, drop = TRUE))` will cause the captured standard output to be dropped from the future object as soon as it has been relayed once, for instance, by `value(f)`. Similarly, `conditions = structure("conditions", drop = TRUE)` will drop captured non-error conditions as soon as they have been relayed. This can help decrease the amount of memory used, especially if there are many active futures.
- Now `resolve()` respects option `future.wait.interval`. Previously, it was hardcoded to poll for results every 0.1 seconds.
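
A minimal sketch of the `drop = TRUE` attribute on `stdout`, assuming the future package is installed and using a sequential plan so that no background workers are needed:

```r
library(future)
plan(sequential)

# Request that captured standard output is dropped from the future object
# once it has been relayed
f <- future({
  cat("Hello from the future\n")
  42
}, stdout = structure(TRUE, drop = TRUE))

# The first call relays the captured output and returns the value
v <- value(f)

# The value itself is kept, so it can be queried again
v_again <- value(f)
```

Only the captured output is affected by `drop = TRUE`; the value of the future remains available.
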
- Now, `value()` will only attempt to recover UTF-8 symbols in the captured standard output if the future was evaluated on an MS Windows that does not support capturing of UTF-8 symbols. Support for UTF-8 capturing also on MS Windows was added in R 4.2.0, but it typically requires an up-to-date MS Windows 10 or MS Windows Server 2022.
- The default value for option `future.wait.interval` was decreased from 0.2 seconds to 0.01 seconds. This controls the polling frequency for finding an available worker when all workers are currently busy. Starting with this release, this option also controls the polling frequency of `resolve()`.
- A bug was introduced in future 1.24.0 [2022-02-19] that caused future plan tweaking to break, e.g. `plan(multicore, workers = 2)` and `plan(sequential, split = TRUE)` introduced breaking side effects to the futures evaluated.
- Now `future(..., seed = TRUE)` forwards the RNG state in the calling R session. Previously, it would leave it intact.
- Now `plan()` and `tweak()` preserve calls in arguments, e.g. `plan(multisession, workers = 2, rscript_startup = quote(options(socketOptions="no-delay")))`, and `tweak(..., abc = quote(x == y))`.
- `nbrOfFreeWorkers()` would produce "Error: 'is.character(name)' is not TRUE" for `plan(multisession, workers = 1)`.
- Internal calls to `FutureRegistry(action = "collect-first")` and `FutureRegistry(action = "collect-last")` could signal errors early when polling `resolved()`.
- Strategy 'remote' is deprecated in favor of 'cluster'. The `plan()` function will give an informative deprecation warning when 'remote' is used. For now, this warning is given only once per R session.
- Strategy 'transparent' is deprecated in favor of 'sequential' with argument `split = TRUE` set. The `plan()` function will give an informative deprecation warning when 'transparent' is used. For now, this warning is given only once per R session.
- `plan()` now produces a one-time warning if a 'transparent' strategy is set. The warning reminds the user that 'transparent' should only be used for troubleshooting purposes and never be used in production. These days `plan(sequential, split = TRUE)` together with `debug()` is probably a better approach for troubleshooting. The long-term plan is to deprecate the 'transparent' strategy.
- Support for `persistent = TRUE` with multisession futures is defunct.
- UTF-8 symbols outputted on MS Windows would be relayed as escaped symbols, e.g. a UTF-8 check mark symbol (`\u2713`) would be relayed as `<U+2713>` (8 characters). The reason for this is a limitation in R itself on MS Windows. Now, `value()` attempts to recover such MS Windows output to UTF-8 before relaying it. There is an option for disabling this new feature.
- TESTS: Using more robust emulation of crashed forked parallel workers after understanding that `quit()` must not be used in forked R processes.
- Now `future(..., seed)` will set the random seed as late as possible just before the future expression is evaluated. Previously it was done before package dependencies were attached, which could lead to non-reproducible random numbers in case a package dependency would update the RNG seed when attached.
- `values()`, which has been deprecated since future 1.20.0, is now defunct. Use `value()` instead.
- Support for `persistent = TRUE` with multisession futures is defunct. If still needed, a temporary workaround is to use cluster futures. However, it is likely that support for `persistent` will eventually be deprecated for all future backends.
- Argument `value` of `resolve()`, deprecated since future 1.15.0, is defunct in favor of argument `result`.
- Disable package test that emulates crashing of forked parallel workers when using `parallel::makeCluster(..., type = "FORK")`. This test is disabled on macOS, where it appears that the main R session becomes unstable after the FORK node is terminated.
- A lazy future remains a generic future until it is launched, which means it is not assigned a future backend class until launched.
- Argument `seed` for `futureAssign()` and `futureCall()` now defaults to FALSE just like for `future()`.
- `R_FUTURE_*` environment variables are now only read when the future package is loaded, where they set the corresponding `future.*` option. Previously, some of these environment variables were queried by different functions as a fallback to when an option was not set. Parsing them only when the package is loaded decreases the overhead in functions, and it clarifies that options can be changed at runtime whereas environment variables should only be set at startup.
- The overhead of initiating futures has been significantly reduced. For example, the roundtrip time for `value(future(NULL))` is about twice as fast for 'sequential', 'cluster', and 'multisession' futures. For 'multicore' futures the roundtrip speedup is about 20%. The speedup comes from pre-compiling the R expression that will be used to resolve the future expression into R expression templates, which can then be quickly compiled for each future. This speeds up the creation of these expressions by ~10 times, compared to re-compiling them each time.
- The default timeout for `resolved()` was decreased from 0.20 to 0.01 seconds for cluster/multisession and multicore futures, which means they will spend less time waiting for results when they are not available.
- Analogously to how globals may be scanned for "non-exportable" objects when option `future.globals.onReference` is set to `"error"` or `"warning"`, `value()` will now check for similar problems in the resolved value object. An example of this is `f <- future(xml2::read_xml("<body></body>"))`, which will result in an invalid `xml_document` object if run in parallel, because such objects cannot be transferred between R processes.
- In addition to specifying which condition classes to be captured and relayed, it is now possible to also specify condition classes to be ignored. For example, `conditions = structure("condition", exclude = "message")` captures all conditions but message conditions.
- Now cluster futures use `homogeneous = NULL` as the default instead of `homogeneous = TRUE`. The new default will result in the parallelly package trying to infer whether TRUE or FALSE should be used based on the `workers` argument.
- Now the post-mortem analysis report of multicore and cluster futures in case their results could not be retrieved includes information on globals and their sizes, and if some of them are non-exportable. A similar, detailed report is also produced when a cluster future fails to set up and launch itself on a parallel worker.
- If option `future.fork.multithreading.enable` is FALSE, RcppParallel, in addition to OpenMP, is forced to run with a single thread whenever running in a forked process (= 'multicore' futures). This is done by setting environment variable `RCPP_PARALLEL_NUM_THREADS` to 1.
- Add `futureSessionInfo()` to get a quick overview of the future framework, its current setup, and to run simple tests on it.
- Now `plan(multicore)` warns immediately if multicore processing, that is, forked processing, is not supported, e.g. when running in the RStudio Console.
- `plan(multiprocess, workers = n)` did not warn about 'multiprocess' being deprecated when argument `workers` was specified.
- `getGlobalsAndPackages()` could throw a false error on "Did you mean to create the future within a function? Invalid future expression tries to use global `...` variables that do not exist: " when `...` is solely part of a formula or used in some S4 generic functions.
- When enabled, option `future.globals.onReference` could falsely alert on 'Detected a non-exportable reference (externalptr) in one of the globals (<unknown>) used in the future expression' in globals, e.g. when using future.apply or furrr map-reduce functions when using a 'multisession' backend.
- `future(fcn(), globals = list(a = 42, fcn = function() a))` would fail with "Error in fcn() : object 'a' not found" when using sequential or multicore futures. This affected also map-reduce calls such as `future.apply::future_lapply(1, function(x) a, future.globals = list(a = 42))`.
- Resolving a 'sequential' future without globals would result in several internal `...future.*` objects being written to the calling environment, which might be the global environment.
- Environment variable `R_FUTURE_PLAN` would propagate down with nested futures, forcing itself onto also nested future plans. Now it is unset in nested futures, resulting in a sequential future strategy unless another was explicitly set by `plan()`.
- Transparent futures no longer warn about `local = FALSE` being deprecated. Although `local = FALSE` is being deprecated, it is still used internally by 'transparent' futures for a while longer. Please do not use 'transparent' futures in production code and never in a package.
- `remote()` could produce an error on "object 'homogeneous' not found".
- `nbrOfFreeWorkers()` for 'cluster' futures assumed that the current plan is set to cluster too.
- In order to handle them conditionally higher up in the call chain, warnings and errors produced from using the random number generator (RNG) in a future without declaring the intention to use one are now of class `RngFutureWarning` and `RngFutureError`, respectively. Both of these classes inherit from `RngFutureCondition`.
- Now run-time errors from resolving a future take precedence over `RngFutureError`:s. That is, `future({ rnorm(1); log("a") }, seed = FALSE)` will signal an error 'log("a")' instead of an RNG error when option `future.rng.onMisuse` is set to `"error"`.
- Add `nbrOfFreeWorkers()` to query how many workers are free to take on futures immediately. Until all third-party future backends have implemented this, some backends might produce an error saying it is not yet supported.
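
As a small sketch of how `nbrOfFreeWorkers()` can be queried next to `nbrOfWorkers()`, assuming the future package is installed and a sequential plan is used (other backends may report different numbers, or an error if not yet supported, as noted above):

```r
library(future)
plan(sequential)

# Total number of workers available under the current plan
n_total <- nbrOfWorkers()

# Number of workers that can take on a new future right away
n_free <- nbrOfFreeWorkers()

message("workers: ", n_total, ", free: ", n_free)
```
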
- `future(..., seed = TRUE)` with 'sequential' futures would set the RNG kind of the parent process. Now it behaves the same regardless of future backend.
- Signaling `immediateCondition`:s with 'multicore' could result in `Error in save_rds(obj, file) : save_rds() failed to rename temporary save file '/tmp/RtmpxNyIyK/progression21f3f31eadc.rds.tmp' (NA bytes; last modified on NA) to '/tmp/RtmpxNyIyK/progression21f3f31eadc.rds' (NA bytes; last modified on NA)`. There was an assertion at the end of the internal `save_rds()` function that incorrectly assumed that the target file should exist. However, the file might have already been processed and removed by the future in the main R session.
- `value()` with both a run-time error and an RNG mistake would signal the RNG warning instead of the run-time error when the for-internal-use-only argument `signal` was set to FALSE.
- Due to a mistake introduced in future 1.20.0, the package would end up assigning a `.packageVersion` object to the global environment when loaded.
- `future::plan("multisession")` would produce 'Error in if (debug) mdebug("covr::package_coverage() workaround ...") : argument is not interpretable as logical' if and only if the covr package was loaded.
- Strategy 'multiprocess' is deprecated in favor of either 'multisession' or 'multicore', depending on operating system and R setup. The `plan()` function will give an informative deprecation warning when 'multiprocess' is used. This warning is given only once per R session.
- Launching R or Rscript with command-line option `--parallel=n`, where n > 1, will now use 'multisession' as future strategy. Previously, it would use 'multiprocess', which is now deprecated.
- Support for `local = FALSE` is deprecated. For the time being, it remains supported for 'transparent' futures and 'cluster' futures that use `persistent = TRUE`. However, note that `persistent = TRUE` will also be deprecated at some point in the future. These deprecations are required in order to further standardize the Future API across various types of parallel backends.
- Now multisession workers inherit the package library path from the main R session when they are created, that is, when calling `plan(multisession)`. To avoid this, use `plan(multisession, rscript_libs = NULL)`, which is an argument passed down to `makeClusterPSOCK()`. With this update, 'sequential', 'multisession', and 'multicore' futures see the exact same library path.
- Several functions for managing parallel-style processing have been moved to a new parallelly package. Specifically, functions `availableCores()`, `availableWorkers()`, `supportsMulticore()`, `as.cluster()`, `autoStopCluster()`, `makeClusterMPI()`, `makeClusterPSOCK()`, and `makeNodePSOCK()` have been moved. None of them are specific to futures per se and are likely useful elsewhere too. Also, having them in a separate, standalone package will speed up the process of releasing any updates to these functions. The code base of the future package shrunk about 10-15% from this migration. For backward compatibility, the migrated functions remain in this package as re-exports.
- Setting up a future strategy with argument `split = TRUE` will cause the standard output and non-error conditions to be split ("tee:d") on the worker's end, while still relaying back to the main R session as before. This can be useful when debugging with `browser()` or `debug()`, e.g. `plan(sequential, split = TRUE)`. Without it, debug output is not displayed.
- Now multicore futures relay `immediateCondition`:s in a near-live fashion.
- It is now possible to pass any arguments that `makeClusterPSOCK()` accepts in the call to `plan(cluster, ...)` and `plan(multisession, ...)`. For instance, to set the working directory of the cluster workers to a temporary folder, pass argument `rscript_startup = "setwd(tempdir())"`. Another example is `rscript_libs = c(libs, "*")` to prepend the library path on the worker with the paths in `libs`.
- `plan()` and `tweak()` check for even more arguments that must not be set by either of them. Specifically, attempts to adjust the following arguments of `future()` will result in an error: `conditions`, `envir`, `globals`, `packages`, `stdout`, and `substitute` in addition to already validated `lazy` and `seed`.
- `tweak()` now returns a wrapper function that calls the original future strategy function with the modified defaults. Previously, it would make a copy of the original function with modified argument defaults. This new approach will make it possible to introduce new future arguments that can be modified by `tweak()` and `plan()` without having to update every future backend package, e.g. the new `split = TRUE` argument.
- Add a 'Best Practices for Package Developers' vignette.
- Add a 'How the Future Framework is Validated' vignette.
- Harmonizing Future constructor functions to also use `substitute = TRUE`.
- Since last version, future 1.19.1, `future(..., conditions = character(0L))` would no longer avoid intercepting conditions as intended; instead, it muffles all conditions. From now on, use `conditions = NULL`.
- Relaying of `immediateCondition`:s was not near-live for multisession and cluster if the underlying PSOCK cluster used `useXDR=FALSE` for communication.
- `print()` for Future would also print any attributes of its environment.
- The error message produced by `nbrOfWorkers()` was incomplete.
- Renamed environment variable `R_FUTURE_MAKENODEPSOCK_tries` used by `makeClusterPSOCK()` to `R_FUTURE_MAKENODEPSOCK_TRIES`.
- The Mandelbrot demo would produce random numbers without declaring so.
- Strategy 'multiprocess' is deprecated in favor of either 'multisession' or 'multicore', depending on operating system and R setup.
- `values()` is deprecated. Use `value()` instead.
- All backward compatible code for the legacy, defunct, internal `Future` element `value` is now removed. Using or relying on it is an error.
- When passing `...` as a global, rather than via arguments, in higher-level map-reduce APIs such as future.apply and furrr, arguments in `...` could produce an error on "unused argument".
- Futures detect when random number generation (RNG) was used to resolve them. If a future uses RNG without parallel RNG having been requested, then an informative warning is produced. To request parallel RNG, specify argument `seed`, e.g. `f <- future(rnorm(3), seed = TRUE)` or `y %<-% { rnorm(3) } %seed% TRUE`. Higher-level map-reduce APIs provide similarly named "seed" arguments to achieve the same. To escalate these warnings to errors, set option `future.rng.onMisuse` to `"error"`. To silence them, set it to `"ignore"`.
- Now, all non-captured conditions are muffled, if possible. For instance, `future(warning("boom"), conditions = c("message"))` will truly muffle the warning regardless of backend used. This was needed to fix the bug below.
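
The two ways of declaring RNG use mentioned above, together with the `future.rng.onMisuse` option, can be sketched as follows. This assumes the future package is installed and a sequential plan; `res` is expected, but not guaranteed here, to hold the RNG error condition.

```r
library(future)
plan(sequential)

# Declare that the future uses random numbers via 'seed = TRUE'
f <- future(rnorm(3), seed = TRUE)
x <- value(f)

# The same declaration for the future assignment operator
y %<-% { rnorm(3) } %seed% TRUE

# Escalate undeclared RNG use from a warning to an error, capture the
# resulting condition, and then restore the previous option value
oopts <- options(future.rng.onMisuse = "error")
res <- tryCatch(value(future(rnorm(1), seed = FALSE)), error = identity)
options(oopts)
```
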
- `makeClusterPSOCK()` will now retry to create a cluster node up to `tries` (default: 3) times before giving up. If argument `port` specifies more than one port (e.g. `port = "random"`) then it will also attempt to find a valid random port up to `tries` times before giving up. The pre-validation of the random port is only supported in R (>= 4.0.0) and skipped otherwise.
- `makeClusterPSOCK()` skips shell quoting of the elements in `rscript` if it inherits from `AsIs`.
- `makeClusterPSOCK()`, or actually `makeNodePSOCK()`, gained argument `quiet`, which can be used to silence output produced by `manual = TRUE`.
- If multithreading is disabled but multicore futures fail to acknowledge the setting on the current system, then an informative `FutureWarning` is produced by such futures.
- Now `availableCores()` better supports Slurm. Specifically, if environment variable `SLURM_CPUS_PER_TASK` is not set, which requires that option `--slurm-cpus-per-task=n` is specified and `SLURM_JOB_NUM_NODES=1`, then it falls back to using `SLURM_CPUS_ON_NODE`, e.g. when using `--ntasks=n`.
- Now `availableCores()` and `availableWorkers()` support LSF/OpenLava. Specifically, they acknowledge environment variables `LSB_DJOB_NUMPROC` and `LSB_HOSTS`, respectively.
- Now `plan(multisession)`, `plan(cluster, workers = <number>)`, and `makeClusterPSOCK()`, which they both use internally, set up localhost workers twice as fast compared to versions since future 1.12.0, which brings it back on par with a bare-bones `parallel::makeCluster(..., setup_strategy = "sequential")` setup. The slowdown was introduced in future 1.12.0 (2019-03-07) when protection against leaving stray R processes behind from failed worker startup was implemented. This protection now makes use of memoization for speedup.
- Sequential and multicore backends, but not multisession, would produce errors on "'...' used in an incorrect context" in cases where `...` was part of argument `globals` and not the evaluation environment.
- Contrary to other future backends, any conditions produced while resolving a sequential future using `future(..., conditions = character())` would be signaled, although the most reasonable expectation would be that they are silenced. Now, all non-captured conditions are muffled, if possible.
- Option `future.rng.onMisuse` was not passed down to nested futures.
- Disabling multithreading in forked processes by setting R option `future.fork.multithreading.enable` or environment variable `R_FUTURE_FORK_MULTITHREADING_ENABLE` to `FALSE` would cause multicore futures to always return value `1L`. This bug was introduced in future 1.17.0 (2020-04-17).
- `getGlobalsAndPackages()` did not always return a `globals` element that was of class `FutureGlobals`.
- `getGlobalsAndPackages(..., globals)` would recalculate `total_size` even when it was already calculated or known to be zero.
- `getGlobalsAndPackages(Formula::Formula(~ x))` would produce "the condition has length > 1" warnings (which will become errors in future R versions).
- Support for `persistent = TRUE` with multisession futures is deprecated.
- `print()` on `RichSOCKcluster` gives information not only on the name of the host but also on the version of R and the platform of each node ("worker"), e.g. "Socket cluster with 3 nodes where 2 nodes are on host 'localhost' (R version 4.0.0 (2020-04-24), platform x86_64-w64-mingw32), 1 node is on host 'n3' (R version 3.6.3 (2020-02-29), platform x86_64-pc-linux-gnu)".
- Error messages from cluster future failures are now more informative than "Unexpected result (of class 'NULL' != 'FutureResult')". For example, if the future package is not installed on the worker, then the error message clearly says so. Even if there is an unexpected result error from a PSOCK cluster future, the error produced gives extra information on the node where it failed, e.g. "Unexpected result (of class 'NULL' != 'FutureResult') retrieved for ClusterFuture future (label = '', expression = '...'): This suggests that the communication with `ClusterFuture` worker ('RichSOCKnode' #1 on host 'n3' (R version 3.6.3 (2020-02-29), platform x86_64-pc-linux-gnu)) is out of sync."
- It is now possible to set environment variables on workers before they are launched by `makeClusterPSOCK()` by specifying them as `"<name>=<value>"` as part of the `rscript` vector argument, e.g. `rscript = c("ABC=123", "DEF='hello world'", "Rscript")`. This works because elements in `rscript` that match regular expression `[[:alpha:]_][[:alnum:]_]*=.*` are no longer shell quoted.
- `makeClusterPSOCK()` now returns a cluster that, in addition to inheriting from `SOCKcluster`, also inherits from `RichSOCKcluster`.
- Made `makeClusterPSOCK()` and `makeNodePSOCK()` agile to the name change from `parallel:::.slaveRSOCK()` to `parallel:::.workRSOCK()` in R (>= 4.1.0).
- `makeClusterPSOCK(..., rscript)` will not try to locate `rscript[1]` if argument `homogeneous` is FALSE (or inferred to be FALSE).
- `makeClusterPSOCK(..., rscript_envs)` would result in a syntax error when starting the workers due to non-ASCII quotation marks if option `useFancyQuotes` was not set to FALSE.
- `plan(list(...))` would produce 'Error in UseMethod("tweak") : no applicable method for 'tweak' applied to an object of class "list"' if a non-function object named 'list' was on the search path.
- `plan(x$abc)` with `x <- list(abc = sequential)` would produce 'Error in UseMethod("tweak") : no applicable method for 'tweak' applied to an object of class "c('FutureStrategyList', 'list')"'.
- TESTS: `R_FUTURE_FORK_ENABLE=false R CMD check ...` would produce 'Error: connections left open: ...' when checking the 'multiprocess' example.
-
Support for
persistent = TRUE
with multisession futures is deprecated. If still needed, a temporary workaround is to use cluster futures. However, it is likely that support for persistent
will eventually be deprecated for all future backends. -
Options
future.globals.method
, future.globals.onMissing
, and future.globals.resolve
are deprecated and produce warnings if set. They may only be used for troubleshooting purposes because they may affect how futures are evaluated, which means that reproducibility cannot be guaranteed elsewhere.
- Renamed
values()
to value()
to clean up and simplify the API.
-
makeClusterPSOCK()
gained argument rscript_envs
for setting environment variables in workers on startup, e.g. rscript_envs = c(FOO = "3.14", "BAR")
. -
Now the result of a future holds session details in case an error occurred while evaluating the future.
- Not all CRAN servers have
_R_CHECK_LIMIT_CORES_
set. To better emulate CRAN submission checks, the future package will, when loaded, set this environment variable to 'TRUE' if unset and if R CMD check
is running. Note that future::availableCores()
respects _R_CHECK_LIMIT_CORES_
and returns at most 2L
(two cores) if detected.
-
Any globals named
version
and has_future
would be overwritten with "garbage" values internally. -
Disabling of multi-threading when using 'multicore' futures did not work on all platforms.
- All
values()
S3 methods have been renamed to value()
since they are closely related to the original purpose of value()
. The values()
methods will continue to work, but will soon be formally deprecated, later made defunct, and finally removed. Please replace all values()
with value()
calls.
- Now
oplan <- plan(new_strategy)
returns the list of all nested strategies previously set, instead of just the strategy on top of this stack. This makes it easier to temporarily use another plan. For the old behavior, use oplan <- plan(new_strategy)[[1]]
.
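A minimal sketch of temporarily switching plans, assuming the future package is installed. Using `sequential` as the temporary strategy is only an illustration.

```r
library(future)

# plan() returns the previously set stack of strategies
oplan <- plan(sequential)

# Evaluate something under the temporary plan
f <- future(1 + 2)
v <- value(f)
print(v)

# Restore the full previous stack, including any nested strategies
plan(oplan)
```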
-
Now
value()
detects if a future(..., seed = FALSE)
call generated random numbers, which might give unreliable results because random number generation (RNG) that is neither parallel safe nor statistically sound was used. If option future.rng.onMisuse
is "warning"
, a warning is produced. If "error"
, an error is produced. If "ignore"
(default), the mistake is silently ignored. Using seed = NULL
is like seed = FALSE
but without performing the RNG validation. -
For convenience, argument
seed
of future()
may now also be an ordinary single integer random seed. If so, a L'Ecuyer-CMRG RNG seed is created from this seed. If seed = TRUE
, then a L'Ecuyer-CMRG RNG seed based on the current RNG state is used. Use seed = FALSE
when it is known that the future does not use RNG. -
ClusterFuture
:s now relay immediateCondition
:s back to the main process momentarily after they are signaled and before the future is resolved.
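The `seed` argument described above can be tried out as follows. This is a minimal sketch, assuming the future package is installed and using `plan(sequential)` only to keep the example light.

```r
library(future)
plan(sequential)

# A single integer seed is turned into a L'Ecuyer-CMRG seed
f1 <- future(rnorm(3), seed = 42L)
f2 <- future(rnorm(3), seed = 42L)
v1 <- value(f1)
v2 <- value(f2)

# Two futures created with the same seed draw the same numbers
print(identical(v1, v2))

# seed = TRUE derives a L'Ecuyer-CMRG seed from the current RNG state
f3 <- future(runif(1), seed = TRUE)
v3 <- value(f3)
```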
- Add support for automatically disabling multi-threading when using
'multicore' futures. For now, the default is to allow
multi-threaded processing but this might change in the future. To
disable multi-threading, set option
future.fork.multithreading.enable
or environment variable R_FUTURE_FORK_MULTITHREADING_ENABLE
to FALSE
. This requires that the RhpcBLASctl package is installed. Parallelization via multi-threaded processing (done in native code by some packages and external libraries) while at the same time using forked (aka "multicore") parallel processing is unstable in some cases. Note that this is true not only when using plan(multicore)
but also when using, for instance, parallel::mclapply()
. This is in beta so the above names and options might change later.
-
Evaluation of futures could fail if the global environment contained functions with the same names as a small set of base R functions, e.g.
raw()
, list()
, and options()
. -
future(alist(a =))
would produce "Error in objectSize_list(x, depth = depth - 1L) : argument "x_kk" is missing, with no default"
-
Future
and FutureResult
objects with an internal version 1.7 or older have been deprecated since 1.14.0 (July 2019) and are now defunct. -
Defunct hidden argument
progress
of resolve()
, and hidden arguments/fields condition
and calls
of FutureResult
are now gone.
- The default range of ports that
makeClusterPSOCK()
draws a random port from (when argument port
is not specified) can now be controlled by environment variable R_FUTURE_RANDOM_PORTS
. The default range is still 11000:11999
as with the parallel package.
- The change introduced to
resolved()
in future 1.15.0 would cause lazy futures to block if all workers were occupied.
resolved()
will now launch lazy futures.
-
Now the "visibility" of future values is recorded and reflected by
value()
. -
Now option
future.globals.onReference
defaults to environment variable R_FUTURE_GLOBALS_ONREFERENCE
.
- Added 'Troubleshooting' section to
?makeClusterPSOCK
with instructions on how to troubleshoot when the setup of local and remote clusters fails.
-
values()
would resignal immediateCondition
:s even though those should be signaled at most once per future. -
makeClusterPSOCK()
could produce warnings like "cannot open file '/tmp/alice/Rtmpi69yYF/future.parent=2622.a3e32bc6af7.pid': No such file", e.g. when launching R workers running in Docker containers. -
Package would set or update the RNG state of R (
.Random.seed
) when loaded, which could affect RNG reproducibility. -
Package could set
.Random.seed
to NULL, instead of removing it, which in turn would produce a warning on "'.Random.seed' is not an integer vector but of type 'NULL', so ignored" when the next random number was generated. -
Now future assignments to list environments produce more informative error messages when attempting to assign to more than one element.
-
makeClusterMPI()
did not work for MPI clusters with comm
other than 1
.
-
Argument
value
of resolve()
is deprecated. Use result
instead. -
Use of internal argument
evaluator
to future()
is now defunct.
-
All types of conditions are now captured and relayed. Previously, only conditions of class
message
and warning
were relayed. -
If one of the futures in a collection produces an error, then
values()
will signal that error as soon as it is detected. This means that while calling values()
guarantees that all futures are resolved, it does not guarantee that the results from all futures are gathered back to the master R session before the error is relayed.
-
values()
now relays stdout
and signals conditions as soon as possible, as long as the standard output and the conditions are relayed in their original order. -
If a captured condition can be "muffled", then it will be muffled. This helps to prevent conditions from being handled twice by condition handlers when futures are evaluated in the main R session, e.g.
plan(sequential)
. Messages and warnings were already muffled in the past. -
Forked processing is considered unstable when running R from certain environments, such as the RStudio environment. Because of this, 'multicore' futures have been disabled in those cases since future 1.13.0. This change caught several RStudio users by surprise. Starting with future 1.14.0, an informative one-time-per-session warning will be produced when an attempt to use 'multicore' is made in a non-supported environment such as RStudio. This warning will also be produced when using 'multiprocess', which will fall back to using 'multisession' futures. The warning can be disabled by setting R option
future.supportsMulticore.unstable
, or environment variable FUTURE_SUPPORTSMULTICORE_UNSTABLE
to "quiet"
. -
Now option
future.startup.script
falls back to environment variable R_FUTURE_STARTUP_SCRIPT
. -
Conditions inheriting
immediateCondition
are signaled as soon as possible. Contrary to other types of conditions, these will be signaled only once per future, despite being collected.
-
Early signaling did not take place for
resolved()
for ClusterFuture
and MulticoreFuture
. -
When early signaling was enabled, functions such as
resolved()
and resolve()
would relay captured conditions multiple times. This would, for instance, result in the same messages and warnings being output more than once. Now it is only value()
that will resignal conditions. -
The validation of connections failed to detect when the connection had been serialized (= a
NIL
external pointer) on some macOS systems.
-
Argument
progress
of resolve()
is now defunct (was deprecated since future 1.12.0). Option future.progress
is ignored. This will make room for other progress-update mechanisms that are in the works. -
Usage of internal argument
evaluator
to future()
is now deprecated. -
Removed defunct argument
output
from FutureError()
. -
FutureResult
fields/arguments condition
and calls
are now defunct. Use conditions
instead. -
Future
and FutureResult
objects with an internal version 1.7 or older are deprecated and will eventually become defunct. Future backends that implement their own Future
classes should update to implement a result()
method instead of a value()
method for their Future
classes. All future backends available on CRAN and Bioconductor have already been updated accordingly.
- Forked processing is now disabled by default when running R via
RStudio. When disabled, 'multicore' futures fall back to
'sequential' futures. This update follows from an RStudio
recommendation against using forked parallel processing from
within RStudio because it is likely to break the RStudio R session.
See
help("supportsMulticore")
for more details, e.g. how to re-enable process forking. Note that parallelization via 'multisession' is unaffected and will still work as before. Also, when forked processing is disabled, or otherwise not supported, using plan("multiprocess")
will fall back to using 'multisession' futures.
-
Forked processing can be disabled by setting R option
future.fork.enable
to FALSE (or environment variable R_FUTURE_FORK_ENABLE=false
). When disabled, 'multicore' futures fall back to 'sequential' futures even if the operating system supports process forking. If set to TRUE, 'multicore' will not fall back to 'sequential'. If NA, or not set (the default), a set of best-practices rules will decide whether forking is enabled or not. See help("supportsMulticore")
for more details. -
Now
availableCores()
also recognizes PBS environment variable NCPUS
, because the PBSPro scheduler does not set PBS_NUM_PPN
. -
If option
future.availableCores.custom
is set to a function, then availableCores()
will call that function and interpret its value as the number of cores. Analogously, option future.availableWorkers.custom
can be used to specify the hostnames of a set of workers that availableWorkers()
sees. These new options provide a mechanism for anyone to customize availableCores()
and availableWorkers()
in case they do not (yet) recognize, say, environment variables that are specific to the user's compute environment or HPC scheduler. -
makeClusterPSOCK()
gained support for argument rscript_startup
for evaluating one or more R expressions in the background R worker prior to the worker event loop launching. This provides a more convenient approach than having to use, say, rscript_args = c("-e", sQuote(code))
. -
makeClusterPSOCK()
gained support for argument rscript_libs
to control the R package library search path on the workers. For example, to prepend the folder ~/R-libs
on the workers, use rscript_libs = c("~/R-libs", "*")
, where "*"
will be resolved to the current .libPaths()
on the workers. -
Debug messages are now prepended with a timestamp.
- Add vignette on 'Non-Exportable Objects' (extracted from another vignette).
makeClusterPSOCK()
did not shell quote the Rscript executable when running its pre-tests checking whether localhost Rscript processes can be killed by their PIDs or not.
- Argument
value
of resolve()
has been renamed to result
to better reflect that not only values are collected when this argument is used. Argument value
still works for backward compatibility, but will eventually be formally deprecated and then defunct.
-
If
makeClusterPSOCK()
fails to create one of many nodes, then it will attempt to stop any nodes that were successfully created. This lowers the risk for leaving R worker processes behind. -
Future results now hold the timestamps when the evaluation of the future started and finished.
-
Functions no longer produce "partial match of 'condition' to 'conditions'" warnings with
options(warnPartialMatchDollar = TRUE)
. -
When future infix operators (
%conditions%
, %globals%
, %label%
, %lazy%
, %packages%
, %seed%
, and %stdout%
) that are intended for future assignments were used in the wrong context, they would incorrectly be applied to the next future created. Now they're discarded. -
makeClusterPSOCK()
in future (>= 1.11.1) produced warnings when argument rscript
had length(rscript) > 1
. -
Validation of L'Ecuyer-CMRG RNG seeds failed in recent R devel.
-
With
options(OutDec = ",")
, the default values of several arguments would resolve to NA_real_
rather than a numeric value resulting in errors such as "is.finite(alpha) is not TRUE".
-
Argument
progress
of resolve()
is now deprecated. -
Argument
output
of FutureError()
is now defunct. -
FutureError
no longer inherits simpleError
.
- When
makeClusterPSOCK()
fails to connect to a worker, it produces an error with detailed information on what could have happened. In rare cases, another error could be produced when generating the information on what the worker's PID is.
-
The defaults of several arguments of
makeClusterPSOCK()
and makeNodePSOCK()
can now be controlled via environment variables in addition to the R options that were supported in the past. An advantage of using environment variables is that they will be inherited by child processes, also nested ones. -
The printing of future plans is now less verbose when the
workers
argument is a complex object such as a PSOCK cluster object. Previously, the output would include verbose output of attributes, etc.
- TESTS: When the future package is loaded, it checks whether
R CMD check
is running or not. If it is, then a few future-specific environment variables are adjusted such that the tests play nice with the testing environment. For instance, it sets the socket connection timeout for PSOCK cluster workers to 120 seconds (instead of the default 30 days!). This will lower the risk of more and more zombie worker processes cluttering up the test machine (e.g. CRAN servers) in case a worker process is left behind even though the main R process has terminated. Note that these adjustments are applied automatically to the checks of any package that depends on, or imports, the future package.
- Whenever
makeClusterPSOCK()
would fail to connect to a worker, for instance due to a port clash, then it would leave the R worker process running - also after the main R process terminated. When the worker is running on the same machine, makeClusterPSOCK()
will now attempt to kill such stray R processes. Note that parallel::makePSOCKcluster()
still has this problem.
- Message and warning conditions are now captured and relayed by default.
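A minimal sketch of condition relaying, assuming the future package is installed. The handler bookkeeping (`seen`) is illustrative only; it records which condition types reach the calling session when `value()` is called.

```r
library(future)
plan(sequential)

f <- future({
  message("a message")
  warning("a warning")
  1
})

# Record which relayed conditions reach the calling session
seen <- character()
v <- withCallingHandlers(
  value(f),
  message = function(m) {
    seen <<- c(seen, "message")
    invokeRestart("muffleMessage")
  },
  warning = function(w) {
    seen <<- c(seen, "warning")
    invokeRestart("muffleWarning")
  }
)
print(seen)
```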
-
The future call stack ("traceback") is now recorded when the evaluation of a future produces an error. Use
backtrace()
on the future to retrieve it. -
Now
futureCall()
defaults to args = list()
making it easier to call functions that do not take arguments, e.g. futureCall(function() 42)
. -
plan()
gained argument .skip = FALSE
. When TRUE, setting the same future strategy as already set will be skipped, e.g. calling plan(multisession)
consecutively will have the same effect as calling it just once. -
makeClusterPSOCK()
produces more informative error messages whenever the setup of R workers fails. Also, its verbose messages are now prefixed with [local output]
to help distinguish the output produced by the current R session from that produced by background workers. -
It is now possible to specify what type of SSH clients
makeClusterPSOCK()
automatically searches for and in what order, e.g. rshcmd = c("<rstudio-ssh>", "<putty-plink>")
. -
Now
makeClusterPSOCK()
preserves the global RNG state (.Random.seed
) also when it draws a random port number. -
makeClusterPSOCK()
gained argument rshlogfile
. -
Cluster futures provide more informative error messages when the communication with the worker node is out of sync.
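A minimal sketch of the `futureCall()` default mentioned above, assuming the future package is installed. The second call, with explicit `args`, is included only for contrast.

```r
library(future)
plan(sequential)

# No 'args' needed for a function that takes no arguments
f <- futureCall(function() 42)
v <- value(f)

# Arguments are still passed as a list when the function needs them
g <- futureCall(function(x, y) x + y, args = list(1, 2))
w <- value(g)
print(c(v, w))
```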
-
Argument
stdout
was forced to TRUE when using single-core multicore or single-core multisession futures. -
When evaluated in a local environment,
futureCall(..., globals = "a")
would set the value of global a
to NULL, regardless of whether it exists and what its true value is. -
makeClusterPSOCK(..., rscript = "my_r")
would in some cases fail to find the intended my_r
executable. -
ROBUSTNESS: A cluster future, including a multisession one, could retrieve results from the wrong workers if a new set of cluster workers had been set up after the future was created/launched but before the results were retrieved. This could happen because connections in R are indexed solely by integers which are recycled when old connections are closed and new ones are created. Now cluster futures assert that the connections to the workers are valid, and if not, an informative error message is produced.
-
Calling
result()
on a non-resolved UniprocessFuture
would signal evaluation errors.
- Removed defunct
future::future_lapply()
. Please use the one in the future.apply package instead.
-
Add support for manually specifying globals in addition to those that are automatically identified via argument
globals
or %globals%
. Two examples are globals = structure(TRUE, add = list(a = 42L, b = 3.14))
and globals = structure(TRUE, add = c("a", "b"))
. Analogously, attribute ignore
can be used to exclude automatically identified globals. -
The error reported when failing to retrieve the results of a future evaluated on a localhost cluster/multisession worker or a forked/multicore worker is now more informative. Specifically, it mentions whether the worker process is still alive or not.
-
Add
makeClusterMPI(n)
for creating MPI-based clusters of a similar kind as parallel::makeCluster(n, type = "MPI")
but that also attempts to work around issues where parallel::stopCluster()
causes R to stall. -
makeClusterPSOCK()
and makeClusterMPI()
gained argument autoStop
for controlling whether the cluster should be automatically stopped when garbage collected or not. -
BETA: Now
resolved()
for ClusterFuture
is non-blocking also for clusters of type MPIcluster
as created by parallel::makeCluster(..., type = "MPI")
.
- On Windows,
plan(multiprocess)
would not initiate the workers. Instead workers would be set up only when the first future was created.
- Standard output is now captured and re-outputted when
value()
is called. This new behavior can be controlled by the argument stdout
to future()
or by specifying the %stdout%
operator if a future assignment is used.
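A minimal sketch of the output capture described above, assuming the future package is installed. The second future uses `stdout = FALSE`, which in current releases is expected to discard the output rather than relay it.

```r
library(future)
plan(sequential)

# Output is captured while the future is evaluated ...
f <- future({
  cat("Hello from the future\n")
  42
})

# ... and re-outputted when value() is called
v <- value(f)

# With stdout = FALSE, the output is expected to be discarded
g <- future({
  cat("not shown\n")
  1
}, stdout = FALSE)
w <- value(g)
```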
-
R option
width
is passed down so that standard output is captured consistently across workers and consistently with the master process. -
Now more
future.*
options are passed down so that they are also acknowledged when using nested futures.
-
Add vignette on 'Outputting Text'.
-
CLEANUP: Only the core parts of the API are now listed in the help index. This was done to clarify the Future API. Help for non-core parts is still available via cross references in the indexed API as well as via
help()
.
-
When using forced, nested 'multicore' parallel processing, such as,
plan(list(tweak(multicore, workers = 2), tweak(multicore, workers = 2)))
, then the child process would attempt to resolve futures owned by the parent process resulting in an error (on 'bad error message'). -
When using
plan(multicore)
, if a forked worker would terminate unexpectedly, it could corrupt the master R session such that any further attempts of using forked workers would fail. A forked worker could be terminated this way if the user pressed Ctrl-C (the worker receives a SIGINT
signal). -
makeClusterPSOCK()
produced a warning when environment variable R_PARALLEL_PORT
was set to random
(e.g. as on CRAN). -
Printing a
plan()
could produce an error when the deparsed call used to set up the plan()
was longer than 60 characters.
-
future::future_lapply()
is defunct (gives an error if called). Please use the one in the future.apply package instead. -
Argument
output
of FutureError()
is formally deprecated. -
Removed all
FutureEvaluationCondition
classes and related methods.
-
getGlobalsAndPackages()
gained argument maxSize
. -
makeClusterPSOCK()
now produces a more informative warning if environment variable R_PARALLEL_PORT
specifies a non-numeric port. -
Now
plan()
gives a more informative error message in case it fails, e.g. when the internal future validation fails and why. -
Added
UnexpectedFutureResultError
to be used by backends for signaling in a standard way that an unexpected result was retrieved from a worker.
-
When the communication between an asynchronous future and a background R process failed, further querying of the future state/results could end up in an infinite waiting loop. Now the failed communication error is recorded and re-signaled on any further querying attempts.
-
Internal, seldom used
myExternalIP()
failed to recognize IPv4 answers from some of the lookup servers. This could in turn produce another error. -
In R (>= 3.5.0), multicore futures would produce multiple warnings originating from querying whether background processes have completed or not. These warnings are now suppressed.
- Errors produced when evaluating futures are now (re-)signaled on the master R process as-is with the original content and class attributes.
-
More errors related to orchestration of futures are of class
FutureError
to make it easier to distinguish them from future evaluation errors. -
Add support for a richer set of results returned by resolved futures. Previously only the value of the future expression, which could be a captured error to be resignaled, was expected. Now a
FutureResult
object may be returned instead. Although not supported in this release, this update opens up for reporting on additional information from the evaluation of futures, e.g. captured output, timing and memory benchmarks, etc. Before that can take place, existing future backend packages will have to be updated accordingly. -
backtrace()
returns only the last call that produced the error. It is unfortunately not possible to capture the call stack that led up to the error when evaluating a future expression.
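A minimal sketch of telling evaluation errors apart from orchestration errors, assuming the future package is installed. The labels returned by the handlers are illustrative only.

```r
library(future)
plan(sequential)

# The error is raised by the future expression itself
f <- future(stop("boom"))

kind <- tryCatch(
  value(f),
  # Errors from the orchestration of futures inherit from FutureError
  FutureError = function(e) "orchestration error",
  # Evaluation errors are re-signaled as-is, with their original class
  error = function(e) "evaluation error"
)
print(kind)
```

Because the more specific `FutureError` handler is listed first, an ordinary error from the expression falls through to the generic `error` handler.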
-
value()
for MulticoreFuture
would not produce an error when a (forked) background R worker would terminate before the future expression is resolved. This was a limitation inherited from the parallel package. Now an informative FutureError
message is produced. -
value()
for MulticoreFuture
would not signal errors unless they inherited from simpleError
; now it's enough for them to inherit from error
. -
value()
for ClusterFuture
no longer produces a FutureEvaluationError
, but FutureError
, if the connection to the R worker has changed (which happens if something as drastic as closeAllConnections()
has been called). -
futureCall(..., globals = FALSE)
would produce "Error: second argument must be a list", because the explicit arguments were not exported. This could also happen when specifying globals by name or as a named list. -
Nested futures were too conservative in requiring global variables to exist, even when they were false positives.
-
future::future_lapply()
is formally deprecated. Please use the one in the future.apply package instead. -
Recently introduced
FutureEvaluationCondition
classes are deprecated, because they no longer serve a purpose since future evaluation conditions are now signaled as-is.
future_lapply()
has moved to the future.apply package available on CRAN.
-
Argument
workers
of future strategies may now also be a function, which is called without arguments when the future strategy is set up and whose value is used as is. For instance, plan(multiprocess, workers = halfCores)
where halfCores <- function() { max(1, round(
availableCores() / 2)) }
will use half of the number of available cores. This is useful when using nested future strategies with remote machines. -
On Windows,
makeClusterPSOCK()
, and therefore plan(multisession)
and plan(multiprocess)
, will use the SSH client distributed with RStudio as a fallback if neither ssh
nor plink
is available on the system PATH
. -
Now
plan()
makes sure that nbrOfWorkers()
will work for the new strategy. This will help catch mistakes such as plan(cluster, workers = cl)
where cl
is a basic R list rather than a cluster
list early on. -
Added
%packages%
to explicitly control packages to be attached when a future is resolved, e.g. y %<-% { YT[2] } %packages% "data.table"
. Note, this is only needed in cases where the automatic identification of global and package dependencies is not sufficient. -
Added condition classes
FutureCondition
, FutureMessage
, FutureWarning
, and FutureError
representing conditions that occur while a future is set up, launched, queried, or retrieved. They do not represent conditions that occur while evaluating the future expression. For those conditions, new classes FutureEvaluationCondition
, FutureEvaluationMessage
, FutureEvaluationWarning
, and FutureEvaluationError
exist.
- Vignette 'Common Issues with Solutions' now documents the case
where the future framework fails to identify a variable as being
global because it is only so conditionally, e.g.
if (runif(1) < 1/2) x <- 0; y <- 2 * x
.
- Added mechanism for detecting globals that may not be exportable
to an external R process (a "worker"). Typically, globals that
carry connections and external pointers (
externalptr
) cannot be exported, but there are exceptions. By setting option future.globals.onReference
to "warning"
, a warning is produced informing the user about potential problems. If "error"
, an error is produced. Because there might be false positives, the default is "ignore"
, which will cause the above scans to be skipped. If there are non-exportable globals and these tests are skipped, a run-time error may be produced only when the future expression is evaluated.
-
The total size of global variables was overestimated, and dramatically so if they were defined in the global environment and there were large objects there too. This would sometimes result in a false error saying that the total size is larger than the allowed limit.
-
An assignment such as
x <- x + 1
where the left-hand side (LHS) x
is a global failed to identify x
as a global because the right-hand side (RHS) x
would override it as a local variable. Updates to the globals package fixed this problem. -
makeClusterPSOCK(..., renice = 19)
would launch each PSOCK worker via nice +19
resulting in the error "nice: '+19': No such file or directory". This bug was inherited from parallel::makePSOCKcluster()
. Now using nice --adjustment=19
instead. -
Protection against passing future objects to other futures did not work for future strategy 'multicore'.
-
future_lapply()
has moved to the new future.apply package available on CRAN. The future::future_lapply()
function will soon be deprecated, then defunct, and eventually be removed from the future package. Please update your code to make use of future.apply::future_lapply()
instead. -
Dropped defunct 'eager' and 'lazy' futures; use 'sequential' instead.
-
Dropped defunct arguments
cluster
and maxCores
; use workers
instead. -
In previous versions of the future package, the
FutureError
class was used to represent both orchestration errors (now FutureError
) and evaluation errors (now FutureEvaluationError
). Any usage of class FutureError
for the latter type of errors is deprecated and should be updated to FutureEvaluationError
.
-
Now
plan()
also accepts strings such as "future::cluster"
. -
Now
backtrace(x[[ER]])
also works for non-environment x
:s, e.g. lists.
-
When measuring the size of globals by scanning their content, for certain types of classes the inferred lengths of these objects were incorrect causing internal subset out-of-range issues.
-
print()
for Future
would output one global per line instead of concatenating the information with commas.
- Now exporting
getGlobalsAndPackages()
.
future_lapply()
would give "Error in objectSize.env(x, depth = depth - 1L): object 'nnn' not found" when for instance 'nnn' is part of an unresolved expression that is an argument value.
- FIX: Some of the package assertion tests made too precise assumptions about the object sizes, which fail with the introduction of ALTREP in R-devel, which causes R's SEXP header size to change.
- Now
tweak()
, and hence plan()
, generates a more informative error message if a non-future function is specified by mistake, e.g. calling plan(cluster)
with the survival package attached after future is equivalent to calling plan(survival::cluster)
when plan(future::cluster)
was intended.
nbrOfWorkers()
gave an error with plan(remote)
. Fixed by making the 'remote' future inherit cluster
(as it should).
- TESTS: No longer testing forced termination of forked cluster
workers when running on Solaris. The termination was done by
launching a future that called
quit()
, but that appeared to have corrupted the main R session when running on Solaris.
-
Formally defunct 'eager' and 'lazy' futures; use 'sequential' instead.
-
Dropped previously defunct
%<=%
and %=>%
operators.
- Multicore and multisession futures no longer reserve one core for the main R process, which was done to lower the risk for producing a higher CPU load than the number of cores available for the R session.
-
makeClusterPSOCK()
now defaults to using the Windows PuTTY software's SSH client plink -ssh
, if ssh
is not found. -
Argument
homogeneous
of makeNodePSOCK()
, a helper function of makeClusterPSOCK()
, will default to FALSE also if the hostname is a fully qualified domain name (FQDN), that is, it "contains periods". For instance, c('node1', 'node2.server.org')
will use homogeneous = TRUE
for the first worker and homogeneous = FALSE
for the second. -
makeClusterPSOCK()
now asserts that each cluster node is functioning by retrieving and recording the node's session information including the process ID of the corresponding R process. -
Nested futures set option
mc.cores
to prevent spawning of recursive parallel processes by mistake. Because 'mc.cores' controls additional processes, it was previously set to zero. However, since some functions such as mclapply()
do not support that, it is now set to one instead.
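The FQDN rule for `homogeneous` described above can be sketched in base R. This is a simplified illustration of the "contains periods" test only, not the package's actual logic, which may take other criteria into account.

```r
# Hostnames from the example in the entry above
workers <- c("node1", "node2.server.org")

# Simplified rule: a hostname containing a period is treated as an FQDN
is_fqdn <- grepl(".", workers, fixed = TRUE)

# FQDN workers get homogeneous = FALSE, the others homogeneous = TRUE
homogeneous <- !is_fqdn
print(setNames(homogeneous, workers))
```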
- Help on
makeClusterPSOCK()
gained more detailed descriptions on arguments and what their defaults are.
- Formally deprecated eager futures; use sequential instead.
-
future_lapply()
with multicore / multisession futures, would use suboptimal workload balancing where it split up the data into one chunk too many. This is no longer a problem because of how argument workers
is now defined for those types of futures (see note on top). -
future_lapply()
, as well as lazy multicore and lazy sequential futures, did not respect option future.globals.resolve
, but were hardcoded to always resolve globals (future.globals.resolve = TRUE
). -
When globals larger than the allowed size (option
future.globals.maxSize
) are detected, an informative error message is generated. A previous version introduced a bug causing that error to produce another error. -
Lazy sequential futures would produce an error when resolved if required packages had been detached.
-
print()
would not display globals gathered for lazy sequential futures.
-
Added package tests for globals part of formulas part of other globals, e.g.
purrr::map(x, ~ rnorm(.))
, which requires globals (>= 0.10.0). -
Now package tests with
parallel::makeCluster()
not only test for type = "PSOCK"
clusters but also "FORK"
(when supported). -
TESTS: Cleaned up test scripts such that the overall processing time for the tests was roughly halved, while preserving the same test coverage.
- The default for
future_lapply()
is now to not generate RNG seeds (future.seed = FALSE
). If proper random number generation is needed, use future.seed = TRUE
. For more details, see help page.
-
future()
and future_lapply()
gained argument packages
for explicitly specifying packages to be attached when the futures are evaluated. Note that the default throughout the future package is that all globals and all required packages are automatically identified and gathered, so in most cases those do not have to be specified manually. -
The default values for arguments
connectTimeout
and timeout
of makeNodePSOCK()
can now be controlled via global options.
-
Now
future_lapply()
guarantees that the RNG state of the calling R process after returning is updated compared to what it was before and in the exact same way regardless of future.seed
(except FALSE), future.scheduling
and future strategy used. This is done in order to guarantee that an R script calling future_lapply()
multiple times should be numerically reproducible given the same initial seed. -
It is now possible to specify a pre-generated sequence of
.Random.seed
seeds to be used for each FUN(x[[i]], ...)
call in future_lapply(x, FUN, ...)
.
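A pre-generated sequence of seeds like the one described above could be built with base R's parallel package. The following is a minimal sketch under stated assumptions, not necessarily how future_lapply() works internally: make_seeds() and draw_one() are hypothetical helper names, and the idea that the resulting list is what gets supplied through the future.seed argument is an assumption to verify against the help page.

```r
# Hypothetical helper: create n independent L'Ecuyer-CMRG seeds,
# one per element, starting from a single initial seed.
make_seeds <- function(n, seed = 42L) {
  RNGkind("L'Ecuyer-CMRG")  # note: this changes the RNG kind of the session
  set.seed(seed)
  s <- get(".Random.seed", envir = globalenv())
  seeds <- vector("list", length = n)
  for (i in seq_len(n)) {
    seeds[[i]] <- s
    s <- parallel::nextRNGStream(s)  # move on to the next independent stream
  }
  seeds
}

# Hypothetical helper: evaluate one "FUN(x[[i]])"-style call under a given seed.
draw_one <- function(seed) {
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(1)
}

seeds <- make_seeds(4)
y_forward  <- vapply(seeds, draw_one, numeric(1))
y_backward <- vapply(rev(seeds), draw_one, numeric(1))

# The result for element i depends only on seeds[[i]], not on the order
# in which the elements happen to be processed.
print(identical(y_forward, rev(y_backward)))
```

Because each element carries its own seed, the numbers drawn for a given element stay the same no matter how the work is chunked or which worker processes it.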
future_lapply()
scans global variables for non-resolved futures (to resolve them) and calculates their total size once. Previously, each chunk (a future) would redo this.
-
Now
future_lapply(X, FUN, ...)
identifies global objects among X
, FUN
and ...
recursively until no new globals are found. Previously, only the first level of globals was scanned. This is mostly thanks to a bug fix in globals 0.9.0. -
A future that used a global object
x
of a class that overrides length()
would produce an error if length(x)
reports more elements than what can be subsetted. -
nbrOfWorkers()
gave an error with plan(cluster, workers = cl)
where cl
is a cluster
object created by parallel::makeCluster()
, etc. This prevented, for instance, future_lapply()
from working with such setups. -
plan(cluster, workers = cl)
where cl <- makeCluster(..., type = "MPI")
would give an instant error due to an invalid internal assertion.
-
Previously deprecated arguments
maxCores
and cluster
are now defunct. -
Previously deprecated assignment operators
%<=%
and %=>%
are now defunct. -
availableCores(method = "mc.cores")
is now defunct in favor of "mc.cores+1"
.
- Where applicable, workers are now initiated when calling
plan()
, e.g. plan(cluster)
will set up workers on all cluster nodes. Previously, this only happened when the first future was created.
-
Renamed 'eager' futures to 'sequential', e.g.
plan(sequential)
. The 'eager' futures will be deprecated in an upcoming release. -
Added support for controlling whether a future is resolved eagerly or lazily when creating the future, e.g.
future(..., lazy = TRUE)
, futureAssign(..., lazy = TRUE)
, and x %<-% { ... } %lazy% TRUE
. -
future()
, futureAssign()
and futureCall()
gained argument seed
, which specifies a L'Ecuyer-CMRG random seed to be used by the future. The seed for future assignment can be specified via %seed%
. -
futureAssign()
now passes all additional arguments to future()
. -
Added
future_lapply()
which supports load balancing ("chunking") and perfect reproducibility (regardless of type of load balancing and how futures are resolved) via initial random seed. -
Added
availableWorkers()
. By default it returns localhost workers according to availableCores()
. In addition, it detects common HPC allocations given in environment variables set by the HPC scheduler. -
The default for
plan(cluster)
is now workers = availableWorkers()
. -
Now
plan()
stops any clusters that were implicitly created. For instance, a multisession cluster created by plan(multisession)
will be stopped when plan(eager)
is called. -
makeClusterPSOCK()
treats workers that refer to a local machine by its local or canonical hostname as "localhost". This avoids having to launch such workers over SSH, which may not be supported on all systems / compute clusters. -
Option
future.debug = TRUE
also reports on total size of globals identified and for cluster futures also the size of the individual global variables exported. -
Option
future.wait.timeout
(replaces future.wait.times
) specifies the maximum waiting time for a free worker (e.g. a core or a compute node) before generating a timeout error. -
Option
future.availableCores.fallback
, which defaults to environment variable R_FUTURE_AVAILABLECORES_FALLBACK
, can now be used to specify the default number of cores / workers returned by availableCores()
and availableWorkers()
when no other settings are available. For instance, if R_FUTURE_AVAILABLECORES_FALLBACK=1
is set system-wide in an HPC environment, then all R processes that use availableCores()
to detect how many cores can be used will run as single-core processes. Without this fallback setting, and without other core-specifying settings, the default will be to use all cores on the machine, which does not play well on multi-user systems.
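The load balancing ("chunking") mentioned for future_lapply() in the list above can be pictured with base R alone. The sketch below uses parallel::splitIndices() as a stand-in for the chunking step; the number of workers is an assumed value chosen for illustration, and the real function may partition the work differently.

```r
x <- as.list(1:10)
n_workers <- 3L  # assumed number of workers, for illustration only

# Split the element indices into one contiguous chunk per worker.
chunks <- parallel::splitIndices(length(x), n_workers)

# Each chunk would be handed to one future; here it is simply processed in turn.
res_chunks <- lapply(chunks, function(idx) {
  lapply(x[idx], function(v) v^2)
})

# Concatenate the per-chunk results back into a single list in the original order.
res <- unlist(res_chunks, recursive = FALSE)
print(lengths(chunks))
```

Processing three chunks instead of ten individual elements keeps the per-future overhead low while still producing the same result as a plain lapply() call.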
- Globals part of locally defined functions are now also identified thanks to globals (>= 0.8.0) updates.
- Lazy futures and
plan(lazy)
are now deprecated. Instead, use plan(eager)
and then f <- future(..., lazy = TRUE)
or x %<-% { ... } %lazy% TRUE
. The reason behind this is that in some cases code that uses futures only works under eager evaluation (lazy = FALSE
; the default), or vice versa. By removing the "lazy" future strategy, the user can no longer override the lazy = TRUE / FALSE
that the developer is using.
-
Creation of cluster futures (including multisession ones) would time out already after 40 seconds if all workers were busy. New default timeout is 30 days (option
future.wait.timeout
). -
nbrOfWorkers()
gave an error for plan(cluster, workers)
where workers
was a character vector or a cluster
object of the parallel package. Because of this, future_lapply()
gave an error with such setups. -
availableCores(methods = "_R_CHECK_LIMIT_CORES_")
would give an error if not running R CMD check
.
-
Added
makeClusterPSOCK()
- a version of parallel::makePSOCKcluster()
that allows for more flexible control of how PSOCK cluster workers are set up and how they are launched and communicated with if running on external machines. -
Added generic
as.cluster()
for coercing objects to cluster objects to be used as in plan(cluster, workers = as.cluster(x))
. Also added a c()
implementation for cluster objects such that multiple cluster objects can be combined into a single one. -
Added
sessionDetails()
for gathering details of the current R session. -
plan()
and plan("list")
now print more user-friendly output. -
On Unix, internal
myInternalIP()
tries more alternatives for finding the local IP number.
%<=%
is deprecated. Use %<-%
instead. Same for %=>%
.
-
values()
for lists and list environments of futures where one or more of the futures resolved to NULL would give an error. -
value()
for ClusterFuture
would give the cryptic error message "Error in stop(ex) : bad error message" if the cluster worker had crashed / terminated. Now it will instead give an error message like "Failed to retrieve the value of ClusterFuture
from cluster node #1 on 'localhost'. The reason reported was 'error reading from connection'". -
Argument
user
to remote()
was ignored (since 1.1.0).
- For the special case where 'remote' futures use
workers = "localhost"
they (again) use the exact same R executable as the main / calling R session (in all other cases it uses whatever Rscript
is found in the PATH
). This was indeed already implemented in 1.0.1, but with the added support for reverse SSH tunnels in 1.1.0 this default behavior was lost.
-
REMOTE CLUSTERS: It is now very simple to use
cluster()
and remote()
to connect to remote clusters / machines. As long as you can connect via SSH to those machines, it works also with these futures. The new code completely avoids the incoming-firewall and incoming port-forwarding setup previously needed. This is done by using reverse SSH tunneling. There is also no need to worry about internal or external IP numbers. -
Added optional argument
label
to all futures, e.g. f <- future(42, label = "answer")
and v %<-% { 42 } %label% "answer"
. -
Added argument
user
to cluster()
and remote()
. -
Now all
Future
classes support run()
for launching the future and value()
calls run()
if the future has not been launched. -
MEMORY: Now
plan(cluster, gc = TRUE)
causes the background R session to be garbage collected immediately after the value is collected. Since multisession and remote futures are special cases of cluster futures, the same is true for these as well. -
ROBUSTNESS: Now the default future strategy is explicitly set when no strategies are set, e.g. when using nested futures. Previously, only mc.cores was set so that only a single core was used, but now also
plan("default")
is set. -
WORKAROUND:
resolved()
on cluster futures would block on Linux until the future was resolved. This is due to a bug in R. The workaround is to round the timeout (in seconds) to an integer, which seems to always work / be respected.
-
Global variables part of subassignments in future expressions are recognized and exported (iff found), e.g.
x$a <- value
,x[["a"]] <- value
, and x[1,2,3] <- value
. -
Global variables part of formulae in future expressions are recognized and exported (iff found), e.g.
y ~ x | z
. -
As an alternative to the default automatic identification of globals, it is now also possible to explicitly specify them either by their names (as a character vector) or by their names and values (as a named list), e.g.
f <- future({ 2*a }, globals = c("a"))
or f <- future({ 2*a }, globals = list(a = 42))
. For future assignments one can use the %globals%
operator, e.g. y %<-% { 2*a } %globals% c("a")
.
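The two ways of specifying globals explicitly, as described in the list above, can be tried side by side. This is a small sketch that assumes a recent version of the future package is installed and that plan(sequential) is available (older releases called this strategy "eager"); the variable names are illustrative only.

```r
library(future)
plan(sequential)  # assumption: evaluate futures in the current R session

a <- 42

# Globals given by name: the value of 'a' is looked up where the future is created.
f1 <- future({ 2 * a }, globals = c("a"))

# Globals given by name and value: the future uses a = 21 regardless of the 'a' above.
f2 <- future({ 2 * a }, globals = list(a = 21))

# The same by-name specification for a future assignment.
y %<-% { 2 * a } %globals% c("a")

print(c(value(f1), value(f2), y))
```

Passing a named list is handy when the value to export differs from whatever happens to be bound to that name in the calling environment.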
- Added vignette on command-line options and other methods for controlling the default type of futures to use.
-
ROBUSTNESS: For the special case where 'remote' futures use
workers = "localhost"
they now use the exact same R executable as the main / calling R session (in all other cases it uses whatever Rscript
is found in the PATH
). -
FutureError
now extends simpleError
and no longer the error class of captured errors.
- Adding section to vignette on globals in formulas describing how they are currently not automatically detected and how to explicitly export them.
-
Since future 0.13.0, a global
pkg
would be overwritten by the name of the last package attached in future. -
Futures that generated
R.oo::Exception
errors triggered another internal error.
-
Add support for
remote(..., myip = "<external>")
, which now queries a set of external lookup services in case one of them fails. -
Add
mandelbrot()
function used in demo to the API for convenience. -
ROBUSTNESS: If
.future.R
script, which is sourced when the future package is attached, gives an error, then the error is ignored with a warning. -
TROUBLESHOOTING: If the future requires attachment of packages, then each namespace is loaded separately and before attaching the package. This is done in order to see the actual error message in case there is a problem while loading the namespace. With
require()
/library()
this error message is otherwise suppressed and replaced with a generic one.
-
Falsely identified global variables no longer generate an error when the future is created. Instead, we leave it to R and the evaluation of the individual futures to throw an error if a global variable is truly missing. This was done in order to automatically handle future expressions that use non-standard evaluation (NSE), e.g.
subset(df, x < 3)
where x
is falsely identified as a global variable. -
Dropped support for system environment variable
R_FUTURE_GLOBALS_MAXSIZE
.
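Why x in subset(df, x < 3) gets flagged as a global can be seen with a static code scan. The sketch below uses codetools::findGlobals() directly as a stand-in; the future package relies on the globals package for this step, which is assumed here to behave similarly for this example. The data frame and the wrapper function fn() are made up for illustration.

```r
df <- data.frame(x = 1:5)

# Wrap the expression in a function so that it can be scanned statically.
fn <- function() subset(df, x < 3)

# A static scan cannot know that subset() evaluates 'x' inside 'df',
# so 'x' is reported next to the genuine global 'df'.
found <- codetools::findGlobals(fn, merge = TRUE)
print(found)

# Evaluation still works although no global variable 'x' exists,
# which is why a falsely identified global should not be a hard error.
res <- fn()
print(nrow(res))
```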
-
DEMO: Now the Mandelbrot demo tiles a single Mandelbrot region with one future per tile. This better illustrates parallelism.
-
Documented R options used by the future package.
-
Custom futures based on a constructor function that is defined outside a package gave an error.
-
plan("default")
assumed that the future.plan
option was a string; gave an error if it was a function. -
Various future options were not passed on to futures.
-
A startup
.future.R
script is no longer sourced if the future package is attached by a future expression.
-
Added remote futures, which are cluster futures with convenient default arguments for simple remote access to R, e.g.
plan(remote, workers = "login.my-server.org")
. -
Now
.future.R
(if found in the current directory or otherwise in the user's home directory) is sourced when the future package is attached (but not loaded). This helps separate scripts from configuration of futures. -
Added support for
plan(cluster, workers = c("n1", "n2", "n2", "n4"))
, where workers
(also for ClusterFuture()
) is a set of host names passed to parallel::makeCluster(workers)
. It can also be the number of localhost workers. -
Added command line option
--parallel=<p>
, which is long for -p <p>
. -
Now command line option
-p <p>
also sets the default future strategy to multiprocessing (if p >= 2 and eager otherwise), unless another strategy is already specified via option future.plan
or system environment variable R_FUTURE_PLAN
. -
Now
availableCores()
also acknowledges environment variable NSLOTS
set by Sun/Oracle Grid Engine (SGE). -
MEMORY: Added argument
gc = FALSE
to all futures. When TRUE, the garbage collector will run at the very end in the process that evaluated the future (just before returning the value). This may help lower the overall memory footprint when running multiple parallel R processes. The user can enable this by specifying plan(multiprocess, gc = TRUE)
. The developer can control this using future(expr, gc = TRUE)
or v %<-% { expr } %tweak% list(gc = TRUE)
.
- Significantly decreased the overhead of creating a future, particularly multicore futures.
- Future would give an error with
plan(list("eager"))
, whereas it did work with plan("eager")
and plan(list(eager))
.
-
Added
nbrOfWorkers()
. -
Added informative
print()
method for the Future
class. -
values()
passes arguments ...
to value()
of each future. -
Added
FutureError
class.
- Renamed arguments
maxCores
and cluster
to workers
. If using the old argument names a deprecation warning will be generated, but it will still work until made defunct in a future release.
resolve()
for lists and environments did not work properly when the set of futures was not resolved in order, which could happen with asynchronous futures.
-
Add support to
plan()
for specifying different future strategies for the different levels of nested futures. -
Add
backtrace()
for listing the trace of the expressions evaluated (the calls made) before a condition was caught. -
Add transparent futures, which are eager futures with early signaling of conditions enabled and whose expression is evaluated in the calling environment. This makes the evaluation of such futures as similar as possible to how R evaluates expressions, which in turn simplifies troubleshooting errors, etc.
-
Add support for early signaling of conditions. The default is (as before) to signal conditions when the value is queried. In addition, they may be signaled as soon as possible, e.g. when checking whether a future is resolved or not.
-
Signaling of conditions when calling
value()
is now controlled by argument signal
(previously onError
). -
Now
UniprocessFuture
:s capture the call stack for errors occurring while resolving futures. -
ClusterFuture()
gained argument persistent = FALSE
. With persistent = TRUE
, any objects in the cluster R session that were created during the evaluation of a previous future are available for succeeding futures that are evaluated in the same session. Moreover, globals are still identified and exported but "missing" globals will not give an error - instead it is assumed such globals are available in the environment where the future is evaluated. -
OVERHEAD: Utility functions exported by
ClusterFuture
are now much smaller; previously they would export all of the package environment.
-
f <- multicore(NA, maxCores = 2)
would end up in an endless waiting loop for a free core if availableCores()
returned one. -
ClusterFuture()
would ignore local = TRUE
.
-
Added multiprocess futures, which are multicore futures if supported, otherwise multisession futures. This makes it possible to use
plan(multiprocess)
everywhere regardless of operating system. -
Future strategy functions gained class attributes such that it is possible to test what type of future is currently used, e.g.
inherits(plan(), "multicore")
. -
ROBUSTNESS: It is only the R process that created a future that can resolve it. If a non-resolved future is queried by another R process, then an informative error is generated explaining that this is not possible.
-
ROBUSTNESS: Now
value()
for multicore futures detects if the underlying forked R process was terminated before completing and if so generates an informative error message.
- Adjusted the parameters for the schema used to wait for next available cluster node such that nodes are polled more frequently.
-
resolve()
gained argument recursive
. -
Added option
future.globals.resolve
for controlling whether global variables should be resolved for futures or not. If TRUE, then globals are searched recursively for any futures and if found such "global" futures are resolved. If FALSE, global futures are not located, but if the parent future later tries to resolve them, then an informative error message is generated clarifying that only the R process that created the future can resolve it. The default is currently FALSE.
-
FIX: Exports of objects available in packages already attached by the future were still exported.
-
FIX: Now
availableCores()
returns 3L
(=2L+1L
) instead of 2L
if _R_CHECK_LIMIT_CORES_
is set.
-
Add multisession futures, which analogously to multicore ones, use multiple cores on the local machine with the difference that they are evaluated in separate R sessions running in the background rather than separate forked R processes. A multisession future is a special type of cluster future that does not require explicit setup of cluster nodes.
-
Add support for cluster futures, which can make use of a cluster of nodes created by
parallel::makeCluster()
. -
Add
futureCall()
, which is for futures what do.call()
is otherwise. -
Standardized how options are named, i.e.
future.<option>
. If you used any future options previously, make sure to check they follow the above format.
- All futures now validate globals by default (
globals = TRUE
).
-
Now
%<=%
can also assign to multi-dimensional list environments. -
Add
futures()
, values()
and resolved()
. -
Add
resolve()
to resolve futures in lists and environments. -
Now
availableCores()
also acknowledges the number of CPUs allotted by Slurm. -
CLEANUP: Now the internal future variable created by
%<=%
is removed when the future variable is resolved.
futureOf(envir = x)
did not work properly when x
was a list environment.
-
ROBUSTNESS: Now values of environment variables are trimmed before being parsed.
-
ROBUSTNESS: Add reproducibility test for random number generation using Pierre L'Ecuyer's RNG stream regardless of how futures are evaluated, e.g. eager, lazy and multicore.
- Now globals ("unknown" variables) are identified using the new
findGlobals(..., method = "ordered")
in globals (> 0.5.0) such that a global variable preceding a local variable with the same name is properly identified and exported/frozen.
- Updated vignette on common issues with the case where a global variable is not identified because it is hidden by an element assignment in the future expression.
- Errors occurring in multicore futures could prevent further multicore futures from being created.
-
Globals that were copies of package objects were not exported to the future environments.
-
The future package had to be attached or
future::future()
had to be imported, if %<=%
was used internally in another package. Similarly, it also had to be attached if multicore futures were used.
- Added vignette 'Futures in R: Common issues with solutions'.
eager()
and multicore()
gained argument globals
, where globals = TRUE
will validate that all global variables identified can be located already before the future is created. This provides the means for providing the same tests on global variables with eager and multicore futures as with lazy futures.
lazy(sum(x, ...), globals = TRUE)
now properly passes ...
from the function from which the future is set up. If not called within a function or called within a function without ...
arguments, an informative error message is thrown.
-
plan("default")
resets to the default strategy, which is synchronous eager evaluation unless option future_plan
or environment variable R_FUTURE_PLAN
has been set. -
availableCores("mc.cores")
returns getOption("mc.cores") + 1L
, because option mc.cores
specifies "allowed number of additional R processes" to be used in addition to the main R process.
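The "+ 1" rule described above is simple arithmetic on the option value. The following sketch uses only base R; cores_from_mc_cores() is a hypothetical helper written for illustration and is not part of the package's API.

```r
# Hypothetical helper: total number of R processes implied by option 'mc.cores',
# i.e. the allowed number of additional processes plus the main R process.
cores_from_mc_cores <- function() {
  mc <- getOption("mc.cores")
  if (is.null(mc)) return(NA_integer_)  # option not set
  as.integer(mc) + 1L
}

old <- options(mc.cores = 1L)  # allow one additional R process
n <- cores_from_mc_cores()
options(old)                   # restore the previous setting

print(n)
```

With mc.cores set to one, the helper reports two processes in total: the main R process and the single additional one it is allowed to spawn.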
plan(future::lazy)
and similar gave errors.
-
multicore()
gained argumentmaxCores
, which makes it possible to use for instance plan(multicore, maxCores = 4L)
. -
Add
availableMulticore()
[from (in-house) async package].
- More colorful
demo("mandelbrot", package = "future")
.
-
ROBUSTNESS:
multicore()
blocks until one of the CPU cores is available, iff all are currently occupied by other multicore futures. -
old <- plan(new)
now returns the old plan/strategy (was the newly set one).
- Add multicore futures, which are futures that are resolved asynchronously in a separate process. These are only supported on Unix-like systems, but not on Windows.
-
Eager and lazy futures now record the result internally such that the expression is only evaluated once, even if their error values are requested multiple times.
-
Eager futures are always created regardless of error or not.
-
All
Future
objects are environments themselves that record the expression, the call environment and optional variables.
lazy()
"freezes" global variables at the time when the future is created. This way the result of a lazy future is more likely to be the same as an 'eager' future. This is also how globals are likely to be handled by asynchronous futures.
plan()
records the call.
- Added
demo("mandelbrot", package = "future")
, which can be re-used by other future packages.
-
Added
plan()
. -
Added eager future - useful for troubleshooting.
- Distilled Future API from (in-house) async package.