{lightning} lets you generate random deviates from a range of probability distributions in Spark.
{lightning} is a personal learning project. I wanted to learn more about invoking Scala code through {sparklyr}, and chose to do so by generating random deviates with Spark's RandomRDDs singleton object.
You can install the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("nathaneastwood/lightning")
{lightning} provides two methods for generating random variates.
Firstly, we can generate N values from a Distribution class:
sc <- sparklyr::spark_connect(master = "local")
library(lightning)
norm <- Normal$new(sc = sc, size = 10L, num_partitions = 1L, seed = 1L)
norm$count()
# [1] 10
norm$collect()
# [1] -0.7364418 1.1537268 0.4631666 1.7794325 0.3503825 -1.2078423
# [7] 0.1825577 -0.2811541 0.1794811 -1.4066039
norm$first()
# [1] -0.7364418
norm$get_num_partitions()
# [1] 1
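Since collect() appears to return the draws as an ordinary numeric vector (judging by the output above), they can be sanity-checked locally with base R. The following is a minimal sketch under that assumption; summarise_draws() is a hypothetical helper and not part of {lightning}.

```r
# Hypothetical helper (not part of {lightning}): summarise a numeric vector
# of draws, such as the one returned by norm$collect().
summarise_draws <- function(x) {
  stopifnot(is.numeric(x))
  c(n = length(x), mean = mean(x), sd = stats::sd(x))
}

# The ten values printed by norm$collect() above, copied by hand so the
# example runs without a Spark connection.
draws <- c(
  -0.7364418, 1.1537268, 0.4631666, 1.7794325, 0.3503825,
  -1.2078423, 0.1825577, -0.2811541, 0.1794811, -1.4066039
)
summarise_draws(draws)
```

With a live connection, summarise_draws(norm$collect()) should give the same summary for the same seed, assuming the draws are reproducible across sessions.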
Secondly, we can generate single values from a Generator class:
norm_gen <- NormalGenerator$new(sc = sc)
norm_gen$set_seed(1L)
norm_gen$next_value()
# [1] -1.032273
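A generator returns one value per call, so drawing several values means calling next_value() repeatedly. The sketch below wraps that in a small helper. draw_n() and stub_gen are hypothetical names introduced for illustration only, and the helper assumes next_value() returns a single numeric value, as the output above suggests.

```r
# Hypothetical helper (not part of {lightning}): call next_value() n times
# on any object exposing that method and return the results as a vector.
draw_n <- function(gen, n) {
  vapply(seq_len(n), function(i) gen$next_value(), numeric(1))
}

# With a live connection this could be used as follows
# (assumes `norm_gen` was created as shown above):
# norm_gen$set_seed(1L)
# draw_n(norm_gen, 5L)

# Stand-in generator so the sketch runs without Spark; it is not a real
# {lightning} class.
stub_gen <- list(next_value = function() stats::rnorm(1))
draw_n(stub_gen, 5L)
```

For large samples the Distribution classes are likely the better fit, since each next_value() call presumably involves a separate round trip to the JVM.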
- Allow the ability to map() the distributions.
- Convert the RandomRDDs to a Spark DataFrame.
Thanks to jozefhajnala and yitao-li for their help on this topic.