Description
Outline of the current situation (as I understand it)
I don't know about others, but my main use for case_when and case_match is to selectively replace values in a column. For example:
library(dplyr)

df <- tibble(cola = letters[1:5])
df |>
  mutate(cola = case_match(cola, c("b", "e") ~ "changed", .default = cola))
This works fine, but I regularly wish the .default value didn't need to be specified (it defaults to NULL). For example, when case_match is used as part of a piped series of functions:
df |>
  mutate(cola = cola |>
    toupper() |>
    case_match(c("B", "E") ~ "changed")
  )
This replaces every unmatched value with NA, which is not what I want, and it's difficult to pass the piped vector as the .default value. It can be done by wrapping the call in an anonymous function, but that's a bit cumbersome:
df |>
  mutate(cola = cola |>
    toupper() |>
    (\(x) case_match(x, c("B", "E") ~ "changed", .default = x))()
  )
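To make the difference concrete, here is a minimal sketch on a bare vector rather than a tibble column. It assumes dplyr 1.1.0 or later (where case_match is available) and R 4.1 or later; the variable names are illustrative only.

```r
library(dplyr)

x <- toupper(letters[1:5])

# Without .default, every value that matches no case becomes NA
without_default <- case_match(x, c("B", "E") ~ "changed")

# Passing the input as .default leaves unmatched values untouched
with_default <- case_match(x, c("B", "E") ~ "changed", .default = x)

print(without_default)
print(with_default)
```

The second call is what the anonymous-function workaround above does inside the pipe.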
Suggested solution
My suggestion is a new function, case_replace. It would be the same as case_match, except without the .default parameter, which would always be set to the input vector (the use case is similar to base::replace, but that function requires specifying by index instead of value):
case_replace <- \(x, ...) case_match(.x = x, ..., .default = x)
With that, we can pipe a vector nicely:
df |>
  mutate(cola = cola |>
    toupper() |>
    case_replace(c("B", "E") ~ "changed")
  )
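As a rough comparison with base::replace, the sketch below applies both approaches to the same bare vector. case_replace is the hypothetical helper proposed in this issue, not an existing dplyr function, and the example assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical helper from this proposal; not part of dplyr
case_replace <- \(x, ...) case_match(.x = x, ..., .default = x)

x <- toupper(letters[1:5])

# base::replace selects by position or logical mask, so the value
# match has to be spelled out by hand with %in%
by_index <- replace(x, x %in% c("B", "E"), "changed")

# case_replace matches on the values directly
by_value <- case_replace(x, c("B", "E") ~ "changed")

# Both approaches should agree on this input
stopifnot(identical(by_index, by_value))
print(by_value)
```

For a single replacement the two are about equally compact; the formula syntax should pay off mainly when several value-to-replacement mappings are given in one call.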
I realise I'm nitpicking here, and case_match is really close to what I need. I've just found this to be useful for me, and if others have similar needs, maybe it would be worth adding as a separate function in dplyr. I'd be happy to have a go at contributing a PR if the interest is there.