Anonymous inline functions #56

Closed
ymer opened this issue Oct 24, 2023 · 3 comments

ymer commented Oct 24, 2023

This code successfully extracts the number:

df = DataFrame(ID1 = ["ID_1", "ID_23", "ID_456"])
extract_id(s) = split(s, "_")[end]
df = @mutate(df, ID = extract_id.(ID1))

I would like to do this without the helper function, but the following does not work:

df = @mutate(df, ID = (s -> split(s, "_")[end]).(ID1))

kdpsingh (Member) commented:

I'll take a look at this! Thank you for posting.

kdpsingh (Member) commented:

The challenge here is that TidierData thinks that s refers to a column name in the DataFrame. This is one of the downsides of using bare column names to refer to columns. One thing on the roadmap is to implement some dynamic scoping, whereby TidierData will "detect" that there is no column called s and will rewrite the code in a way that will make this example work.
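
To make the point concrete, here is a minimal sketch in plain Julia (no TidierData involved) showing that the same anonymous function broadcasts without trouble when nothing tries to resolve s as a column. The variable names ids, suffixes, and suffixes_map are made up for illustration and are not part of the original post.

```julia
# Plain Julia, outside any TidierData macro.
# `ids` is a hypothetical stand-in for the ID1 column.
ids = ["ID_1", "ID_23", "ID_456"]

# Here `s` is only the lambda's argument, so broadcasting works as expected.
suffixes = (s -> split(s, "_")[end]).(ids)

# Equivalent form using map instead of dot-broadcasting.
suffixes_map = map(s -> split(s, "_")[end], ids)

@assert suffixes == ["1", "23", "456"]
@assert suffixes == suffixes_map
```

So the failure in the original example comes from how bare names are interpreted inside @mutate, not from the anonymous function itself.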

For now, the best way to work with anonymous functions in TidierData is to use across():

julia> df = DataFrame(ID1 = ["ID_1", "ID_23", "ID_456"])
3×1 DataFrame
 Row │ ID1    
     │ String 
─────┼────────
   1 │ ID_1
   2 │ ID_23
   3 │ ID_456

julia> df = @mutate(df, across(ID1, s -> [split(x, "_")[end] for x in s]))
3×2 DataFrame
 Row │ ID1     ID1_function 
     │ String  SubString   
─────┼──────────────────────
   1 │ ID_1    1
   2 │ ID_23   23
   3 │ ID_456  456

kdpsingh (Member) commented:

Here's another approach that works:

julia> df = @chain df begin 
                @mutate(across(ID1, s -> split.(s, "_")))
                @unnest_wider(ID1_function)
            end
3×3 DataFrame
 Row │ ID1     ID1_function1  ID1_function2 
     │ String  SubString     SubString    
─────┼──────────────────────────────────────
   1 │ ID_1    ID             1
   2 │ ID_23   ID             23
   3 │ ID_456  ID             456
