str_replace_all not working in pipeline

11,209

Edit: It appears @akrun has deleted his answer, so it is reproduced below. I made the (perhaps faulty) assumption that you wished to transform the whole dataframe, and that the dataframe was appropriately formatted. A clearer question with sample data would avoid this kind of guesswork.

It's a little hard to tell with no clue as to the values of your global variables and no data (you can generate fake data, btw, as long as it presents the same issue), but my guess is below.

When piping, the previous result is passed in as the first argument of the next function. You can see this in the error message: Error in str_replace_all(., string, pattern, replacement). The . is the piped-in argument. str_replace_all takes only three arguments (string, pattern, replacement), but this call supplies four. The piped-in result is therefore used as string, your "string" is used as pattern, your "pattern" is used as replacement, and the "" you put in for replacement is left over as an unused argument, causing your error.

It might help to drop the extra argument and use str_replace_all(pattern, replacement), or to name the arguments: str_replace_all(pattern = pattern, replacement = replacement)
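A minimal sketch of the argument shift described above, run on a plain character vector. The values of x, string, pattern and replacement are made up for illustration, since the question does not show the real ones.

```r
library(stringr)
library(magrittr)  # provides %>%

# Hypothetical stand-ins for the question's global variables
x <- c("foo", "boo")
string <- x
pattern <- "oo"
replacement <- ""

# The pipe already supplies x as `string`, so only two arguments remain
fixed <- x %>% str_replace_all(pattern, replacement)

# Naming the arguments gives the same result
named <- x %>% str_replace_all(pattern = pattern, replacement = replacement)

# Passing `string` as well means four arguments for a three-argument function;
# the error is captured here instead of stopping the script
err <- tryCatch(
  x %>% str_replace_all(string, pattern, replacement),
  error = function(e) conditionMessage(e)
)
print(fixed)
print(err)
```

Here fixed and named are identical, and err holds the "unused argument" message.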

Ex.

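library(stringr)
library(magrittr)  # provides %>%

data <- as.data.frame(matrix(ncol=2, nrow=2))
data$V1 <- c("  NA", "foo")
data$V2 <- c("bar", "boo")
# note: the result is a character vector, not a dataframe (see the comments)
data %>%
    str_replace_all("oo", "xx")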
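If the goal is to transform every character column and still get a dataframe back, one option is to combine mutate with across. This is only a sketch, not something from the original answer, and it assumes dplyr 1.0.0 or later (where across was introduced).

```r
library(dplyr)
library(stringr)

data <- data.frame(
  V1 = c("  NA", "foo"),
  V2 = c("bar", "boo"),
  stringsAsFactors = FALSE
)

# Apply the replacement to each character column; the result stays a dataframe
out <- data %>%
  mutate(across(where(is.character), ~ str_replace_all(.x, "oo", "xx")))

print(out)
```

Both columns keep their names, with "foo" and "boo" becoming "fxx" and "bxx".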

If you only wish to transform one column (from @akrun): use mutate to create a new column based on the existing column. If you wish to replace the column instead, give the new column the same name:

Ex.

library(dplyr)    # provides mutate and %>%
library(stringr)

data <- as.data.frame(matrix(ncol=2, nrow=2))
data$V1 <- c("  NA", "foo")
data$V2 <- c("bar", "boo")
data
    V1  V2
1   NA bar
2  foo boo

#new column
data %>%
    mutate(new = str_replace_all(V1, "oo", "xx"))

    V1  V2  new
1   NA bar   NA
2  foo boo  fxx

#column replacement
data %>%
    mutate(V1 = str_replace_all(V1, "oo", "xx"))

    V1  V2
1   NA bar
2  fxx boo
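
For the pipeline in the question (filter, then replace within one column and keep the whole dataframe), the same mutate approach slots in after filter. The dataframe, the filter condition and the pattern below are hypothetical, since the real data was not shared; the empty replacement string mirrors the question.

```r
library(dplyr)
library(stringr)

# Hypothetical data standing in for the unshared dataframe
df <- data.frame(
  id = 1:3,
  V1 = c("foo", "boo", "bar"),
  stringsAsFactors = FALSE
)

result <- df %>%
  filter(id < 3) %>%                           # placeholder filter condition
  mutate(V1 = str_replace_all(V1, "oo", ""))   # "" removes each match

print(result)
```

The result is still a dataframe with both columns, and only V1 is changed.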

MokeEire
Author by

MokeEire

Technologies: R, Python, SQL, STATA Interests: Behavioural economics, public policy, dogs

Updated on July 05, 2022

Comments

  • MokeEire
    MokeEire almost 2 years

    Here's my code:

    df <- df %>%
      filter(conditions x, y, and z) %>%
      str_replace_all(string, pattern, replacement)
    

    When doing this, I got the error:

    Error in str_replace_all(., string, pattern, replacement) :
      unused argument("")
    

    I know the code is not at all useful in terms of replication, as I've said before, I can't share the data, but assume the input was correct (I have since gotten it to work by mutating the variable instead). The replacement was an empty string, but that ought not to matter as far as I know.

    I'm just curious why str_replace_all does not work in a pipeline, anyone have any insight?

    • akrun
      akrun over 6 years
      You may need mutate, i.e. %>% mutate(col = str_replace_all(string, pattern, replacement)). What are pattern and replacement, BTW? Also, it is not clear whether filter(conditions x, y, and z) is pseudocode.
  • aku
    aku over 6 years
    It's not about the order but the superfluous "string" argument. I edited my answer, hopefully it's clearer now whether it's correct or not. It is definitely hard to tell given the unclear question though.
  • aku
    aku over 6 years
    So df %>% str_replace_all("[aeiou]", toupper) still worked for me. I just assumed, based on their code, that MokeEire didn't want new columns but the whole dataframe transformed. If they only wanted select columns transformed, then I agree, mutate would definitely be more appropriate.
  • MokeEire
    MokeEire over 6 years
    Yes, I figured that the piped result may be taking the position of the string argument. I think this is the issue. I had tried dropping the column selection but this then returned a character vector of the whole dataframe. I would need to pipe one column into the function for it to work then I'm assuming? There is no way to use str_replace_all on one column in a pipeline unless I select that sole column?
  • aku
    aku over 6 years
    I'm not completely sure what you're trying to do--transform one column and return just that, transform the whole dataframe, or transform one column and return the whole dataframe. In any of those cases, my edited answer should help make things clearer.
  • MokeEire
    MokeEire over 6 years
    I want to transform one column and return the whole dataframe; as I said in the initial post, I did use mutate to get the desired result. I was more curious why str_replace_all would not do the same thing.
  • aku
    aku over 6 years
    Honestly, I'm not sure why you would assume it DID do the same thing. If you haven't selected a particular column, how would it know to use that column? And it isn't meant to return anything more than the transformed result. As for why it wasn't written to do these things, that's likely because mutate() (as well as other, less efficient processes) already exists and is more widely applicable. But who knows, ask Hadley :P