Unexpected Symbol in As Formula, Can't Find

Solution 1

The <text>:2:10080 gives you the location of the error: 2nd line, 10080th character. Consider:

parse(text="1 + 1 + 2\n a - 3 b")
# Error in parse(text = "1 + 1 + 2\n a - 3 b") : 
#   <text>:2:8: unexpected symbol

Here, the error is with b, which is an illegal use of a symbol, and you'll note it is the 8th character of the second line.

Most likely you're missing a +, though there's no way of knowing without the data behind your error. Also, not to judge or anything, but that's a helluva lot of variables to be sticking into a model. I hope you have lots of data points.
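To see what actually sits at the reported position, one option is to split the text on newlines and print a window around the reported column. This is only a minimal sketch: `show_context`, `formula_text`, and the toy string are hypothetical stand-ins for the real input, and the location is pulled out of the error message with a regular expression.

```r
# Hypothetical helper: return the characters around a parse error location.
show_context <- function(text, line, col, width = 20) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  start <- max(1, col - width)
  substr(lines[line], start, col + width)
}

# Toy stand-in for the real formula text; the space inside "Campaign Youtube"
# is the kind of mistake that triggers "unexpected symbol".
formula_text <- "y ~\n Campaign_Withholding + Campaign Youtube + Source_A"

# Capture the parse error message instead of stopping.
msg <- tryCatch(
  { parse(text = formula_text); NA_character_ },
  error = function(e) conditionMessage(e)
)

# Extract line and column from a message of the form "<text>:LINE:COL: ...".
loc <- as.integer(
  regmatches(msg, regexec("<text>:([0-9]+):([0-9]+)", msg))[[1]][2:3]
)

# Print the neighbourhood of the reported position.
print(show_context(formula_text, line = loc[1], col = loc[2]))
```

With the real input, the same call with line = 2 and col = 10080 would show the text surrounding the offending symbol.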

Solution 2

Here is what worked for me as a workaround for this problem.

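# Convert each feature name into a syntactically valid R name
features = make.names(features)
# Join the names with " + " to form the right-hand side
right_side = paste0(features, collapse=" + ")
# Build the formula from the resulting string
fml = as.formula(sprintf(" ~ %s", right_side))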
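A small sketch of why this workaround helps, assuming the culprit is a name that is not syntactically valid (for instance, one containing a space). The `features` vector below is made up for illustration, and `reformulate()` is shown only as an alternative from base R's stats package, not as what this answer used.

```r
# Made-up feature names; the second one contains a space.
features <- c("Campaign_Withholding", "Campaign Youtube", "Source_A")

# Names that make.names() would change are the ones that break parsing.
bad <- features[features != make.names(features)]
print(bad)

# make.names() replaces invalid characters with "." so every term parses.
clean <- make.names(features, unique = TRUE)
fml <- as.formula(paste("~", paste(clean, collapse = " + ")))
print(fml)

# Alternative: keep the original names by wrapping each one in backticks.
fml_bt <- reformulate(sprintf("`%s`", features))
print(all.vars(fml_bt))
```

If the formula is later used with a data frame, the column names would need the same treatment, since make.names() changes what the terms are called.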
Author by

riders994

Python and R based Data Scientist

Updated on June 05, 2022

Comments

  • riders994
    riders994 almost 2 years

    I've been using as.formula for setting up a glm, and I can't figure out where the unexpected symbol is. Part of the problem is that the character vector I'm converting is so long. It's about 700 words with + inserted in between in order to turn it into a formula. The error presents as follows:

    Error in parse(text = x, keep.source = FALSE) : 
       <text>:2:10080: unexpected symbol
    

    with the following snippet of the text:

    2: c_1_E + Campaign_Search_Payroll_Generic_1_P + Campaign_Search_Performing_Core_Keywords + Campaign_Self_Employment_E + Campaign_Self_Employment_P + Campaign_Withholding + Campaign_Youtube + Sou
    

    Things I know for sure:

    1. No item is repeated.
    2. No symbols other than alphanumerics and underscore (_).
    3. No item starts with a number.

    I'm not well versed enough in R to make sense of the documentation for as.formula or the function call itself.

    Any ideas?

  • riders994
    riders994 almost 9 years
    I don't have much experience working with substrings, but is there a function to specify what the characters are in that area of the formula? I'm guessing the first line of the call is "Dependent_Variable ~", and the second line is the rest of the formula. The other main problem is there are only 16,000 characters in the text to begin with, so I have no idea how to find the symbol.
  • riders994
    riders994 almost 9 years
    I think that's going to be the best way to go about it. The 10080 error is why I came here first. I was wondering if there was some other solution. The interesting part is that this isn't even the longest vector I've done this with.
  • riders994
    riders994 almost 9 years
    Found it! I forgot to remove spaces in one of the sections.
  • Friedrich
    Friedrich over 2 years
    make.names() worked like a charm