How can I change the encoding of a text file that is delimited by pipe and quotes so I can read it into R?

You can import a pipe-delimited .txt file this way:

file_in <- read.table("C:/example.txt", sep = "|")

The same approach works for any character-separated text file; just change the sep argument to match the delimiter.
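For data that is also wrapped in double quotes, a slightly fuller call may help. The sketch below is only an illustration: the temporary file and its three columns are made-up sample data, and quote, header and stringsAsFactors are standard read.table arguments chosen here as assumptions about what the file needs.

```r
# Write a small pipe-delimited, double-quoted sample to a temporary file
# (hypothetical data, used only to keep the example self-contained).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  '"ID"|"Price"|"Comment"',
  '"29"|"5.000000"|"test"',
  '"31"|"5.000000"|""'
), tmp)

# sep splits on pipes, quote strips the surrounding double quotes,
# and header = TRUE takes the column names from the first line.
file_in <- read.table(
  tmp,
  sep = "|",
  quote = "\"",
  header = TRUE,
  stringsAsFactors = FALSE
)

str(file_in)
```

Setting quote to the double quote only (instead of the default, which also includes the single quote) avoids surprises if a free-text field such as a comment contains an apostrophe.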

Author: user3302483

Updated on June 04, 2022

Comments

  • user3302483 almost 2 years

    I want to read data from a text file into an R dataframe. The data is delimited by pipes | and also has quotes around the values. I've tried some combinations of read.table but it's importing everything into a single field as opposed to splitting it. The data looks like this:

    "CompetitorDataID"|"CompetitorID"|"ItemID"|"UserID"|"CountryID"|"SegmentID"|"TaskID"|"Price"|"Comment"|"CreateDate"|"GeneralCustomer"|"TenderResult"
    "29"|"5"|"187630"|"1375"|"5"|"398"|"4085"|"5.000000"|"test"|"2013-01-1002:58:23.230000000"|"False"|"1"
    "30"|"5"|"1341"|"1294"|"5"|"398"|"4088"|"6.000000"|"test"|"2013-01-1003:15:26.687000000"|"False"|"1"
    "31"|"5"|"1007"|"1375"|"5"|"398"|"4105"|"5.000000"|""|"2013-01-1005:50:51.150000000"|"False"|"1"
    

    Although this data imports correctly when pasted into R, it won't import from the original text file. I get the following warnings:

    Warning messages:
    1: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
      line 1 appears to contain embedded nulls
    2: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
      line 2 appears to contain embedded nulls
    3: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
      line 3 appears to contain embedded nulls
    4: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
      line 4 appears to contain embedded nulls
    5: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
      line 5 appears to contain embedded nulls
    6: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
      line 1 appears to contain embedded nulls
    7: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
      embedded nul(s) found in input
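"Embedded nulls" warnings commonly appear when a UTF-16 file is read as if it were a single-byte encoding, which would also explain why the pasted data imports fine while the file does not. The sketch below assumes the file is UTF-16LE; the post does not confirm this, so treat the encoding name as a guess to verify (for example by inspecting the first bytes). The sample file and its two columns are hypothetical.

```r
# Build a UTF-16LE sample file to reproduce the symptom
# (hypothetical data; the real file's encoding is an assumption).
tmp <- tempfile(fileext = ".txt")
txt <- paste0(paste(c('"ItemID"|"Price"', '"187630"|"5.000000"'),
                    collapse = "\n"), "\n")
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], tmp)

# Inspect the first bytes: a 00 after each character suggests UTF-16.
first_bytes <- readBin(tmp, what = "raw", n = 4)
print(first_bytes)

# fileEncoding tells read.table which encoding to convert from while reading.
dat <- read.table(
  tmp,
  header = TRUE,
  sep = "|",
  quote = "\"",
  fileEncoding = "UTF-16LE",
  stringsAsFactors = FALSE
)

str(dat)
```

If the real file turns out to use a different encoding, the same call should apply with a different fileEncoding value; iconvlist() shows the names available on a given system.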