Importing data with special characters in R
Your problem is an encoding issue. There are two aspects to this. First, what Notepad++ saves may not be in the encoding you expect. Second, R may be reading the file with read.csv() under a different encoding. That is especially likely here: using Notepad++ suggests you are on Windows, where you may be unable to use UTF-8 as the system locale for R.
So taking each issue in turn:
-
Getting Notepad++ to save your file in a specific encoding. You can set the encoding for the file from the Encoding menu in Notepad++, using these instructions. I always use UTF-8, but since your texts are Danish, Latin-1 should work too.
To verify the encoding of your texts, you may wish to use the file utility supplied with RTools. This will tell you something about the probable encoding of your file from the command line, although it is not perfect. (OS X and Linux users already have this without needing to install additional utilities.)
-
Setting encoding when importing the .csv file into R. When you import the file using read.csv(), specify encoding = "UTF-8" or encoding = "latin1" (R spells the value "latin1", not "Latin-1"). Note that encoding only marks the strings that are read as being in that encoding; to have R convert the file while reading it, use the fileEncoding argument instead. You might also want to check what your system encoding is, and match that. You can do this with Sys.getlocale() (and set it with Sys.setlocale()). On my system for instance:
> Sys.getlocale()
[1] "en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8"
You could of course set this to Windows-1252, but you might then have trouble with portability if you use the file on other platforms. UTF-8 is the best solution to this.
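As a rough illustration of the import step, here is a minimal sketch that builds its own small Latin-1 file, reads it back with read.csv(), and converts the result to UTF-8. The temporary file, the column name `name`, and the variable names are assumptions made up for the example; they are not taken from the question's data.

```r
# Write a tiny Latin-1 encoded CSV so the example is self-contained.
# The bytes 0xE6 0xF8 0xE5 are the Danish letters ae-ligature, o-slash, a-ring in Latin-1.
path <- tempfile(fileext = ".csv")  # hypothetical stand-in for data.csv
writeBin(c(charToRaw("name\n"),
           as.raw(c(0xE6, 0xF8, 0xE5)),
           charToRaw("\n")),
         path)

# Tell R that the strings it reads are Latin-1. This marks them; it does not convert them.
dat <- read.csv(path, encoding = "latin1", stringsAsFactors = FALSE)

# Inspect the declared encoding of the column, then convert to UTF-8 for portability.
print(Encoding(dat$name))
name_utf8 <- enc2utf8(dat$name)
print(name_utf8)
```

If the file was saved as UTF-8 instead, the same call with encoding = "UTF-8" would be the one to try. Passing fileEncoding = "latin1" is the alternative when you want R to re-encode the file into your session's native encoding as it is read, which works best when that native encoding can represent the characters (for example a UTF-8 locale).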
Mpizos Dimitris
Updated on June 05, 2022
Comments
-
Mpizos Dimitris almost 2 years
The following picture shows how the data looks before I import it into R (in Notepad) and after importing.
I use the following command to import it in R:
Data <- read.csv('data.csv', stringsAsFactors = FALSE, header = TRUE, quote = "")
It can be seen that special characters such as ae are replaced with something like A| (line 19 on the left, line 18 on the right). Is there a way to import the CSV file as it is, using R?