How to import excel data into R using column name and row name

34,866

Solution 1

@Andrie mentionned the XLConnect package, it's a very useful package for I/O between R and Excel with the possibility to select region in Excel sheet.

I created an Excel file like yours in my Dropbox public folder, you can download the example.xls file here.

require(XLConnect)

## A5:C5 correspond to (D,Y) (D,Z) (E,Y) (E,Z)  in your example
selectworksheet1 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                               sheet = "Worksheet1", 
                               region = "A5:C5", header = FALSE)

selectworksheet1
##  Col0 Col1 Col2
## 1    2    2    2


## B4:C5 correspond to (D,Y1) (D,Z1) (E,Y1) (E,Z1) in the second example
selectworksheet2 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                         sheet = "Worksheet2", 
                         region = "B4:C5", header = FALSE)

selectworksheet2
##   Col0 Col1
## 1    1    1
## 2    1    1

unlist(selectworksheet2)
## Col01 Col02 Col11 Col12 
##    1     1     1     1 

Solution 2

There are several packages which provide functions to import Excel data to R; see the R data import/export documentation.

I've found the xlsx package to be useful (it will read both .xls and .xlsx files). I don't believe that it will accept row/column names as input, but it will accept their numerical value (row 1, column 4 for example). In your case, something like this should work, assuming that X, Y and Z correspond to columns 1-3:

library(xlsx)
# first example subset; call it ss1
# assume first row is not a header; otherwise requires header = T
ss1 <- read.xlsx("myfile.xlsx", sheetIndex = 1, rowIndex = 4:5, colIndex = 2:3)

# second example subset; call it ss2
# just the same except worksheet index = 2
ss2 <- read.xlsx("myfile.xlsx", sheetIndex = 2, rowIndex = 4:5, colIndex = 2:3)

However, you will need to experiment with your own file until things work as expected. You can also specify sheetName but I find sheetIndex normally works more reliably, once you figure out the correct index for each sheet. And take care if the first row is a header.

Having said all that: my preferred option would be to export the sheet to a text format such as CSV, use shell tools (cut, head, tail etc.) to get the required rows/columns and import that to R.

Share:
34,866
barryq
Author by

barryq

Updated on July 09, 2022

Comments

  • barryq
    barryq almost 2 years

    I am a R novice and was wondering how to import excel data into R using row names and column names. Specifically i require a subset of the data in a number of worksheet within one excel file. Can i use row names and column names to identify and extract certain cells of data to R ?

    Worksheet 1
    ----------
    * X Y Z 
    A 1 2 2
    B 1 1 1
    C 1 3 4
    D 4 2 2
    E 2 2 2 
    ----------
    Worksheet 2
    ----------
    *  X Y1 Z1 
    A 1  2  2
    B 1  2  3
    C 1  3  4
    D 4  1  1
    E 2  1  1 
    

    For example in the above spreadsheet how could i extract the data (2,2,2,2) using the row and column names (D,Y) (D,Z) (E,Y) (E,Z) in worksheet 1

    how could i extract the data (1,1,1,1) using the row and column names (D,Y1) (D,Z1) (E,Y1) (E,Z1) in worksheet 2 ?

    Thanks for any help provided

    Barry