Combining (cbind) vectors of different length


Solution 1

You can use indexing: asking for a position beyond the length of a vector returns NA. This works for any number of rows, set here with foo:

nm <- list(1:8,3:8,1:5)

foo <- 8

sapply(nm, '[', 1:foo)
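To see why this works, here is a minimal sketch of out-of-range indexing on a single vector. The names x and padded are only for illustration:

```r
x <- 1:5
# Positions 6 to 8 do not exist in x, so R returns NA for them
padded <- x[1:8]
padded
# [1]  1  2  3  4  5 NA NA NA
```

sapply(nm, '[', 1:foo) applies the same indexing to every element of the list and binds the results as columns.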

EDIT:

Or in one line using the largest vector as number of rows:

sapply(nm, '[', seq(max(sapply(nm,length))))

From R 3.2.0 you may use lengths ("get the length of each element of a list") instead of sapply(nm, length):

sapply(nm, '[', seq(max(lengths(nm))))
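As a quick check, assuming R 3.2.0 or later, lengths gives the same result as the sapply call it replaces. The name lens is illustrative:

```r
nm <- list(1:8, 3:8, 1:5)
lens <- lengths(nm)
lens
# [1] 8 6 5
identical(lens, sapply(nm, length))
# [1] TRUE
```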

Solution 2

Pad the shorter vectors with NA before calling do.call:

nm <- list(1:8,3:8,1:5)

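max_length <- max(unlist(lapply(nm, length)))
# Start from an all-NA vector of the target length, then copy x into the front
nm_filled <- lapply(nm, function(x) {
  ans <- rep(NA, length.out = max_length)
  ans[seq_along(x)] <- x  # seq_along(x) is also safe when x is empty
  ans
})
do.call(cbind, nm_filled)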

Solution 3

This is a shorter version of Wojciech's solution (Solution 2 above): each vector is padded with NA up to the length of the longest one.

nm <- list(1:8,3:8,1:5)
max_length <- max(sapply(nm,length))
sapply(nm, function(x){
    c(x, rep(NA, max_length - length(x)))
})
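The same padding can also be written with the length replacement function. This is a sketch of an alternative idiom, not part of the original answer: assigning a longer length to a vector pads it with NA, so `length<-` can be passed to sapply directly. The name padded is illustrative.

```r
nm <- list(1:8, 3:8, 1:5)
max_length <- max(sapply(nm, length))
# `length<-`(x, n) returns x extended to length n, with NA in the new positions
padded <- sapply(nm, `length<-`, max_length)
padded
```

The result is the same 8 x 3 matrix that the indexing approach in Solution 1 produces.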

Solution 4

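Here is an option using stri_list2matrix from the stringi package. It returns a character matrix, so the class is set to numeric afterwards:

library(stringi)
nm <- list(1:8, 3:8, 1:5)
out <- stri_list2matrix(nm)  # character matrix, short columns padded with NA
class(out) <- 'numeric'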
out
#      [,1] [,2] [,3]
#[1,]    1    3    1
#[2,]    2    4    2
#[3,]    3    5    3
#[4,]    4    6    4
#[5,]    5    7    5
#[6,]    6    8   NA
#[7,]    7   NA   NA
#[8,]    8   NA   NA

Solution 5

Late to the party, but you could use cbind.fill from the rowr package with fill = NA:

library(rowr)
do.call(cbind.fill, c(nm, fill = NA))

#  object object object
#1      1      3      1
#2      2      4      2
#3      3      5      3
#4      4      6      4
#5      5      7      5
#6      6      8     NA
#7      7     NA     NA
#8      8     NA     NA

If you have a named list instead and want to keep the names as column headers, you could use setNames:

nm <- list(a = 1:8, b = 3:8, c = 1:5)
setNames(do.call(cbind.fill, c(nm, fill = NA)), names(nm))

#  a  b  c
#1 1  3  1
#2 2  4  2
#3 3  5  3
#4 4  6  4
#5 5  7  5
#6 6  8 NA
#7 7 NA NA
#8 8 NA NA
Author: Nick

Updated on July 19, 2022

Comments

  • Nick
    Nick almost 2 years

    I have several vectors of unequal length and I would like to cbind them. I've put the vectors into a list and tried to combine them using do.call(cbind, ...):

    nm <- list(1:8, 3:8, 1:5)
    do.call(cbind, nm)
    
    #      [,1] [,2] [,3]
    # [1,]    1    3    1
    # [2,]    2    4    2
    # [3,]    3    5    3
    # [4,]    4    6    4
    # [5,]    5    7    5
    # [6,]    6    8    1
    # [7,]    7    3    2
    # [8,]    8    4    3
    # Warning message:
    #   In (function (..., deparse.level = 1)  :
    #         number of rows of result is not a multiple of vector length (arg 2)
    

    As expected, the number of rows in the resulting matrix is the length of the longest vector, and the values of the shorter vectors are recycled to make up for the length.

    Instead I'd like to pad the shorter vectors with NA values to obtain the same length as the longest vector. I'd like the matrix to look like this:

    #      [,1] [,2] [,3]
    # [1,]    1    3    1
    # [2,]    2    4    2
    # [3,]    3    5    3
    # [4,]    4    6    4
    # [5,]    5    7    5
    # [6,]    6    8   NA
    # [7,]    7   NA   NA
    # [8,]    8   NA   NA
    

    How can I go about doing this?

  • hadley
    hadley about 13 years
    You are always better off using vapply rather than sapply because that will guarantee you get the output type that you expect.
  • Sacha Epskamp
    Sacha Epskamp over 12 years
    '[' is the name of the operator [ which you use in indexing (foo[1:10]). See also ?'['
  • bshor
    bshor almost 12 years
    The one line solution fails if the first column is shorter than the other two.
  • guerda
    guerda about 9 years
    @hadley Could you elaborate on your comment? I don't understand the difference between vapply and sapply transferred to this problem.
  • hadley
    hadley about 9 years
    sapply is dangerous to program with because it is not type stable - depending on the length of nm you'll get different types
  • SeGa
    SeGa about 5 years
    The only answer that keeps column names is from @Ronak Shah using the rowr package. Is there an alternative with base R that keeps column names?
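
On the last few comments, here is a minimal base R sketch, assuming a named list. vapply declares the type and length of each column up front, and it carries the list names over as column names, as sapply does. The names n and out are illustrative, and FUN.VALUE = integer(n) assumes integer vectors (use numeric(n) for doubles).

```r
nm <- list(a = 1:8, b = 3:8, c = 1:5)
n <- max(lengths(nm))
# Each result must be an integer vector of length n, otherwise vapply stops with an error
out <- vapply(nm, function(x) x[seq_len(n)], FUN.VALUE = integer(n))
colnames(out)
# [1] "a" "b" "c"
```

Because the expected column type is stated explicitly, a list with an unexpected element type raises an error instead of silently changing the type of the result.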