Append an object to a list in R in amortized constant time, O(1)?
Solution 1
If it's a list of string, just use the c()
function :
R> LL <- list(a="tom", b="dick")
R> c(LL, c="harry")
$a
[1] "tom"
$b
[1] "dick"
$c
[1] "harry"
R> class(LL)
[1] "list"
R>
That works on vectors too, so do I get the bonus points?
Edit (2015-Feb-01): This post is coming up on its fifth birthday. Some kind readers keep repeating any shortcomings with it, so by all means also see some of the comments below. One suggestion for list
types:
newlist <- list(oldlist, list(someobj))
In general, R types can make it hard to have one and just one idiom for all types and uses.
Solution 2
The OP (in the April 2012 updated revision of the question) is interested in knowing if there's a way to add to a list in amortized constant time, such as can be done, for example, with a C++ vector<>
container. The best answer(s?) here so far only show the relative execution times for various solutions given a fixed-size problem, but do not address any of the various solutions' algorithmic efficiency directly. Comments below many of the answers discuss the algorithmic efficiency of some of the solutions, but in every case to date (as of April 2015) they come to the wrong conclusion.
Algorithmic efficiency captures the growth characteristics, either in time (execution time) or space (amount of memory consumed) as a problem size grows. Running a performance test for various solutions given a fixed-size problem does not address the various solutions' growth rate. The OP is interested in knowing if there is a way to append objects to an R list in "amortized constant time". What does that mean? To explain, first let me describe "constant time":
-
Constant or O(1) growth:
If the time required to perform a given task remains the same as the size of the problem doubles, then we say the algorithm exhibits constant time growth, or stated in "Big O" notation, exhibits O(1) time growth. When the OP says "amortized" constant time, he simply means "in the long run"... i.e., if performing a single operation occasionally takes much longer than normal (e.g. if a preallocated buffer is exhausted and occasionally requires resizing to a larger buffer size), as long as the long-term average performance is constant time, we'll still call it O(1).
For comparison, I will also describe "linear time" and "quadratic time":
-
Linear or O(n) growth:
If the time required to perform a given task doubles as the size of the problem doubles, then we say the algorithm exhibits linear time, or O(n) growth.
-
Quadratic or O(n2) growth:
If the time required to perform a given task increases by the square of the problem size, them we say the algorithm exhibits quadratic time, or O(n2) growth.
There are many other efficiency classes of algorithms; I defer to the Wikipedia article for further discussion.
I thank @CronAcronis for his answer, as I am new to R and it was nice to have a fully-constructed block of code for doing a performance analysis of the various solutions presented on this page. I am borrowing his code for my analysis, which I duplicate (wrapped in a function) below:
library(microbenchmark)
### Using environment as a container
lPtrAppend <- function(lstptr, lab, obj) {lstptr[[deparse(substitute(lab))]] <- obj}
### Store list inside new environment
envAppendList <- function(lstptr, obj) {lstptr$list[[length(lstptr$list)+1]] <- obj}
runBenchmark <- function(n) {
microbenchmark(times = 5,
env_with_list_ = {
listptr <- new.env(parent=globalenv())
listptr$list <- NULL
for(i in 1:n) {envAppendList(listptr, i)}
listptr$list
},
c_ = {
a <- list(0)
for(i in 1:n) {a = c(a, list(i))}
},
list_ = {
a <- list(0)
for(i in 1:n) {a <- list(a, list(i))}
},
by_index = {
a <- list(0)
for(i in 1:n) {a[length(a) + 1] <- i}
a
},
append_ = {
a <- list(0)
for(i in 1:n) {a <- append(a, i)}
a
},
env_as_container_ = {
listptr <- new.env(parent=globalenv())
for(i in 1:n) {lPtrAppend(listptr, i, i)}
listptr
}
)
}
The results posted by @CronAcronis definitely seem to suggest that the a <- list(a, list(i))
method is fastest, at least for a problem size of 10000, but the results for a single problem size do not address the growth of the solution. For that, we need to run a minimum of two profiling tests, with differing problem sizes:
> runBenchmark(2e+3)
Unit: microseconds
expr min lq mean median uq max neval
env_with_list_ 8712.146 9138.250 10185.533 10257.678 10761.33 12058.264 5
c_ 13407.657 13413.739 13620.976 13605.696 13790.05 13887.738 5
list_ 854.110 913.407 1064.463 914.167 1301.50 1339.132 5
by_index 11656.866 11705.140 12182.104 11997.446 12741.70 12809.363 5
append_ 15986.712 16817.635 17409.391 17458.502 17480.55 19303.560 5
env_as_container_ 19777.559 20401.702 20589.856 20606.961 20939.56 21223.502 5
> runBenchmark(2e+4)
Unit: milliseconds
expr min lq mean median uq max neval
env_with_list_ 534.955014 550.57150 550.329366 553.5288 553.955246 558.636313 5
c_ 1448.014870 1536.78905 1527.104276 1545.6449 1546.462877 1558.609706 5
list_ 8.746356 8.79615 9.162577 8.8315 9.601226 9.837655 5
by_index 953.989076 1038.47864 1037.859367 1064.3942 1065.291678 1067.143200 5
append_ 1634.151839 1682.94746 1681.948374 1689.7598 1696.198890 1706.683874 5
env_as_container_ 204.134468 205.35348 208.011525 206.4490 208.279580 215.841129 5
>
First of all, a word about the min/lq/mean/median/uq/max values: Since we are performing the exact same task for each of 5 runs, in an ideal world, we could expect that it would take exactly the same amount of time for each run. But the first run is normally biased toward longer times due to the fact that the code we are testing is not yet loaded into the CPU's cache. Following the first run, we would expect the times to be fairly consistent, but occasionally our code may be evicted from the cache due to timer tick interrupts or other hardware interrupts that are unrelated to the code we are testing. By testing the code snippets 5 times, we are allowing the code to be loaded into the cache during the first run and then giving each snippet 4 chances to run to completion without interference from outside events. For this reason, and because we are really running the exact same code under the exact same input conditions each time, we will consider only the 'min' times to be sufficient for the best comparison between the various code options.
Note that I chose to first run with a problem size of 2000 and then 20000, so my problem size increased by a factor of 10 from the first run to the second.
Performance of the list
solution: O(1) (constant time)
Let's first look at the growth of the list
solution, since we can tell right away that it's the fastest solution in both profiling runs: In the first run, it took 854 microseconds (0.854 milliseconds) to perform 2000 "append" tasks. In the second run, it took 8.746 milliseconds to perform 20000 "append" tasks. A naïve observer would say, "Ah, the list
solution exhibits O(n) growth, since as the problem size grew by a factor of ten, so did the time required to execute the test." The problem with that analysis is that what the OP wants is the growth rate of a single object insertion, not the growth rate of the overall problem. Knowing that, it's clear then that the list
solution provides exactly what the OP wants: a method of appending objects to a list in O(1) time.
Performance of the other solutions
None of the other solutions come even close to the speed of the list
solution, but it is informative to examine them anyway:
Most of the other solutions appear to be O(n) in performance. For example, the by_index
solution, a very popular solution based on the frequency with which I find it in other SO posts, took 11.6 milliseconds to append 2000 objects, and 953 milliseconds to append ten times that many objects. The overall problem's time grew by a factor of 100, so a naïve observer might say "Ah, the by_index
solution exhibits O(n2) growth, since as the problem size grew by a factor of ten, the time required to execute the test grew by a factor of 100." As before, this analysis is flawed, since the OP is interested in the growth of a single object insertion. If we divide the overall time growth by the problem's size growth, we find that the time growth of appending objects increased by a factor of only 10, not a factor of 100, which matches the growth of the problem size, so the by_index
solution is O(n). There are no solutions listed which exhibit O(n2) growth for appending a single object.
Solution 3
In the other answers, only the list
approach results in O(1) appends, but it results in a deeply nested list structure, and not a plain single list. I have used the below datastructures, they supports O(1) (amortized) appends, and allow the result to be converted back to a plain list.
expandingList <- function(capacity = 10) {
buffer <- vector('list', capacity)
length <- 0
methods <- list()
methods$double.size <- function() {
buffer <<- c(buffer, vector('list', capacity))
capacity <<- capacity * 2
}
methods$add <- function(val) {
if(length == capacity) {
methods$double.size()
}
length <<- length + 1
buffer[[length]] <<- val
}
methods$as.list <- function() {
b <- buffer[0:length]
return(b)
}
methods
}
and
linkedList <- function() {
head <- list(0)
length <- 0
methods <- list()
methods$add <- function(val) {
length <<- length + 1
head <<- list(head, val)
}
methods$as.list <- function() {
b <- vector('list', length)
h <- head
for(i in length:1) {
b[[i]] <- head[[2]]
head <- head[[1]]
}
return(b)
}
methods
}
Use them as follows:
> l <- expandingList()
> l$add("hello")
> l$add("world")
> l$add(101)
> l$as.list()
[[1]]
[1] "hello"
[[2]]
[1] "world"
[[3]]
[1] 101
These solutions could be expanded into full objects that support al list-related operations by themselves, but that will remain as an exercise for the reader.
Another variant for a named list:
namedExpandingList <- function(capacity = 10) {
buffer <- vector('list', capacity)
names <- character(capacity)
length <- 0
methods <- list()
methods$double.size <- function() {
buffer <<- c(buffer, vector('list', capacity))
names <<- c(names, character(capacity))
capacity <<- capacity * 2
}
methods$add <- function(name, val) {
if(length == capacity) {
methods$double.size()
}
length <<- length + 1
buffer[[length]] <<- val
names[length] <<- name
}
methods$as.list <- function() {
b <- buffer[0:length]
names(b) <- names[0:length]
return(b)
}
methods
}
Benchmarks
Performance comparison using @phonetagger's code (which is based on @Cron Arconis' code). I have also added a better_env_as_container
and changed the env_as_container_
a bit. The original env_as_container_
was broken and doesn't actually store all the numbers.
library(microbenchmark)
lPtrAppend <- function(lstptr, lab, obj) {lstptr[[deparse(lab)]] <- obj}
### Store list inside new environment
envAppendList <- function(lstptr, obj) {lstptr$list[[length(lstptr$list)+1]] <- obj}
env2list <- function(env, len) {
l <- vector('list', len)
for (i in 1:len) {
l[[i]] <- env[[as.character(i)]]
}
l
}
envl2list <- function(env, len) {
l <- vector('list', len)
for (i in 1:len) {
l[[i]] <- env[[paste(as.character(i), 'L', sep='')]]
}
l
}
runBenchmark <- function(n) {
microbenchmark(times = 5,
env_with_list_ = {
listptr <- new.env(parent=globalenv())
listptr$list <- NULL
for(i in 1:n) {envAppendList(listptr, i)}
listptr$list
},
c_ = {
a <- list(0)
for(i in 1:n) {a = c(a, list(i))}
},
list_ = {
a <- list(0)
for(i in 1:n) {a <- list(a, list(i))}
},
by_index = {
a <- list(0)
for(i in 1:n) {a[length(a) + 1] <- i}
a
},
append_ = {
a <- list(0)
for(i in 1:n) {a <- append(a, i)}
a
},
env_as_container_ = {
listptr <- new.env(hash=TRUE, parent=globalenv())
for(i in 1:n) {lPtrAppend(listptr, i, i)}
envl2list(listptr, n)
},
better_env_as_container = {
env <- new.env(hash=TRUE, parent=globalenv())
for(i in 1:n) env[[as.character(i)]] <- i
env2list(env, n)
},
linkedList = {
a <- linkedList()
for(i in 1:n) { a$add(i) }
a$as.list()
},
inlineLinkedList = {
a <- list()
for(i in 1:n) { a <- list(a, i) }
b <- vector('list', n)
head <- a
for(i in n:1) {
b[[i]] <- head[[2]]
head <- head[[1]]
}
},
expandingList = {
a <- expandingList()
for(i in 1:n) { a$add(i) }
a$as.list()
},
inlineExpandingList = {
l <- vector('list', 10)
cap <- 10
len <- 0
for(i in 1:n) {
if(len == cap) {
l <- c(l, vector('list', cap))
cap <- cap*2
}
len <- len + 1
l[[len]] <- i
}
l[1:len]
}
)
}
# We need to repeatedly add an element to a list. With normal list concatenation
# or element setting this would lead to a large number of memory copies and a
# quadratic runtime. To prevent that, this function implements a bare bones
# expanding array, in which list appends are (amortized) constant time.
expandingList <- function(capacity = 10) {
buffer <- vector('list', capacity)
length <- 0
methods <- list()
methods$double.size <- function() {
buffer <<- c(buffer, vector('list', capacity))
capacity <<- capacity * 2
}
methods$add <- function(val) {
if(length == capacity) {
methods$double.size()
}
length <<- length + 1
buffer[[length]] <<- val
}
methods$as.list <- function() {
b <- buffer[0:length]
return(b)
}
methods
}
linkedList <- function() {
head <- list(0)
length <- 0
methods <- list()
methods$add <- function(val) {
length <<- length + 1
head <<- list(head, val)
}
methods$as.list <- function() {
b <- vector('list', length)
h <- head
for(i in length:1) {
b[[i]] <- head[[2]]
head <- head[[1]]
}
return(b)
}
methods
}
# We need to repeatedly add an element to a list. With normal list concatenation
# or element setting this would lead to a large number of memory copies and a
# quadratic runtime. To prevent that, this function implements a bare bones
# expanding array, in which list appends are (amortized) constant time.
namedExpandingList <- function(capacity = 10) {
buffer <- vector('list', capacity)
names <- character(capacity)
length <- 0
methods <- list()
methods$double.size <- function() {
buffer <<- c(buffer, vector('list', capacity))
names <<- c(names, character(capacity))
capacity <<- capacity * 2
}
methods$add <- function(name, val) {
if(length == capacity) {
methods$double.size()
}
length <<- length + 1
buffer[[length]] <<- val
names[length] <<- name
}
methods$as.list <- function() {
b <- buffer[0:length]
names(b) <- names[0:length]
return(b)
}
methods
}
result:
> runBenchmark(1000)
Unit: microseconds
expr min lq mean median uq max neval
env_with_list_ 3128.291 3161.675 4466.726 3361.837 3362.885 9318.943 5
c_ 3308.130 3465.830 6687.985 8578.913 8627.802 9459.252 5
list_ 329.508 343.615 389.724 370.504 449.494 455.499 5
by_index 3076.679 3256.588 5480.571 3395.919 8209.738 9463.931 5
append_ 4292.321 4562.184 7911.882 10156.957 10202.773 10345.177 5
env_as_container_ 24471.511 24795.849 25541.103 25486.362 26440.591 26511.200 5
better_env_as_container 7671.338 7986.597 8118.163 8153.726 8335.659 8443.493 5
linkedList 1700.754 1755.439 1829.442 1804.746 1898.752 1987.518 5
inlineLinkedList 1109.764 1115.352 1163.751 1115.631 1206.843 1271.166 5
expandingList 1422.440 1439.970 1486.288 1519.728 1524.268 1525.036 5
inlineExpandingList 942.916 973.366 1002.461 1012.197 1017.784 1066.044 5
> runBenchmark(10000)
Unit: milliseconds
expr min lq mean median uq max neval
env_with_list_ 357.760419 360.277117 433.810432 411.144799 479.090688 560.779139 5
c_ 685.477809 734.055635 761.689936 745.957553 778.330873 864.627811 5
list_ 3.257356 3.454166 3.505653 3.524216 3.551454 3.741071 5
by_index 445.977967 454.321797 515.453906 483.313516 560.374763 633.281485 5
append_ 610.777866 629.547539 681.145751 640.936898 760.570326 763.896124 5
env_as_container_ 281.025606 290.028380 303.885130 308.594676 314.972570 324.804419 5
better_env_as_container 83.944855 86.927458 90.098644 91.335853 92.459026 95.826030 5
linkedList 19.612576 24.032285 24.229808 25.461429 25.819151 26.223597 5
inlineLinkedList 11.126970 11.768524 12.216284 12.063529 12.392199 13.730200 5
expandingList 14.735483 15.854536 15.764204 16.073485 16.075789 16.081726 5
inlineExpandingList 10.618393 11.179351 13.275107 12.391780 14.747914 17.438096 5
> runBenchmark(20000)
Unit: milliseconds
expr min lq mean median uq max neval
env_with_list_ 1723.899913 1915.003237 1921.23955 1938.734718 1951.649113 2076.910767 5
c_ 2759.769353 2768.992334 2810.40023 2820.129738 2832.350269 2870.759474 5
list_ 6.112919 6.399964 6.63974 6.453252 6.910916 7.321647 5
by_index 2163.585192 2194.892470 2292.61011 2209.889015 2436.620081 2458.063801 5
append_ 2832.504964 2872.559609 2983.17666 2992.634568 3004.625953 3213.558197 5
env_as_container_ 573.386166 588.448990 602.48829 597.645221 610.048314 642.912752 5
better_env_as_container 154.180531 175.254307 180.26689 177.027204 188.642219 206.230191 5
linkedList 38.401105 47.514506 46.61419 47.525192 48.677209 50.952958 5
inlineLinkedList 25.172429 26.326681 32.33312 34.403442 34.469930 41.293126 5
expandingList 30.776072 30.970438 34.45491 31.752790 38.062728 40.712542 5
inlineExpandingList 21.309278 22.709159 24.64656 24.290694 25.764816 29.158849 5
I have added linkedList
and expandingList
and an inlined version of both. The inlinedLinkedList
is basically a copy of list_
, but it also converts the nested structure back into a plain list. Beyond that the difference between the inlined and non-inlined versions is due to the overhead of the function calls.
All variants of expandingList
and linkedList
show O(1) append performance, with the benchmark time scaling linearly with the number of items appended. linkedList
is slower than expandingList
, and the function call overhead is also visible. So if you really need all the speed you can get (and want to stick to R code), use an inlined version of expandingList
.
I've also had a look at the C implementation of R, and both approaches should be O(1) append for any size up until you run out of memory.
I have also changed env_as_container_
, the original version would store every item under index "i", overwriting the previously appended item. The better_env_as_container
I have added is very similar to env_as_container_
but without the deparse
stuff. Both exhibit O(1) performance, but they have an overhead that is quite a bit larger than the linked/expanding lists.
Memory overhead
In the C R implementation there is an overhead of 4 words and 2 ints per allocated object. The linkedList
approach allocates one list of length two per append, for a total of (4*8+4+4+2*8=) 56 bytes per appended item on 64-bit computers (excluding memory allocation overhead, so probably closer to 64 bytes). The expandingList
approach uses one word per appended item, plus a copy when doubling the vector length, so a total memory usage of up to 16 bytes per item. Since the memory is all in one or two objects the per-object overhead is insignificant. I haven't looked deeply into the env
memory usage, but I think it will be closer to linkedList
.
Solution 4
In the Lisp we did it this way:
> l <- c(1)
> l <- c(2, l)
> l <- c(3, l)
> l <- rev(l)
> l
[1] 1 2 3
though it was 'cons', not just 'c'. If you need to start with an empy list, use l <- NULL.
Solution 5
You want something like this maybe?
> push <- function(l, x) {
lst <- get(l, parent.frame())
lst[length(lst)+1] <- x
assign(l, lst, envir=parent.frame())
}
> a <- list(1,2)
> push('a', 6)
> a
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 6
It's not a very polite function (assigning to parent.frame()
is kind of rude) but IIUYC it's what you're asking for.
Related videos on Youtube
Reinhard
Updated on November 27, 2020Comments
-
Reinhard over 3 years
If I have some R list
mylist
, you can append an itemobj
to it like so:mylist[[length(mylist)+1]] <- obj
But surely there is some more compact way. When I was new at R, I tried writing
lappend()
like so:lappend <- function(lst, obj) { lst[[length(lst)+1]] <- obj return(lst) }
but of course that doesn't work due to R's call-by-name semantics (
lst
is effectively copied upon call, so changes tolst
are not visible outside the scope oflappend()
. I know you can do environment hacking in an R function to reach outside the scope of your function and mutate the calling environment, but that seems like a large hammer to write a simple append function.Can anyone suggest a more beautiful way of doing this? Bonus points if it works for both vectors and lists.
-
Dan about 14 yearsR has the immutable data characteristics that are often found in functional languages, hate to say this, but I think you just have to deal with it. It has its pros and its cons
-
Ken Williams about 14 yearsWhen you say "call-by-name" you really mean "call-by-value", right?
-
Reinhard about 14 yearsNo, it's definitely not call-by-value, otherwise this wouldn't be a problem. R actually uses call-by-need (en.wikipedia.org/wiki/Evaluation_strategy#Call_by_need).
-
Fernando over 11 yearsA good idea is to pre-allocate your vector/list: N = 100 mylist = vector('list', N) for (i in 1:N) { #mylist[[i]] = ... } Avoid 'growing' objects in R.
-
KH Kim about 10 yearsI accidentally found the answer here, stackoverflow.com/questions/17046336/… So hard to implement so easy algorithm!
-
smci about 8 years@Fernando: ok but that's not generally applicable, it only works in scenarios where you can know the exact final length in advance (or maybe squeak by in you can use an upper bound to it).
-
xm1 over 6 yearsthe
lappend
approach has the advantage of populating an empty list (generally inside a loop).l1 = list(l1, l2)
, starting with an empty list, puts an empty element as the first element. -
lehiester over 5 yearsShort answer: R has no built-in data structure that supports the objective stated in the title. See JanKanis's answer for a custom implementation.
-
-
Reinhard about 14 yearsThis isn't really appending. What if I have 100 objects and I want to append them to a list programmatically? R has an
append()
function, but it's really a concatenate function and it only works on vectors. -
hadley about 14 years
append()
works on vectors and lists, and it is a true append (which is basically the same as concatenate, so I don't see what your problem is) -
Reinhard about 14 yearsThis doesn't append... it concatenates.
LL
would still have two elements afterC(LL, c="harry")
is called. -
Reinhard about 14 yearsAn append function should mutate an existing object, not create a new one. A true append would not have O(N^2) behavior.
-
Reinhard about 14 yearsThis is basically the behavior that I want, however it still calls append internally, resulting in O(n^2) performance.
-
Dirk Eddelbuettel about 14 yearsJust reassign to LL:
LL <- c(LL, c="harry")
. -
Reinhard over 13 yearsYou are absolutely right. There was a bug in the code and I've fixed it. I've tested the
lappend()
that I've provided and it seems to perform about as well as c() and append(), all of which exhibit O(n^2) behavior. -
Reinhard over 13 yearsI guess this is the best you can do in R.
c()
andappend()
(and thelappend()
I've provided) all exhibit O(n^2) behavior. I was hoping for O(n lg n), which is what you would get with a dynamically growing sequence data structure in most languages, but it doesn't seem like R has that, at least built in. -
Alexandre Rademaker over 13 yearsThis only works with strings. If a, b and c are integer vectors, the behavior is completely different.
-
eold over 12 yearsThis appears to have quadratic time complexity again. The problem is obviously that list/vector resize is not implemented in the way it is usually implemented in most languages.
-
DavidM over 12 yearsYes-- looks like the appending at the end is very slow-- probably b/c lists are recursive, and R is best at vector operations rather than loop type operations. Its much better to do:
-
DavidM over 12 yearssystem.time(for(i in c(1:10000) mylist[i]=i) (a few seconds), or better yet do it all in one operation: system.time(mylist=list(1:100000)) (less than a second), then modifying that preallocated list with the for loop will also be faster.
-
joran over 12 yearsI don't think this is the kind of appending the OP was looking for.
-
Ankit Roy over 12 yearsGood find @AlexandreRademaker. The badness of
c(list(a=3, b=c(4, 5)), c=c(6, 7))
is staggering. -
Dirk Eddelbuettel over 12 yearsTry
identical(c(list(a=3, b=c(4, 5), c=c(6, 7))), list(a=3, b=c(4, 5), c=c(6, 7)))
and see that they are the same -- thec(...)
around thelist()
is a null-op. So what exactly is bad here? -
Ankit Roy over 12 years@Dirk: You have the parens nested differently than I do. My call to
c()
has 2 arguments: the list I'm trying to append to, namelylist(a=3, b=c(4, 5))
, and the item I'm trying to append, namelyc=c(6, 7)
. If you use my approach, you'll see that 2 list items are appended (6
and7
, with namesc1
andc2
) instead of a single 2-element vector namedc
as is clearly intended! -
Jon Claus almost 11 yearsI don't believe this works if you want to append an item whose value is
NULL
, although that is a rather unlikely scenario. -
metakermit over 10 yearsExcellent! All the other solutions return some weird list of lists.
-
KH Kim about 10 yearsif the intention is to append $c=c(6,7), we can do c(list(a=3, b=c(4,5)), list(c=c(6,7))
-
eold almost 10 yearsA rather simple experiment shows that the provided code snippets actually has linear time overhead (w.r.t vector size):
v <- c(); for (i in 1:n) v <- c(v, i)
As I continue to double n, the time quadruples. Hence, this is not what the OP asked for (amortized constant-time append). -
Matthew over 9 yearsSo is the conclusion
mylist <- list(mylist, list(obj))
? If yes it would be nice to modify the answer -
rimorob over 9 yearsFor real bonus points, how do we do c(LL, x=y) so that a key is created for the value of x and its value is the value of y? I'm sure eval is used somehow, but I can't figure it out!
-
Ian Gow over 9 yearsI agree with Matthew that it would be nice to modify (I'd say "correct") the answer. The help for
c()
itself points out the problems whenobj
is a vector. My suggested correction/clarification along these lines was rejected as "makes no sense as an edit." -
Palec about 9 yearsIn Lisp, prepending to a list is an O(1) operation, while appending runs in O(n), @flies. The need for reversion is outweighed by performance gain. This is not the case in R. Not even in pairlist, which generally resembles List lists the most.
-
flies almost 9 years@Palec "This is not the case in R" - I'm not sure which "this" you're referring to. Are you saying that appending is not O(1), or it's not O(n)?
-
Palec almost 9 yearsI’m saying that if you were coding in Lisp, your approach would be inefficient, @flies. That remark was meant to explain why the answer is written as it is. In R, the two approaches are on par performance-wise, AFAIK. But now I’m not sure about the amortized complexity. Haven’t touched R since about the time my previous comment was written.
-
zach over 8 yearsarrrg! nice answer. I just been struggling with combining lists of lists. I love R but the syntax can be frustrating.
-
phonetagger over 8 yearsTo the reader: Please read JanKanis's answer, which provides a very practical extension to my findings above, and dives a bit into the overhead of various solutions given the internal workings of the C implementation of R.
-
Picarus over 8 yearsNot sure the list option implements what it is required: >length(c(c(c(list(1)),list(2)),list(3))) [1] 3 > length(list(list(list(list(1)),list(2)),list(3))) [1] 2. Looks more like nested lists.
-
phonetagger over 8 years@Picarus - I think you're right. I'm not working with R anymore, but thankfully JanKanis posted an answer with a much more useful O(1) solution and notes the issue you identified. I'm sure JanKanis will appreciate your upvote.
-
Picarus over 8 years@phonetagger, you should edit your answer. Not everybody will read all the answers.
-
Picarus over 8 yearswhat is the point of keeping the list option if it does not solve the problem we are trying to solve?
-
JanKanis over 8 yearsIn R, this approach will be O(n). The
c()
function copies its arguments into a new vector/list and returns that. -
JanKanis over 8 years@Picarus I'm not sure what you mean. Why I kept it in the benchmark? As comparison with the other options. The
list_
option is faster and could be useful if you don't need to convert to a normal list, i.e. if you use the result as a stack. -
Carlos Cinelli about 8 years"not a single answer has addressed the actual question" --> The problem is that the original question was not about algorithm complexity, take a look at the editions of the question. The OP asked first how to append an element in a list, than, several months later, he changed the question.
-
phonetagger about 8 years@CarlosCinelli - You are correct, thanks for pointing that out. I have revised the opening of my answer accordingly.
-
JanKanis about 8 years@Gabor Csardi posted a faster way to convert environments back to lists in a different question at stackoverflow.com/a/29482211/264177. I benchmarked that as well on my system. It is about twice as fast better_env_as_container but still slower than linkedList and expandingList.
-
xappppp about 8 yearsWhat I want to add is that it does give a two level nested list, but that's it. The way how list and unlist working is not very clear to me, but this is the result by testing the code
-
Sergio over 7 yearsThis is not appending elements in a list. Here you are increasing the elements of the integer vector, which is the only element of the list. The list has only one element, an integer vector.
-
Nick Dong about 7 years
newlist <- list(oldlist, list(someobj))
seems that the new sublist will recursively to join the oldlist. andc(LL, c="harry")
only works for strings a number list.lst[[length(lst)+1]] <- obj
works correctly for me. -
Nick Dong about 7 years
Ent = list(con = list(1:3)); con = 2:4; Ent$con = c(Ent$con, list(con)); con = 3:5; Ent$con = c(Ent$con, list(con))
this works for me either. list--sublist--vector -
WestCoastProjects almost 6 yearsThis is great info: would never ever have guessed that the
list = list
were not only the winner - but by 1 to 2 orders or magnitude! -
JD Long almost 6 yearsdoesnt look like R to me... Python?
-
5th over 5 yearsI made an edit and tried it out: It is freaking slow. Better use the
c()
orlist
-method. Both are way faster. -
nbenn about 5 yearsLooking a the code for
rlist::list.append()
, it's essentially a wrapper aroundbase::c()
. -
Ana Nimbus over 4 yearsDeeply nested (n=99999) lists seem manageable and tolerable for certain applications: Anyone want to benchmark nestoR? (I'm still a bit of a noob at the
environment
stuff I used for nestoR.) My bottleneck is almost always human time spent coding and doing data analysis, but I appreciate the benchmarks I've found on this post. As for memory overhead, I would not mind up to about a kB per node for my applications. I hold on to large arrays, etc. -
Luke over 3 yearsThe method
newlist <- list(oldlist, list(someobj))
doesn't work. It doesn't append, it creates a new list with two elements where the first element isoldlist
. The function append() results in the desired behavior:newlist <- append(oldlist, list(someobj))
-
David Bellot over 3 yearsThis comparison is not valid:
list_
does not create a list of integers as expected. It will contain list of lists. At each iteration, a new list is created with 2 elements, one is the new integer and the other one is the previous version of the same list. Because values are not overwritten, a simple copy by reference is done internally. It's why it's so fast, but we don't have the same object at all. For all the other examples, we have a list of length n+1 -
David Bellot over 3 yearsSee my comment above. The
list_
does not provide the same result as the others. So this comparison is not valid! -
Cron Merdek over 3 years@DavidBellot it is correct, it is meant there for benchmark level. You can flatten it at then end :)