Access lapply index names inside FUN


Solution 1

Unfortunately, lapply only gives you the elements of the vector you pass it. The usual work-around is to pass it the names or indices of the vector instead of the vector itself.

But note that you can always pass in extra arguments to the function, so the following works:

x <- list(a=11,b=12,c=13) # Changed to list to address concerns in comments
lapply(seq_along(x), function(y, n, i) { paste(n[[i]], y[[i]]) }, y=x, n=names(x))

Here I use lapply over the indices of x, but also pass in x and the names of x. As you can see, the order of the function arguments can be anything - lapply will pass in the "element" (here the index) to the first argument not specified among the extra ones. In this case, I specify y and n, so there's only i left...

Which produces the following:

[[1]]
[1] "a 11"

[[2]]
[1] "b 12"

[[3]]
[1] "c 13"
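
The argument-matching rule described above can be checked directly. The following is only a minimal sketch (the names f and res are illustrative, not part of the original answer): it declares the same arguments in a different order, and it restores the element names with setNames(), since lapply over seq_along(x) returns an unnamed list.

```r
x <- list(a = 11, b = 12, c = 13)

# Same function as above, with the arguments declared in a different order.
# y and n are matched by name, so the index still lands in i.
f <- function(i, n, y) paste(n[[i]], y[[i]])

res <- lapply(seq_along(x), f, y = x, n = names(x))

# seq_along(x) carries no names, so the result is unnamed;
# setNames() puts the original names back.
res <- setNames(res, names(x))
res$b
```

With the names restored, the result prints with $a, $b, $c instead of [[1]], [[2]], [[3]].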

UPDATE Simpler example, same result:

lapply(seq_along(x), function(i) paste(names(x)[[i]], x[[i]]))

Here the function uses "global" variable x and extracts the names in each call.
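
A variation suggested in the comments iterates over the names rather than the indices. The sketch below assumes the names of x are unique and non-empty (otherwise x[[nm]] would not pick the intended element); sapply() with simplify = FALSE and USE.NAMES = TRUE then returns a named list.

```r
x <- list(a = 11, b = 12, c = 13)

# Iterate over the names; x is still looked up from the enclosing environment.
res <- sapply(names(x), function(nm) paste(nm, x[[nm]]),
              simplify = FALSE, USE.NAMES = TRUE)
res
```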

Solution 2

This basically uses the same workaround as Tommy, but with Map(), there's no need to access global variables which store the names of list components.

> x <- list(a=11, b=12, c=13)
> Map(function(x, i) paste(i, x), x, names(x))
$a
[1] "a 11"

$b
[1] "b 12"

$c
[1] "c 13"

Or, if you prefer mapply():

> mapply(function(x, i) paste(i, x), x, names(x))
     a      b      c 
"a 11" "b 12" "c 13"
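
As one of the comments points out, mapply() simplifies its result by default (SIMPLIFY = TRUE), which can be surprising when the function returns more than one value per element. The following is a minimal sketch with a hypothetical function f that returns two values; it shows the simplified matrix next to the plain list obtained with SIMPLIFY = FALSE, which is what Map() returns.

```r
x <- list(a = 11, b = 12, c = 13)

# Hypothetical function returning two values per element.
f <- function(value, name) c(name = name, value = value)

# Default SIMPLIFY = TRUE: the per-element vectors are bound into a matrix.
m <- mapply(f, x, names(x))

# SIMPLIFY = FALSE keeps one list entry per element, like Map().
l <- mapply(f, x, names(x), SIMPLIFY = FALSE)

is.matrix(m)
identical(l, Map(f, x, names(x)))
```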

Solution 3

UPDATE for R version 3.2

Disclaimer: this is a hacky trick, and may stop working in future releases.

You can get the index using this:

> lapply(list(a=10,b=20), function(x){parent.frame()$i[]})
$a
[1] 1

$b
[1] 2

Note: the [] is required for this to work, as it tricks R into thinking that the symbol i (residing in the evaluation frame of lapply) may have more references, thus activating the lazy duplication of it. Without it, R will not keep separate copies of i:

> lapply(list(a=10,b=20), function(x){parent.frame()$i})
$a
[1] 2

$b
[1] 2

Other exotic tricks can be used, like function(x){parent.frame()$i+0} or function(x){--parent.frame()$i}.

Performance Impact

Will the forced duplication cause performance loss? Yes! Here are the benchmarks:

> x <- as.list(seq_len(1e6))

> system.time( y <- lapply(x, function(x){parent.frame()$i[]}) )
user system elapsed
2.38 0.00 2.37
> system.time( y <- lapply(x, function(x){parent.frame()$i[]}) )
user system elapsed
2.45 0.00 2.45
> system.time( y <- lapply(x, function(x){parent.frame()$i[]}) )
user system elapsed
2.41 0.00 2.41
> y[[2]]
[1] 2

> system.time( y <- lapply(x, function(x){parent.frame()$i}) )
user system elapsed
1.92 0.00 1.93
> system.time( y <- lapply(x, function(x){parent.frame()$i}) )
user system elapsed
2.07 0.00 2.09
> system.time( y <- lapply(x, function(x){parent.frame()$i}) )
user system elapsed
1.89 0.00 1.89
> y[[2]]
[1] 1000000

Conclusion

This answer just shows that you should NOT use this... Not only will your code be more readable and more compatible with future releases if you find another solution like Tommy's above, but with this hack you also risk losing the optimizations the core team has worked hard to develop!


Old versions' tricks, no longer working:

> lapply(list(a=10,b=10,c=10), function(x)substitute(x)[[3]])

Result:

$a
[1] 1

$b
[1] 2

$c
[1] 3

Explanation: lapply creates calls of the form FUN(X[[1L]], ...), FUN(X[[2L]], ...) etc. So the argument it passes is X[[i]] where i is the current index in the loop. If we get this before it's evaluated (i.e., if we use substitute), we get the unevaluated expression X[[i]]. This is a call to the [[ function, with arguments X (a symbol) and i (an integer). So substitute(x)[[3]] returns precisely this integer.
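
The structure of such a call can be inspected without lapply at all. The sketch below builds the expressions by hand with quote(), so it does not depend on the R version. The second half assumes, as the "invalid subscript type 'symbol'" error quoted in the comments suggests, that newer versions leave the index as the symbol i instead of a literal integer, which would explain why the trick stopped working.

```r
# The kind of call that older versions built for the first element.
e_old <- quote(X[[1L]])
as.list(e_old)   # the `[[` function, the symbol X, and the integer 1
e_old[[3]]       # the literal index

# With a symbol as the index, the third element is a symbol, not a number,
# so it cannot be used directly as a subscript.
e_sym <- quote(X[[i]])
is.symbol(e_sym[[3]])
```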

Having the index, you can access the names trivially, if you save it first like this:

L <- list(a=10,b=10,c=10)
n <- names(L)
lapply(L, function(x)n[substitute(x)[[3]]])

Result:

$a
[1] "a"

$b
[1] "b"

$c
[1] "c"

Or using this second trick: :-)

lapply(list(a=10,b=10,c=10), function(x)names(eval(sys.call(1)[[2]]))[substitute(x)[[3]]])

(result is the same).

Explanation 2: sys.call(1) returns lapply(...), so that sys.call(1)[[2]] is the expression used as list argument to lapply. Passing this to eval creates a legitimate object that names can access. Tricky, but it works.

Bonus: a second way to get the names:

lapply(list(a=10,b=10,c=10), function(x)eval.parent(quote(names(X)))[substitute(x)[[3]]])

Note that X is a valid object in the parent frame of FUN, and references the list argument of lapply, so we can get to it with eval.parent.

Solution 4

I've had the same problem a lot of times... I've started using another way... Instead of using lapply, I've started using mapply:

n = names(mylist)
mapply(function(list.elem, names) { }, list.elem = mylist, names = n)
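
The function body above is left empty, so it returns NULL for every element. A filled-in sketch might look like the following; mylist is hypothetical example data, and the second argument is renamed to name here only to avoid confusion with the names() function.

```r
mylist <- list(a = 11, b = 12, c = 13)  # hypothetical example data

res <- mapply(
  # Each call receives one element and its name.
  function(list.elem, name) paste0(name, " = ", list.elem),
  list.elem = mylist,
  name = names(mylist)
)
res
```

Because each call returns a single string, the default simplification yields a named character vector.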

Solution 5

You could try using imap() from the purrr package.

From the documentation:

imap(x, ...) is short hand for map2(x, names(x), ...) if x has names, or map2(x, seq_along(x), ...) if it does not.

So you can use it this way:

library(purrr)
myList <- list(a=11,b=12,c=13) 
imap(myList, function(x, y) paste(x, y))

Which will give you the following result:

$a
[1] "11 a"

$b
[1] "12 b"

$c
[1] "13 c"
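
The documented behaviour quoted above can be approximated in base R when purrr is not available. The helper imap_base below is hypothetical (it is not part of purrr or base R) and is only a sketch of the rule "use names(x) if x has names, otherwise seq_along(x)".

```r
# Hypothetical helper, not part of purrr: mimics the documented behaviour
# of imap() using only base R.
imap_base <- function(x, f) {
  idx <- if (is.null(names(x))) seq_along(x) else names(x)
  Map(f, x, idx)
}

myList <- list(a = 11, b = 12, c = 13)
named <- imap_base(myList, function(x, y) paste(x, y))

# Without names, the second argument is the position instead.
unnamed <- imap_base(list(11, 12), function(x, y) paste(x, y))
```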

Author by Robert Kubrick

Updated on November 16, 2020

Comments

  • Robert Kubrick
    Robert Kubrick over 3 years

    Is there a way to get the list index name in my lapply() function?

    n = names(mylist)
    lapply(mylist, function(list.elem) { cat("What is the name of this list element?\n") })
    

    I asked before if it's possible to preserve the index names in the lapply() returned list, but I still don't know if there is an easy way to fetch each element name inside the custom function. I would like to avoid calling lapply on the names themselves; I'd rather get the name in the function parameters.

  • Robert Kubrick
    Robert Kubrick about 12 years
    How is the 'i' parameter initialized in the custom function?
  • Robert Kubrick
    Robert Kubrick about 12 years
    Got it, so lapply() really applies to the elements returned by seq_along. I got confused because the custom function parameters were reordered. Usually the iterated list element is the first parameter.
  • Tommy
    Tommy about 12 years
    Updated answer and changed first function to use y instead of x so that it is (hopefully) clearer that the function can call its arguments anything. Also changed vector values to 11,12,13.
  • Tommy
    Tommy about 12 years
    @RobertKubrick - Yeah, I probably tried to show too many things at once... You can name the arguments anything and have them in any order.
  • Tommy
    Tommy about 12 years
    Your function only returns NULL?! So lapply(x, function(x) NULL) gives the same answer...
  • Tommy
    Tommy about 12 years
    @DWin - I think it is correct (and applies to lists as well) ;-) ...But please prove me wrong!
  • Tommy
    Tommy about 12 years
    Note that lapply always adds the names from x to the result afterwards.
  • IRTFM
    IRTFM about 12 years
    I thought I had proven you wrong but on reflection I see that all of my "proofs" refer out to the variable in the calling environment. I still think using "x" in your demonstration code is pulling in information from the global environment, though.
  • IRTFM
    IRTFM about 12 years
    Yes. Agree that is the lesson of this exercise.
  • Tommy
    Tommy about 12 years
    @DWin - Well, the function part is function(y, n, i) { paste(n[[i]], y[[i]]) } and does not refer to x.
  • Anusha
    Anusha almost 10 years
    The code lapply(list(a=10,b=10,c=10), function(x)substitute(x)[[3]]) returns 3 for every element. Would you explain how this 3 was chosen, and the reason for the discrepancy? Is it equal to the length of the list, in this case 3? Sorry if this is a basic question, but I would like to know how to apply this in a general case.
  • Ferdinand.kraft
    Ferdinand.kraft over 9 years
    @Anusha, indeed, that form is not working anymore... But the lapply(list(a=10,b=10,c=10), function(x)eval.parent(quote(names(X)))[substitute(x)[[3]]]) works... I'll check what's going on.
  • forecaster
    forecaster almost 9 years
    @Ferdinand.kraft, lapply(list(a=10,b=10,c=10), function(x)eval.parent(quote(names(X)))[substitute(x)[[3]]]) is no longer working, and gives an error, Error in eval.parent(quote(names(X)))[substitute(x)[[3]]] : invalid subscript type 'symbol' is there an easy way to fix this ?
  • forecaster
    forecaster almost 9 years
    Thank you so much @Ferdinand.kraft
  • Vadym B.
    Vadym B. about 8 years
    Why do you pass x as argument to lapply() ? If you can just refer to x inside of the anonymous function?
  • flies
    flies over 7 years
    This is certainly the simplest solution.
  • smci
    smci about 7 years
    @flies: yes, except it's bad practice to hard-code variable mylist inside the function. Better still to do function(mylist, nm) ...
  • Ferroao
    Ferroao almost 7 years
    Is it possible, in a one-line procedure, to preserve the names of the objects from the original list in the resulting list (so that the output shows "a" instead of [[1]])?
  • phil_t
    phil_t over 6 years
    @Tommy, I used this to construct an lapply statement where I am trying to rename one column in all the dataframes in the list. The statement is - all_df = lapply(seq_along(all_df), function(i){nm = substr(names(all_df)[[i]], 1, nchar(names(all_df)[[i]])-3); colnames(all_df[[i]])[length(all_df[[i]])] = paste0(nm, "_rank"); all_df[[i]]}) names(all_df) = c("acc_df", "ag_df", "ia_df", "mom_df", "noa_df", "pp_df", "roa_df"). It works fine but the names of the dataframes in the resulting list are not preserved? Suggestions?
  • merv
    merv about 6 years
    I also prefer this, but this answer is a duplicate of a previous one.
  • xm1
    xm1 almost 6 years
    I have a list of list of data.frame. This solution avoided nested loops: x <- unlist(unlist(r,F),F); Reduce(rbind,lapply(seq_along(x), function(i) cbind(caso=names(x)[i], x[[i]]))). Tks, @Tommy
  • emilBeBri
    emilBeBri about 5 years
    This is definitely the best solution of the bunch.
  • JJJ
    JJJ about 5 years
    When using mapply(), notice the SIMPLIFY option, which defaults to true. In my case, that made the whole thing into a large matrix when I only wanted a simple list apply. Setting it to F (inside the mapply()) made it run as intended.
  • moodymudskipper
    moodymudskipper over 4 years
    this is not robust at all, use with caution
  • Brooks Ambrose
    Brooks Ambrose over 3 years
    Following @VadymB., this is more parsimonious: lapply(names(x), function(i) paste(i,x[[i]])). If you want the result to be named, then sapply(names(x), function(i) paste(i,x[[i]]),simplify = F,USE.NAMES = T).