Iterating over lists stored in data.frame in R


Solution 1

I think you are confusing a list with a data.frame. I guess that your final object is a list.

To iterate over the list, you can use rapply. It is a recursive version of lapply.

For example:

## Create a small reproducible example

cluster1 <- list(a='a',b='b')
cluster2 <- list(c='aaa',d='bbb')
clusters <- list(cluster1,cluster2)
final <- list(clusters)

Then, using rapply:

rapply(final,f=print)
[1] "a"
[1] "b"
[1] "aaa"
[1] "bbb"
    a     b     c     d 
  "a"   "b" "aaa" "bbb" 

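The named vector at the end of that output appears because print returns its argument, so rapply collects the returned values and flattens them. A minimal sketch of the same mechanism with a function that transforms each leaf instead of printing it (toupper is chosen here purely for illustration, and the object is the example defined above):

```r
cluster1 <- list(a = 'a', b = 'b')
cluster2 <- list(c = 'aaa', d = 'bbb')
final <- list(list(cluster1, cluster2))

# Apply toupper to every leaf; the default how = "unlist" flattens the result
upper <- rapply(final, f = toupper)

# how = "list" keeps the nested structure instead of flattening it
nested <- rapply(final, f = toupper, how = "list")
```

With how = "list" the result can still be indexed cluster by cluster, for example nested[[1]][[2]] for the second cluster.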
Update after edit by OP

Using lapply, I loop through the names of the list. For each name, I get the list element using [[ (you can use [ if you want to get names and headers for the files), then I write the file using write.table. Here I use the name of the element in the list to create the file name. In your case the file names will be numbers (1.txt, ...).

    lapply(names(final$clusters),
                      function(x)
                             write.table(x=final$clusters[[x]],
                                         file=paste(x,'.txt',sep='')))
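If each cluster should end up as plain text with one string per line, a variation on the code above might use writeLines instead of write.table. This is only a sketch: the sample final object and the temporary output directory are assumptions, not the OP's data.

```r
# Hypothetical stand-in for the OP's object: a named list of clusters,
# each cluster being a list of strings.
final <- list(clusters = list(`1` = list("a", "b"),
                              `2` = list("aaa", "bbb", "ccc")))

out_dir <- tempdir()  # assumption: write into a temporary directory

# One file per cluster, named after the cluster (1.txt, 2.txt, ...)
files <- vapply(names(final$clusters), function(nm) {
  path <- file.path(out_dir, paste0(nm, ".txt"))
  # unlist() flattens the cluster into a character vector, one string per line
  writeLines(unlist(final$clusters[[nm]]), path)
  path
}, character(1))
```

The files vector holds the path written for each cluster, so a single cluster can be checked with readLines(files[["2"]]).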

Solution 2

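I think the primary problem is the way you iterate. In for (j in final$clusters), j is each cluster itself rather than an index, so it cannot be used to subset final$clusters again.

I think that something like this would work better:

for (j in final$clusters){
    # j is one cluster (a list of strings)
    for (i in j){
        # i is one string in that cluster
        print(i)
    }
}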
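The length problem described in the question comes from the difference between [ and [[: single brackets return a sub-list, double brackets return the element itself. A small sketch with made-up data (the contents of final here are an assumption, not the OP's clusters), including an index-based version of the nested loop:

```r
# Hypothetical stand-in for the question's object
final <- list(clusters = list(`1` = list("x", "y", "z"),
                              `2` = list("p", "q")))

n_single <- length(final$clusters[1])    # a list holding cluster 1: length 1
n_double <- length(final$clusters[[1]])  # cluster 1 itself: length 3

# Index-based nested loop; seq_along gives safe indices even for empty lists
for (j in seq_along(final$clusters)) {
  for (i in seq_along(final$clusters[[j]])) {
    print(final$clusters[[j]][[i]])
  }
}
```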

Here is the documentation for loops: http://manuals.bioinformatics.ucr.edu/home/programming-in-r#TOC-For-Loop and for subsetting: http://www.statmethods.net/management/subset.html

Good luck.

Author by

blep

Updated on June 19, 2022

Comments

  • blep
    blep over 1 year

    I think this is a beginner question, but I don't appear to have the right vocabulary for an effective Google search.

    I have a data.frame, final, which contains a list of clusters, each of which is a list of strings.

    I would like to iterate over the list of strings in each cluster: a for loop within a for loop.

    for (j in final$clusters){
        for (i in final$clusters$`j`){
            print final$clusters$`j`[i]
        }
    }
    

    j corresponds to the lists in clusters, and i corresponds to the items in clusters[j]

    I was trying to do this by using the length of each cluster, which I thought would be something like length(final$clusters[1]), but that gives 1, not the length of list.

    Also, final$clusters[1] gives $'1', and on the next line, all the strings in cluster 1.

    Thanks.

    EDIT: output of dput(str(final)), as requested:

    List of 2
     $ clusters     :List of 1629
      ..$ 1   :
      ..$ 2   : 
      ..$ 3   : 
      ..$ 4   : 
      ..$ 5   : 
      ..$ 6   : 
      ..$ 7   : 
      ..$ 8   : 
      ..$ 9   : 
      ..$ 10  : 
      .. [list output truncated]
     $ cluster_stats: num [1:1629, 1:6] 0.7 0.7 0.7 0.7 0.7 0.7 ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : chr [1:1629] "1" "2" "3" "4" ...
      .. ..$ : chr [1:6] "min" "qu1" "median" "mean" ...
    NULL
    
  • blep
    blep almost 11 years
    Thanks for catching this! However, it doesn't entirely solve my problem. (I would upvote your answer, but I don't have the reputation to do that. sorry.)
  • blep
    blep almost 11 years
    So rapply is working as you stated, to print out the list of lists (thanks for the clarification). However, I'd like to print out only one of the lists at a time (actually, I was hoping to use sink to print each of clusters to a different file), but I cannot get print to work, using @pipo98 's help and yours: for (j in final$clusters){ rapply(final$clusters[j], f = print) } returns many lines of NULL.
  • agstudy
    agstudy almost 11 years
    @dd3 There is no need to combine rapply with for. rapply will go recursively through the list to get the leaves. Can you please post the output of dput(str(final)) and add it to your question?
  • blep
    blep almost 11 years
    The problem is that I only want to do this for one of the sublists at a time. In your example, I'd like to get only the output of cluster1, write that to a file, and then do the same for the other clusters, each one writing to a different file. I've added the output you requested to my question. Thanks for your help.