Overriding "Variables not shown" in dplyr, to display all columns from df

Solution 1

There is now a way to override the width of the printed output. Run this command and all columns will be shown:

options(dplyr.width = Inf)

I wrote it up here.
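
If you only want the wider output for part of a session, here is a minimal sketch that uses only base R's options() mechanism to set the value and put it back afterwards. It assumes your dplyr version honors dplyr.width as described above; the variable names old and current are just for illustration.

```r
# options() returns the previous values of the options it changes,
# so keep them around to restore later.
old <- options(dplyr.width = Inf)

# Check which value is in effect right now.
current <- getOption("dplyr.width")

# ... print your wide tables here ...

# Restore whatever was set before; a previous value of NULL
# simply removes the override again.
options(old)
```

Restoring with options(old) is a little safer than hard-coding a value, because it also works when the option was already set to something else.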

Solution 2

You might like glimpse:

> movies %>%
+  group_by(year) %>%
+  summarise(Length = mean(length), Title = max(title),
+   Dramaz = sum(Drama), Actionz = sum(Action),
+   Action = sum(Action), Comedyz = sum(Comedy)) %>%
+  mutate(Year1 = year + 1) %>% glimpse()
Variables:
$ year    (int) 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902,...
$ Length  (dbl) 1.000000, 1.000000, 1.000000, 1.307692, 1.000000, 1.000000,...
$ Title   (chr) "Blacksmith Scene", "Sioux Ghost Dance", "Photographe", "Ve...
$ Dramaz  (int) 0, 0, 0, 1, 0, 1, 2, 2, 5, 1, 2, 3, 4, 5, 1, 8, 14, 14, 14,...
$ Actionz (int) 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 3, 0, 0, 1, 0,...
$ Action  (int) 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 3, 0, 0, 1, 0,...
$ Comedyz (int) 0, 0, 0, 1, 2, 2, 1, 5, 8, 2, 8, 10, 6, 2, 6, 8, 7, 2, 2, 4...
$ Year1   (dbl) 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903,...NULL

Solution 3

dplyr has its own printing functions for dplyr objects. In this case, the result of your operation is a tbl_df, so the matching print function is dplyr:::print.tbl_df. Its source reveals that trunc_mat is the function that decides what is and is not printed, including which variables.

Sadly, dplyr:::print.tbl_df does not pass on any parameters to trunc_mat and trunc_mat also does not support choosing which variables are shown (only how many rows). A workaround is to cast the result of dplyr to a data.frame and use head:

res = movies %.% 
 group_by(year) %.% 
 summarise(Length = mean(length), Title = max(title), 
  Dramaz = sum(Drama), Actionz = sum(Action), 
  Action = sum(Action), Comedyz = sum(Comedy)) %.% 
 mutate(Year1 = year + 1)

head(data.frame(res))
  year    Length                       Title Dramaz Actionz Action Comedyz
1 1898  1.000000 Pack Train at Chilkoot Pass      1       0      0       2
2 1894  1.000000           Sioux Ghost Dance      0       0      0       0
3 1902  3.555556     Voyage dans la lune, Le      1       0      0       2
4 1893  1.000000            Blacksmith Scene      0       0      0       0
5 1912 24.382353            Unseen Enemy, An     22       0      0       4
6 1922 74.192308      Trapped by the Mormons     20       0      0      16
  Year1
1  1899
2  1895
3  1903
4  1894
5  1913
6  1923

Solution 4

This is a bit old, but I found it while looking for answers to the same problem. I came up with this solution, which holds to the spirit of piping but is identical in function to the accepted answer (note that the pipe symbol %.% is deprecated in favor of %>%):

movies %>%
    group_by(year) %>%
    summarise(Length = mean(length), Title = max(title),
              Dramaz = sum(Drama), Actionz = sum(Action),
              Action = sum(Action), Comedyz = sum(Comedy)) %>%
    mutate(Year1 = year + 1) %>%
    as.data.frame %>%
    head

Solution 5

movies %.% group_by(year) %.% ....... %.% print.default

Instead of the default print method, dplyr uses dplyr:::print.tbl_df to make sure your screen doesn't overload with huge datasets. When you've finally whittled your data down to what you want and no longer want to be saved from your own mistakes, just stick print.default on the end to spit out everything.


BTW, methods(print) shows how many packages need to write their own print functions (think of, e.g., igraph or xts: these are new data types, so you need to tell R how to display them on the screen).
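
To make that concrete, here is a rough sketch of how a package-style print method hooks into R's S3 dispatch. The class name wide_tbl and its two-column cutoff are made up for illustration; this is not dplyr's actual implementation, only the same general idea in base R.

```r
# A hypothetical class "wide_tbl" layered on top of a plain data frame.
wide <- structure(
  data.frame(a = 1:3, b = 4:6, c = 7:9),
  class = c("wide_tbl", "data.frame")
)

# S3 print method for the hypothetical class: show only the first two
# columns and name the ones that were left out.
print.wide_tbl <- function(x, ...) {
  shown  <- utils::head(names(x), 2)
  hidden <- setdiff(names(x), shown)
  df <- x
  class(df) <- "data.frame"            # drop the custom class before printing
  print(df[, shown, drop = FALSE], ...)
  if (length(hidden) > 0) {
    cat("Variables not shown:", paste(hidden, collapse = ", "), "\n")
  }
  invisible(x)
}

print(wide)                  # dispatches to print.wide_tbl
print(as.data.frame(wide))   # plain data frame: every column is printed
```

Converting with as.data.frame strips the extra class, so the ordinary data frame printing takes over, which is the same reason the as.data.frame workaround shows all columns.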

HTH the next googler.

Author by

Hugh

Updated on June 24, 2022

Comments

  • Hugh about 2 years

    When I print a local data frame, I sometimes get the message Variables not shown, as in this (ridiculous) example, which just needed enough columns.

    library(dplyr)
    library(ggplot2) # for movies
    
    movies %.% 
     group_by(year) %.% 
     summarise(Length = mean(length), Title = max(title), 
      Dramaz = sum(Drama), Actionz = sum(Action), 
      Action = sum(Action), Comedyz = sum(Comedy)) %.% 
     mutate(Year1 = year + 1)
    
       year    Length                       Title Dramaz Actionz Action Comedyz
    1  1898  1.000000 Pack Train at Chilkoot Pass      1       0      0       2
    2  1894  1.000000           Sioux Ghost Dance      0       0      0       0
    3  1902  3.555556     Voyage dans la lune, Le      1       0      0       2
    4  1893  1.000000            Blacksmith Scene      0       0      0       0
    5  1912 24.382353            Unseen Enemy, An     22       0      0       4
    6  1922 74.192308      Trapped by the Mormons     20       0      0      16
    7  1895  1.000000                 Photographe      0       0      0       0
    8  1909  9.266667              What Drink Did     14       0      0       7
    9  1900  1.437500      Uncle Josh's Nightmare      2       0      0       5
    10 1919 53.461538     When the Clouds Roll by     17       2      2      29
    ..  ...       ...                         ...    ...     ...    ...     ...
    Variables not shown: Year1 (dbl)
    

    I want to see Year1! How do I see all the columns, preferably by default?

  • Hugh over 10 years
    +1 for discovering glimpse. Personally, the principal reason I like to see all columns is as a convenient way to check whether the column I've added (through summarise or mutate) has actually done what I intended. So glimpse isn't quite right for this.
  • hadley over 10 years
    Pull requests are always welcomed :) But print.tbl_df probably does need an all_columns argument.
  • rrs over 9 years
    I think that should be options with an "s". I can't edit since edits must be 10 characters.
  • wint3rschlaefer about 9 years
    For the latest dplyr version, use %>% instead of %.%
  • Javier Fajardo about 7 years
    This is a nice option, but it is not so useful when you have too many columns. In a df with some 200 columns, all of them were displayed, but the order between rows and columns was lost. Also, most of the rows were truncated at some point because of too many characters. I wanted to share the command to bring back the default behaviour, which is: 'options(dplyr.width = NULL)'
  • BMLopes over 2 years
    And the dataset moved from ggplot2 to ggplot2movies. So, now we should use library(ggplot2movies).