Warning: Factor contains implicit NA

Solution 1

With:

library(shiny)
library(tidyverse)

# Create some sample data:
year <- rep(2000:2018, each=3)

publ <- rep(strrep(c("Pub 1", "Pub2", "pub3"), 1), 19)

Global_Sales <- rep(sample(1:100,19),3)
# Create an observation with NA:
newline <- c(NA, NA, 33)

df <- data.frame(Year = year, Publisher = publ, Global_Sales = Global_Sales)
df <- rbind(df,newline)
df <- na.omit(df)

pubSales <- df %>%
  group_by(Publisher, Year) %>%
  summarise(Global_Sales = sum(Global_Sales))

pubSales$Publisher <- as.character(pubSales$Publisher)

the warning no longer appears. As long as the data you work with in Shiny does not contain factors (which is where the "implicit NA" comes from), the warning should not show up; it did not with my sample data.
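
To check this on your own data, a minimal base-R sketch along these lines converts every factor column to character before the data reaches dplyr or Shiny. The data frame `sales` and its values are made up for illustration and are not taken from the original dataset.

```r
# Hypothetical sample data; Publisher is deliberately stored as a factor with one NA.
sales <- data.frame(
  Publisher = factor(c("Pub 1", "Pub 2", NA, "Pub 1")),
  Year = c(2000L, 2000L, 2001L, 2001L),
  Global_Sales = c(10, 20, 33, 5)
)

# Drop incomplete rows, then turn every factor column into a character column.
sales <- sales[complete.cases(sales), ]
is_fct <- vapply(sales, is.factor, logical(1))
sales[is_fct] <- lapply(sales[is_fct], as.character)

# No factor columns remain, so there is no factor left to carry an implicit NA.
str(sales)
```

Converting all factor columns at once, rather than only Publisher, is a design choice for the sketch; converting a single column with as.character() as shown above works just as well.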

Solution 2

The warning pops up because NA is not a level in a factor. It is just a missing value. The warning reminds you that there is a "hidden" level in the factor that will not show up when you perform operations on the factor.

For example, a basic factor:

a.factor <- as.factor(c('a', 'b', 'c', NA))

It has only 3 levels when we print it or summarise it in a quick table:

> print(a.factor)
[1] a    b    c    <NA>
Levels: a b c

> table(a.factor)
a.factor
a b c 
1 1 1 
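
If you would rather keep the factor and make the missing values visible, one option is to turn NA into a real level. The sketch below uses base R's addNA(); the warning itself points to forcats::fct_explicit_na, which serves a similar purpose but is left out here so that the example has no package dependency. The name `b.factor` is introduced only for illustration.

```r
a.factor <- as.factor(c('a', 'b', 'c', NA))

# NA is a missing value here, not a level.
nlevels(a.factor)  # 3

# addNA() appends NA as an explicit level, so it is counted like any other value.
b.factor <- addNA(a.factor)
levels(b.factor)
table(b.factor)

# Alternatively, keep the factor as it is and ask table() to report missing values.
table(a.factor, useNA = "ifany")
```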
Author: Romain B

Updated on June 04, 2022

Comments

  • Romain B almost 2 years

    I am new to R and Shiny and I am trying to create an interactive plot with ggplot2. When the user checks the checkbox, they get access to a multiple-select field to customize the plot.

    The original data frame contains missing values identified as "N/A" in the Publisher and Year columns. I removed the rows containing NAs with complete.cases, so there shouldn't be any NA left.

    I run my app: OK. I get to the default plot: OK. I check the checkbox: Warning: Factor 'Publisher' contains implicit NA, consider using 'forcats::fct_explicit_na'

    I'd like to remove this warning, or at least understand it. If you have any additional comments, please share them: my goal is to get better.

    app.R :

    library(shiny)
    library(tidyverse)

    df<-read.csv("vgsales.csv")
    df$Year[df$Year=="N/A"]<-NA
    df$Year<-factor(df$Year)
    df$Publisher[df$Publisher=="N/A"]<-NA
    df$Publisher<-factor(df$Publisher)
    df<-df[complete.cases(df),]
    
    pubSales<-na.omit(df
        %>% group_by(Publisher, Year) 
        %>% summarise(Global_Sales=sum(Global_Sales))
    )
    pubSales<-pubSales[order(pubSales$Year),]
    
    top5Pub<-head(unique(pubSales[order(-pubSales$Global_Sales),]$Publisher),5)
    
    ui <- navbarPage("Video Games Sales",
        tabPanel("Publishers",
            mainPanel(
                titlePanel(
                    title = "Publishers sales"
                ),
                sidebarPanel(
                    radioButtons(
                        "pubOptions",
                        "Options",
                        c("Top 5 Publishers"="topFivePub",
                          "Custom Publishers"="customPub"),
                        selected="topFivePub"
                    ),
                    uiOutput("customPubUI")
                ),
                mainPanel(
                    plotOutput("pubPlot")
                ),
                width=12
            )
        )
    )
    
    server <- function(input, output, session) {
    
        output$customPubUI<-renderUI({
            if(input$pubOptions=="customPub"){
                selectInput(
                    "selectedPub",
                    "Editeurs",
                    pubSales$Publisher,
                    multiple=TRUE
                )
            }
        })
    
        output$pubSales<-renderTable(pubSales)
        output$pubPlot<-renderPlot({
            ggplot()+
                if(input$pubOptions=="customPub"){
                    geom_line(
                        data=pubSales[pubSales$Publisher %in% input$selectedPub,],
                        aes(x=Year,y=Global_Sales,colour=Publisher,group=Publisher)
                    )
                }else{
                    geom_line(
                        data=pubSales[pubSales$Publisher %in% top5Pub,],
                        aes(x=Year,y=Global_Sales,colour=Publisher,group=Publisher)
                    )
                }
        })
    
    }
    
    shinyApp(ui, server)
    
    • heck1 about 5 years
      Please provide example data, if possible, so that your question becomes reproducible.
    • Sonny about 5 years
      Does the warning still appear if you convert Publisher from factor to character?
    • Romain B about 5 years
      @heck1 I got my dataset from Kaggle: kaggle.com/gregorut/videogamesales
    • Romain B about 5 years
      @Sonny I'll try it as soon as possible.
    • Aaron Hayman about 5 years
      Something of an aside, but in read.csv you can use the na.strings argument to tell R how NA is written in the data you are reading, e.g. df<-read.csv("vgsales.csv", na.strings = "N/A"), which can save you from having to convert them later.
    • Romain B about 5 years
      @AaronHayman Thank you!
  • Romain B about 5 years
    Thanks. If you have an explanation, it would be appreciated.
  • Romain B over 4 years
    I think I get it. Thank you.
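
Aaron Hayman's na.strings suggestion from the comments can be sketched as follows. The inline CSV text stands in for vgsales.csv: its rows, values, and the reduced set of columns are invented for illustration, and `csv_text` and `games` are hypothetical names.

```r
# Made-up stand-in for vgsales.csv; missing values are written as "N/A".
csv_text <- "Name,Year,Publisher,Global_Sales
Game A,2006,Pub 1,82.7
Game B,N/A,Pub 2,3.1
Game C,2010,N/A,1.5"

# na.strings tells read.csv which strings to read as NA,
# so no manual replacement of "N/A" is needed afterwards.
games <- read.csv(text = csv_text, na.strings = "N/A")

# Year is read as a number because "N/A" no longer forces a text column.
str(games)

# Keep only the complete rows.
games <- games[complete.cases(games), ]
```

With the real file, the same call would take the file name instead of the text argument, as in the comment above.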