wordcloud package: get “Error in strwidth(…) : invalid 'cex' value”

r tm

17,861

Solution 1

You have a typo in the TataMotors Twitter account name: it should be spelled 'TataMotors', not 'TataMotor'. As a result, one column in your term matrix is empty, and when cex is calculated for that column it gets assigned NaN.

Fix the typo and the rest of the code works fine. Good luck!
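To see why an empty column ends in NaN, here is a minimal base-R sketch on a toy matrix. The data are made up for illustration, and the per-column division is my assumption about how the word sizes get scaled, not a copy of the wordcloud source. The last two lines are a check you could run on your own tdm before calling comparison.cloud.

```r
# Toy term matrix (illustrative data only). The TATA column is all zeros,
# as it would be if no tweets came back for that account.
tdm <- matrix(
  c(3, 1, 0, 2,
    0, 4, 0, 1,
    2, 0, 0, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("car", "drive", "launch"),
                  c("MARUTI", "HYUNDAI", "TATA", "TOYOTA"))
)

# Assumed scaling step: divide each column by its total.
# For an empty column this is 0 / 0, which is NaN in R.
props <- sweep(tdm, 2, colSums(tdm), "/")
all(is.nan(props[, "TATA"]))

# Check for empty columns before plotting
empty_cols <- colnames(tdm)[colSums(tdm) == 0]
empty_cols
```

If empty_cols is not empty, the corresponding input (here the timeline for that account) is the first thing to look at.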


Solution 2

I spotted the empty-column issue in a different application throwing the same error. In my case it was because of the removeSparseTerms command applied to a document term matrix. Using str() helped me identify the bug.

The input variable (slightly edited) had 289 columns:

> str(corpus.dtm)
List of 6
$ i       : int [1:443] 3 4 6 8 10 12 15 18 19 21 ...
$ j       : int [1:443] 105 98 210 93 287 249 126 223 129 146 ...
$ v       : num [1:443] 1 1 1 1 1 1 1 1 1 1 ...
$ nrow    : int 1408
$ ncol    : int 289
$ dimnames:List of 2
..$ Docs : chr [1:1408] "character(0)" "character(0)" "character(0)" "character(0)" ...
..$ Terms: chr [1:289] "word1" "word2" "word3" "word4" ...
- attr(*, "class")= chr [1:2] "DocumentTermMatrix" "simple_triplet_matrix"
- attr(*, "weighting")= chr [1:2] "term frequency" "tf"

The command was:

removeSparseTerms(corpus.dtm,0.90)->corpus.dtm.frequent

And the result had 0 columns:

> str(corpus.dtm.frequent)
List of 6
$ i       : int(0) 
$ j       : int(0) 
$ v       : num(0) 
$ nrow    : int 1408
$ ncol    : int 0
$ dimnames:List of 2
..$ Docs : chr [1:1408] "character(0)" "character(0)" "character(0)" "character(0)" ...
..$ Terms: NULL
- attr(*, "class")= chr [1:2] "DocumentTermMatrix" "simple_triplet_matrix"
- attr(*, "weighting")= chr [1:2] "term frequency" "tf"

Raising the sparsity coefficient from 0.90 to 0.95 solved the issue. For a wordier document I went up to 0.999 in order to have a non-empty result after removing the sparse terms.

Empty columns are a good thing to check for when this error occurs.
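For intuition on why a higher coefficient keeps more terms, here is a rough base-R sketch of the rule as I understand it from the tm documentation: a term survives only if the share of documents in which it is absent is below the coefficient. The helper terms_kept and the toy data are hypothetical, and this is a re-implementation for illustration, not tm's own code.

```r
# Toy document-term matrix: 20 documents (rows) x 3 terms (columns).
# Illustrative data only.
dtm <- matrix(0, nrow = 20, ncol = 3,
              dimnames = list(NULL, c("word1", "word2", "word3")))
dtm[1:5, "word1"] <- 1   # present in 5 of 20 documents (75% sparse)
dtm[1,   "word2"] <- 1   # present in 1 of 20 documents (95% sparse)
# word3 never occurs (100% sparse)

# Hypothetical helper: names of the terms whose sparsity is below `sparse`
terms_kept <- function(m, sparse) {
  doc_freq <- colSums(m > 0)                       # documents containing each term
  names(doc_freq)[doc_freq > nrow(m) * (1 - sparse)]
}

terms_kept(dtm, 0.70)   # nothing survives: the result is empty
terms_kept(dtm, 0.90)   # word1 only
terms_kept(dtm, 0.99)   # word1 and word2

# Guard before plotting: stop early instead of hitting the 'cex' error
kept <- terms_kept(dtm, 0.90)
if (length(kept) == 0) stop("No terms left; raise the sparsity coefficient")
```

With a real tm object, a similar guard would be to check that ncol(corpus.dtm.frequent) is greater than 0 right after removeSparseTerms.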


Abhishek Kapoor

Updated on June 04, 2022

Comments

Abhishek Kapoor almost 2 years

I am using the tm and wordcloud packages in R 2.15.1. I am trying to make a word cloud. Here is the code:

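# packages used below: twitteR (userTimeline), tm, wordcloud
library(twitteR)
library(tm)
library(wordcloud)
# get tweets
maruti_tweets = userTimeline("Maruti_suzuki", n=1000,cainfo="cacert.pem")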
hyundai_tweets = userTimeline("HyundaiIndia", n=1000,cainfo="cacert.pem")
tata_tweets = userTimeline("TataMotor", n=1000,cainfo="cacert.pem")
toyota_tweets = userTimeline("Toyota_India", n=1000,cainfo="cacert.pem")
# get text
maruti_txt = sapply(maruti_tweets, function(x) x$getText())
hyundai_txt = sapply(hyundai_tweets, function(x) x$getText())
tata_txt = sapply(tata_tweets, function(x) x$getText())
toyota_txt = sapply(toyota_tweets, function(x) x$getText())
clean.text = function(x)
{
   # tolower
   x = tolower(x)
   # remove rt
   x = gsub("rt", "", x)
   # remove at
   x = gsub("@\\w+", "", x)
   # remove punctuation
   x = gsub("[[:punct:]]", "", x)
   # remove numbers
   x = gsub("[[:digit:]]", "", x)
   # remove links http
   x = gsub("http\\w+", "", x)
   # remove tabs
   x = gsub("[ |\t]{2,}", "", x)
   # remove blank spaces at the beginning
   x = gsub("^ ", "", x)
   # remove blank spaces at the end
   x = gsub(" $", "", x)
   return(x)
}
# clean texts
maruti_clean = clean.text(maruti_txt)
hyundai_clean = clean.text(hyundai_txt)
tata_clean = clean.text(tata_txt)
toyota_clean = clean.text(toyota_txt)
maruti = paste(maruti_clean, collapse=" ")
hyundai= paste(hyundai_clean, collapse=" ")
tata= paste(tata_clean, collapse=" ")
toyota= paste(toyota_clean, collapse=" ")
# put everything in a single vector
all = c(maruti, hyundai, tata, toyota)
# remove stop-words
all = removeWords(all,
c(stopwords("english"), "maruti", "tata", "hyundai", "toyota"))
# create corpus
corpus = Corpus(VectorSource(all))
# create term-document matrix
tdm = TermDocumentMatrix(corpus)
# convert as matrix
tdm = as.matrix(tdm)
# add column names
colnames(tdm) = c("MARUTI", "HYUNDAI", "TATA", "TOYOTA")
# comparison cloud
comparison.cloud(tdm, random.order=FALSE, colors = c("#00B2FF", "red", "#FF0099", "#6600CC"), max.words=500)

but I am getting the following error:

Error in strwidth(words[i], cex = size[i], ...) : invalid 'cex' value

Please help.
