How to calculate BIC for k-means clustering in R
Solution 1
For anyone else landing here, there's a method proposed by Sherry Towers at http://sherrytowers.com/2013/10/24/k-means-clustering/, which uses output from stats::kmeans. I quote:
The AIC can be calculated with the following function:
kmeansAIC = function(fit){
  m = ncol(fit$centers)
  n = length(fit$cluster)
  k = nrow(fit$centers)
  D = fit$tot.withinss
  return(D + 2*m*k)
}
From the help for stats::AIC, you can also see that the BIC can be calculated in a similar way to the AIC. An easy way to get the BIC is to replace the return() in the above function with this:
return(data.frame(AIC = D + 2*m*k,
BIC = D + log(n)*m*k))
So you would use this as follows:
fit <- kmeans(x = data, centers = 6)
kmeansAIC(fit)
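For example, a minimal sketch that consolidates the quoted function with the BIC return value and compares several choices of centers; the range 2:10, nstart = 25, and the matrix name data are placeholders, not part of the original post:
kmeansAIC <- function(fit) {
  m <- ncol(fit$centers)    # dimensionality of the data
  n <- length(fit$cluster)  # number of data points
  k <- nrow(fit$centers)    # number of clusters
  D <- fit$tot.withinss     # total within-cluster sum of squares
  data.frame(AIC = D + 2*m*k, BIC = D + log(n)*m*k)
}
ks <- 2:10
scores <- do.call(rbind, lapply(ks, function(k) kmeansAIC(kmeans(x = data, centers = k, nstart = 25))))
scores$k <- ks
scores[which.min(scores$BIC), ]  # candidate k with the lowest BIC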
Solution 2
To compute BIC, just add .5*k*d*log(n) (where k is the number of means, d is the length of a vector in your dataset, and n is the number of data points) to the standard k-means error function.
The standard k-means penalty is \sum_n (m_k(n)-x_n)^2, where m_k(n) is the mean associated with the nth data point. This penalty can be interpreted as a log probability, so BIC is perfectly valid. BIC just adds an additional penalty term to the k-means error, proportional to k.
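As a minimal sketch of that recipe (kmeansBIC, fit, and data are hypothetical names: fit is assumed to be a stats::kmeans result and data the matrix that was clustered):
kmeansBIC <- function(fit, data) {
  k <- nrow(fit$centers)  # number of means
  d <- ncol(data)         # length of a vector in the dataset
  n <- nrow(data)         # number of data points
  fit$tot.withinss + 0.5 * k * d * log(n)  # k-means error plus the BIC penalty term
}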
Solution 3
Just to add to what user1149913 said (I don't have enough reputation to comment): since you're using the kmeans function in R, \sum_n (m_k(n)-x_n)^2 is already calculated for you as KClData$tot.withinss.
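For example, reusing the object from the question (Data stands in for the matrix being clustered):
KClData <- kmeans(Data, centers = 2, nstart = 100)
KClData$tot.withinss  # the k-means error term \sum_n (m_k(n) - x_n)^2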
Solution 4
Rather than reimplementing AIC or BIC, we can define a log-likelihood function for kmeans objects; this will then get used by the BIC function in the stats package.
logLik.kmeans <- function(object) structure(
  -object$tot.withinss/2,                            # treat -tot.withinss/2 as the log-likelihood (up to constants)
  df = nrow(object$centers)*ncol(object$centers),    # free parameters: k centers times d dimensions
  nobs = length(object$cluster)                      # number of clustered points
)
Then, to use it, call BIC as normal. For example:
example(kmeans, local=FALSE)
BIC(cl)
# [1] 26.22842084
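Since stats::AIC dispatches on the same logLik method, the AIC is available in the same way:
AIC(cl)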
This method will be provided in the next release of the stackoverflow package.
UnivStudent
Updated on July 05, 2022

Comments
- UnivStudent (11 months): I've been using k-means to cluster my data in R, but I'd like to be able to assess the fit vs. model complexity of my clustering using the Bayesian Information Criterion (BIC) and AIC. Currently the code I've been using in R is:
  KClData <- kmeans(Data, centers = 2, nstart = 100)
  But I'd like to be able to extract the BIC and log-likelihood. Any help would be greatly appreciated!
- Jason (almost 8 years): I don't think the k-means penalty \sum_n (m_k(n) - x_n)^2 (or the negative of that) is the log-likelihood. The log-likelihood should have 3 more terms: -n*log(K), -0.5*n*d*log(2*pi), and -n*d*log(\sigma), where \sigma is the common std for all Gaussians. Also, the "k" in the BIC formula is not the number of clusters; it is the number of free parameters in the mixture Gaussian model, so it should be k = K-1 + 1 + K*d = K*(d+1). Here I am using lower-case k for the BIC parameter term and capital K for the number of means/clusters.
- Jason (almost 8 years): K-1 is the number of free weights for the Gaussians; since the weights add up to 1, the dof is K-1. The 1 is for the single common std shared by all Gaussians. And K*d is the number of parameters for the coordinates of the cluster centers.
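A sketch of the log-likelihood these two comments describe (assuming, as they do, spherical Gaussians with a shared standard deviation \sigma, weights of 1/K for the -n*log(K) term, and hard assignment of each point to its nearest center): \log L = -\frac{1}{2\sigma^2}\sum_n (m_{k(n)} - x_n)^2 - n\log K - \frac{nd}{2}\log(2\pi) - nd\log\sigma, which would give BIC = -2\log L + K*(d+1)*\log(n).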
- pengchy (over 7 years): Waiting for the new version of stackoverflow; the current version 0.1.2 hasn't implemented this function.
- pengchy (over 7 years): I have used your methods, but the BIC value of the kmeans results is always monotonically decreasing as the cluster number increases. Please see the post: stats.stackexchange.com/questions/55147/…
- Andy Clifton (over 7 years): Taking a wild guess, I'd say there's an error somewhere.
- pengchy (over 7 years): Thank you Andy Clifton, I have tested the BIC with a higher K; when K reaches 155 the BIC comes to its smallest value. Originally, I had only tested K up to a maximum of 50.
- Andy Clifton (over 7 years): @pengchy: you may be interested in this answer about finding optimum values of k: stackoverflow.com/a/18047359/2514568
- pengchy (over 7 years): Hi Andy Clifton, thank you for the information. It is very helpful.
- Neal Fultz (over 7 years): @pengchy the GitHub version has it, I'll submit to CRAN when I get the chance.
- user5054 (almost 7 years): Are there any weights in k-means? I think not. So why do we need K-1?
- EngrStudent (over 5 years): You are sign-inverted. You need a minus sign.
- Chris (almost 2 years)