# How to calculate BIC for k-means clustering in R

21,572

## Solution 1

For anyone else landing here, there's a method proposed by Sherry Towers at http://sherrytowers.com/2013/10/24/k-means-clustering/, which uses output from stats::kmeans. I quote:

The AIC can be calculated with the following function:

kmeansAIC = function(fit){

m = ncol(fit$centers) n = length(fit$cluster)
k = nrow(fit$centers) D = fit$tot.withinss
return(D + 2*m*k)
}


From the help for stats::AIC, you can also see that the BIC can be calculated in a similar way to the AIC. An easy way to get the BIC is to replace the return() in the above function, with this:

return(data.frame(AIC = D + 2*m*k,
BIC = D + log(n)*m*k))


So you would use this as follows:

fit <- kmeans(x = data,centers = 6)
kmeansAIC(fit)


## Solution 2

To compute BIC, just add .5*k*d*log(n) (where k is the number of means, d is the length of a vector in your dataset, and n is the number of data points) to the standard k-means error function.

The standard k-means penalty is \sum_n (m_k(n)-x_n)^2, where m_k(n) is the mean associated with the nth data point. This penalty can be interpreted as a log probability, so BIC is perfectly valid.

BIC just adds an additional penalty term to the k-means error proportional to k.

Just to add to what user1149913 said (I don't have enough reputation to comment), since you're using the kmeans function in R, \sum_n (m_k(n)-x_n)^2 is already calculated for you as KClData$tot.withinss. ## Solution 4 Rather than reimplementing AIC or BIC, we can define a log-likelihood function for kmeans objects; this will then get used by the BIC function in the stats package. logLik.kmeans <- function(object) structure( -object$tot.withinss/2,
df = nrow(object$centers)*ncol(object$centers),
nobs = length(object\$cluster)
)


Then to use it, call BIC as normal. For example:

example(kmeans, local=FALSE)
BIC(cl)
#  26.22842084


This method will be provided in the next release of the stackoverflow package.

Share:
21,572 Author by

### UnivStudent

Updated on July 05, 2022

• UnivStudent 11 months

I've been using k-means to cluster my data in R but I'd like to be able to assess the fit vs. model complexity of my clustering using Baysiean Information Criterion (BIC) and AIC. Currently the code I've been using in R is:

KClData <- kmeans(Data, centers=2, nstart= 100)


But I'd like to be able to extract the BIC and Log Likelihood. Any help would be greatly appreciated!

• Jason almost 8 years
I don't think the k-means penalty \sum_n (m_k(n) - x_n)^2 (or the negative of that) is the log-likelihood. The log-likelihood should have 3 more terms: -n*log(K), -0.5*n*d*log(2*pi) and -n*d*log(\sigma), where \sigma is the common std for all Gaussians. Also, the "k" in the BIC formula is not the number of clusters, it is the number of free parameters in the mixture Gaussian model, so k should be: k = K-1 + 1 + K*d = K*(d+1). Here I am using lower k for the BIC parameter term, and capital K as the number of means/clusters.
• Jason almost 8 years
K-1 is K-1 number of weights for the Gaussians, since the weights add up to 1, the dof is K-1. 1 for the one common std for all Gaussians. And K*d is the number of parameters for the coordinates of the cluster centers.
• pengchy over 7 years
Waiting for the new version of stackoverflow, current version 0.1.2 hasn't implemented this function.
• pengchy over 7 years
I have used your methods, but the BIC value of the kmeans results always monotone decreasing with the cluster number increasing. Please look the posts: stats.stackexchange.com/questions/55147/…
• Andy Clifton over 7 years
Taking a wild guess, I'd say there's an error somewhere.
• pengchy over 7 years
Thank you Any Clifton, I have tested the BIC with higher K number, when K reach 155 the BIC come to the smallest value. Originally, I have only tested the K with maximum 50.
• Andy Clifton over 7 years
@pengchy: you may be interested in this answer about finding optimum values of k: stackoverflow.com/a/18047359/2514568
• pengchy over 7 years
Hi Andy Clifton, thank you for the information. It is very helpful.
• Neal Fultz over 7 years
@pengchy the github version has it, I'll submit to cran when I get the chance.
• user5054 almost 7 years
Are there any weights in k-means? I think not.. So why do we need K-1?
• EngrStudent over 5 years
• 