How can I get the cluster number corresponding to each data point using k-means clustering in R?
Solution 1
It sounds like you are trying to access the cluster vector that is returned by kmeans(). From the help page, for cluster:
A vector of integers (from 1:k) indicating the cluster to which each
point is allocated.
Using the example on the help page:
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")
(cl <- kmeans(x, 2))
#Access the cluster vector
cl$cluster
> cl$cluster
[1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[45] 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[89] 1 1 1 1 1 1 1 1 1 1 1 1
To address the question in the comments
You can "map" the cluster number to the original data by doing something like this:
out <- cbind(x, clusterNum = cl$cluster)
head(out)
x y clusterNum
[1,] -0.42480483 -0.2168085 2
[2,] -0.06272004 0.3641157 2
[3,] 0.08207316 0.2215622 2
[4,] -0.19539844 0.1306106 2
[5,] -0.26429056 -0.3249288 2
[6,] 0.09096253 -0.2158603 2
cbind is the function for column bind; there is also an rbind function for rows. See their help pages, ?cbind and ?rbind respectively, for more details.
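As a minimal sketch of the same mapping, here are the three values from the question's example (12, 32, 13) run through kmeans(). The names vals, fit, labelled and groups are made up for illustration, and the seed is only there for repeatability. Cluster labels are arbitrary, so which group ends up as 1 or 2 can differ between runs.

```r
# The three values from the question, as a one-column matrix
vals <- matrix(c(12, 32, 13), ncol = 1)

set.seed(1)  # k-means starts from randomly chosen centers
fit <- kmeans(vals, centers = 2)

# Attach the cluster label to each original value
labelled <- data.frame(value = vals[, 1], cluster = fit$cluster)
print(labelled)

# Group the original values by their cluster label
groups <- split(labelled$value, labelled$cluster)
print(groups)
```

A data.frame is used here instead of cbind() so that further columns of other types could be kept alongside the label; for a purely numeric matrix, cbind() as above works just as well.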
Solution 2
@ Java questioner
You can access the cluster data as follows (kmeans() also needs the number of clusters, here k):
> data_clustered <- kmeans(data, centers = k)
> data_clustered$cluster
data_clustered$cluster is a vector whose length equals the number of records in data. Each entry is the cluster number for the corresponding row.
To get all the records belonging to cluster 1:
> data$cluster <- data_clustered$cluster
> data_clus_1 <- data[data$cluster == 1,]
Number of clusters:
> max(data$cluster)
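The steps above can be sketched end to end as follows. The data frame here is simulated as a hypothetical stand-in for data, and by_cluster is an illustrative name. max() gives the number of clusters because kmeans() labels them 1 to k; table() and split() are alternative ways to get the cluster sizes and the records of every cluster at once.

```r
set.seed(42)  # for repeatability only

# Hypothetical stand-in for `data`: 100 points in two loose groups
data <- data.frame(
  x = c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3)),
  y = c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3))
)

data_clustered <- kmeans(data, centers = 2)

# Add the label only after clustering, so it is not used as a feature
data$cluster <- data_clustered$cluster

# Records belonging to cluster 1
data_clus_1 <- data[data$cluster == 1, ]

# Number of records in each cluster
print(table(data$cluster))

# One data frame per cluster, in a list named by cluster number
by_cluster <- split(data, data$cluster)
print(length(by_cluster))
```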
Good luck with your clustering
Java questioner
Updated on November 28, 2020
Comments
Java questioner over 3 years
I clustered data with the k-means clustering method. How can I get the cluster number corresponding to each data point in R, so that I can see which cluster each record belongs to?
Example:
12 32 13 => cluster 1: 12, 13; cluster 2: 32