How to create a cluster plot in R?

Did you mean something like this? Sorry, but I know nothing about HTML5 Canvas, only R. I hope it helps anyway.

First I cluster the data using kmeans (note that I did not cluster the distance matrix), then I compute the distance matrix and run classical multidimensional scaling (MDS) on it using cmdscale. Then I color the points in the MDS plot by the groups identified by kmeans, plus some nice additional graphical features.

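You can access the coordinates from the object created by cmdscale: with the default arguments it is a matrix with one row per observation and one column per MDS axis.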

### some sample data
library(vegan)  # provides the dune data set, ordiplot, ordispider and ordihull
data(dune)

# kmeans on the raw data (not on the distance matrix)
kclus <- kmeans(dune, centers = 4, iter.max = 1000, nstart = 10000)

# distance matrix
dune_dist <- dist(dune)

# Multidimensional scaling
cmd <- cmdscale(dune_dist)

# plot MDS, with colors by groups from kmeans
groups <- levels(factor(kclus$cluster))
ordiplot(cmd, type = "n")
cols <- c("steelblue", "darkred", "darkgreen", "pink")
for(i in seq_along(groups)){
  points(cmd[factor(kclus$cluster) == groups[i], ], col = cols[i], pch = 16)
}

# add spider and hull
ordispider(cmd, factor(kclus$cluster), label = TRUE)
ordihull(cmd, factor(kclus$cluster), lty = "dotted")
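
To get the 2D points out of R for use elsewhere (for example in a canvas script), the coordinates and cluster labels can be collected into a data frame and written to a file. This is only a minimal sketch under some assumptions: it uses the built-in USArrests data as a stand-in for dune so that it runs without vegan, and the output file name is hypothetical.

```r
# Stand-in data (assumption): built-in USArrests instead of vegan's dune
dat <- scale(USArrests)

set.seed(1)  # kmeans uses random starts; fix the seed for reproducibility
kclus <- kmeans(dat, centers = 4, nstart = 25)

# Classical MDS; by default cmdscale returns an n x 2 matrix of coordinates
cmd <- cmdscale(dist(dat))

# One row per observation: label, 2D position and cluster membership
coords <- data.frame(
  id        = rownames(cmd),
  x         = cmd[, 1],
  y         = cmd[, 2],
  cluster   = kclus$cluster,
  row.names = NULL
)

# Write a CSV that another tool could load (file name is hypothetical)
out_file <- file.path(tempdir(), "cluster_coords.csv")
write.csv(coords, out_file, row.names = FALSE)

head(coords)
```

Each row of coords is then one point to draw, with cluster deciding its colour.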

Author by

slotishtype

Comp Science Researcher

Updated on July 05, 2022

Comments

  • slotishtype
    slotishtype almost 2 years

    How can I create a cluster plot in R without using clusplot?

    I am trying to get to grips with some clustering (using R) and visualisation (using HTML5 Canvas).

    Basically, I want to create a cluster plot, but instead of plotting the data I want to get a set of 2D points or coordinates that I can pull into canvas and do something pretty with (but I am unsure of how to do this). I would imagine that I:

    1. Create a similarity matrix for the entire dataset (using dist)
    2. Cluster the similarity matrix using kmeans or something similar (using kmeans)
    3. Plot the result using MDS or PCA - but I am unsure of how steps 2 and 3 relate (cmdscale).

    I've checked out questions here, here and here (with the last one being of most use).

  • slotishtype
    slotishtype over 12 years
    Thanks @EDi, that is really great. So, just to clarify, you cluster and then build a similarity matrix. You then use MDS to position the points in 2D and THEN you colour the points by their relationships to the cluster. Brilliant. If you have a chance, could you explain what this does: groups <- levels(factor(kclus$cluster))
  • EDi
    EDi over 12 years
    See my edit. groups is just an object that contains the names of the groups; it is only used for the for-loop.
  • slotishtype
    slotishtype over 12 years
    Ok I see your edit. One last question, can you cluster the distance matrix or is that a crazy move? Sorry, learning at the moment and just working my way through things.
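
On that last question: kmeans works on the observations themselves, so it is not the natural tool for a distance matrix, but other methods do take one directly. Hierarchical clustering with hclust is one such option. A minimal sketch, assuming the built-in USArrests data as a stand-in for dune and an arbitrary choice of average linkage and four groups:

```r
# Stand-in data (assumption): built-in USArrests instead of vegan's dune
dat <- scale(USArrests)

# Distance matrix, used both for clustering and for MDS
d <- dist(dat)

# Hierarchical clustering operates directly on the dist object
hc <- hclust(d, method = "average")

# Cut the tree into 4 groups (the number of groups is an arbitrary choice here)
groups <- cutree(hc, k = 4)

# 2D coordinates from the same distance matrix
cmd <- cmdscale(d)

table(groups)
```

Here groups plays the same role as kclus$cluster in the answer, so it could be used to colour the MDS points in the same way.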