Grasshopper

algorithmic modeling for Rhino

# K-Means

Hi all,

Here is an implementation of K-Means for grasshopper I wrote for an aspect of my thesis.

What does it do? K-Means Algorthims are a automated way of grouping data. For now my implementation only works on points but other types should work in theory. The VB script is uses alot of loops and i am sure that is far from optimzed.

Check out the K-Means Page on wikipedia

And the general page on data clustering...

The code goes like this:

Private Sub RunScript(ByVal P As DataTree(Of Point3d), ByVal N As DataTree(Of Integer), ByVal D As Double, ByRef G As Object)
'your code goes here…
Dim t As Integer = 0
Dim i As Integer = 0
Dim j As Integer = 0
Dim k As Integer = 0
Dim numGroups As Integer
Dim groupNum As Integer
Dim X As Double
Dim Y As Double
Dim Z As Double
Dim minDist As Double
Dim dist As Double
'Dim cluster As Integer
'initialize random number generator
Dim r As New Random(System.DateTime.Now.Millisecond)
Dim goupings As New List(Of List(Of Integer))
Dim oldCentroids As New List(Of Point3d)
Dim newCentroids As New List(Of Point3d)
Dim pathList As New List(Of GH_Path)
Dim newCentroid As Point3d
Dim pt As Point3d
Dim isStillMoving As Boolean
Dim groupingsTree As New DataTree(Of Point3d)
Dim inputpts As New List(Of Point3d)

For t = 0 To P.BranchCount - 1

'this mess dose data matching for inputs P and N
'this if is ture is the input branches are equal
If N.BranchCount >= P.BranchCount Then
'check to see if, in the case the branches are equal, that there is only one item in each branch of N
If N.Branch(t).Count = 1 Then
numGroups = N.Branch(t)(0)
'I don't know what to do if the above statment isn't true so let's exit the program
Else
Print("Error: data missmatch between inputs P and N, Please reformat inputs")
Exit Sub
End If
'here we check if it is a simple list...
Else If N.Branch(0).Count = P.BranchCount
numGroups = N.Branch(0)(t)
'here we check if it is a simple list but is shorter or longer than the tree P
Else If N.Branch(0).Count < P.BranchCount Then
If t <= N.Branch(0).Count - 1 Then
numGroups = N.Branch(0)(t)
Else
numGroups = N.Branch(0)(N.Branch(0).Count - 1)
End If
'if all else fails exit the program!
Else
Print("Error: data missmatch between inputs P and N, Please reformat inputs")
Exit Sub
End If

inputpts = P.Branch(t)
print("--------Accessing Branch " & t & "--------")
'***************** Initialize K-Means Algorithm ****************************
print("--------Initializing " & numGroups & " groupings-------")

For i = 0 To numGroups - 1
'Dim pt As Point3d
'Dim group As New List(Of Integer)
Dim tempPath As New GH_Path

tempPath.FromString(t & ";" & i)
print("creating grouping " & i)
Next
'radomly assign points from the input list (intialze the k-mean groups)
print("--------Initializing group assinments (random)-------")
For j = 0 To inputpts.Count - 1
groupNum = RandomNumber(r, numGroups - 1)
print("Adding index value " & goupings(groupNum)(goupings(groupNum).Count - 1) & " to group " & groupNum & ".")
Next

'***************** Do K-Means Algorithm ************************************
print("--------entering main k-means loop--------")
isStillMoving = True
Do While isStillMoving
isStillMoving = False

'find new centroids
print(".....finding new centroids")
For i = 0 To numGroups - 1
oldCentroids(i) = newCentroids(i)
For j = 0 To goupings(i).Count - 1
pt = inputpts(goupings(i)(j))
X = X + pt.X
Y = Y + pt.Y
Z = Z + pt.Z
Next
X = X / (j + 1)
Y = Y / (j + 1)
Z = Z / (j + 1)
newCentroid.X = X
newCentroid.Y = Y
newCentroid.Z = Z
newCentroids(i) = newCentroid
print("new centroid for group " & i & " is " & newCentroids(i).ToString)
goupings(i).Clear
dist = newCentroid.DistanceTo(oldCentroids(i))
'print("dist = " & dist)
If dist > D Then
print("need to do more work")
isStillMoving = True
End If
Next

'create new groupings
print(".....creating new groupings")
For j = 0 To inputpts.Count - 1
pt = inputpts(j)
minDist = pt.DistanceTo(newCentroids(0))
For i = 0 To numGroups - 1
dist = pt.DistanceTo(newCentroids(i))
If minDist >= dist Then
groupNum = i
minDist = dist
'print(minDist)
End If

Next
'Print("Adding index value " & j & " to group " & groupNum & ".")
If isStillMoving = False Then
End If
Next

If isStillMoving = False Then
print("***************all groups setled exiting loop********************")
print("***************total loop count is " & k & "************************")
End If

'a hard limit on the number of loops to execute....
If k > 1000 Then
isStillMoving = False
End If

k = k + 1
Loop
goupings.Clear
newCentroids.Clear
oldCentroids.Clear
pathList.Clear
Next
G = groupingsTree
End Sub

'function taken from http://www.freevbcode.com/ShowCode.Asp?ID=4451 (and slightly modified)
Public Function RandomNumber(ByVal r As Random, ByVal MaxNumber As Integer, Optional ByVal MinNumber As Integer = 0) As Integer

'if passed incorrect arguments, swap them
'can also throw exception or return 0

If MinNumber > MaxNumber Then
Dim t As Integer = MinNumber
MinNumber = MaxNumber
MaxNumber = t
End If

Return r.Next(MinNumber, MaxNumber)

End Function

... So any thoughts?

Views: 8928

Attachments:

### Replies to This Discussion

This paper looks like it could be useful research.  More to digest...

http://faculty.cs.tamu.edu/schaefer/research/equivalence.pdf

Hi taz, i was wondering if you already did a script component with this new k-means++ algorithm?

"Edge sharing graph" and "ﬂexibility of a KT-surface" as keywords to understand the computational problem behind. The similarity search problem is the other one, but less complex perhaps.

As far as Daniel Davis said (and I understood), Carlos Slim Museum doesn't deal with the "edge sharing problem" at all (or at least not trying to solve it completely), it just use the building process tolerance to make a functional system.

Nice reading and awesome algorithms to understand and try to code. This brings back to my mind the Grasshopper Hackaton Event proposing the most interesting problems seen in the forum as candidates :P

There's also some nice recent work on this from Leif Kobbelt's group at Aachen:
http://www.rwth-graphics.de/software/zometool

Hi all,

I have been attempting something similar to what Taz has done.

I would be interested in what Taz and others have achieved in terms of gap tolerances between each panel as well as the approaches to minimising this gap. What is achievable???

I am wondering if I am currently too ambitious with my tolerances - I am currently using 60 unique tri panels from 1500 on doubly curved surfaces and getting a maximum vertex deviation of 80mm which is approx. 2.5% of the panel edge length, the average deviation is much lower at about 10mm but it is the maximum which obviously defines whether the solution is viable in terms of construction.

From my attempts at this problem I find that a loop of simply re-clustering with relaxation is not sufficient for my purposes. My approach has been to introduce the "worst" panels as seeds to clustering at each iteration.

Looking forward to your responses!

Cheers,

Sam

Very good and useful job.

I put here an implementation for GH1.0

https://discourse.mcneel.com/t/clustering-points/76907/5

How can I extract the points from the final centroids? Sorry but I do not dominate "VB"

• View All

## Videos

• ### Jamparc's PQ Mesh for Convex Sections

Added by Jaeman PARK

• ### Rhino 3D Cycles Demo: background image integration

Added by Sim Pern Chong