algorithmic modeling for Rhino
Hi all,
Here is an implementation of K-Means for grasshopper I wrote for an aspect of my thesis.
What does it do? K-Means Algorthims are a automated way of grouping data. For now my implementation only works on points but other types should work in theory. The VB script is uses alot of loops and i am sure that is far from optimzed.
Check out the K-Means Page on wikipedia
And the general page on data clustering...
The code goes like this:
Private Sub RunScript(ByVal P As DataTree(Of Point3d), ByVal N As DataTree(Of Integer), ByVal D As Double, ByRef G As Object)
'your code goes here…
Dim t As Integer = 0
Dim i As Integer = 0
Dim j As Integer = 0
Dim k As Integer = 0
Dim numGroups As Integer
Dim groupNum As Integer
Dim X As Double
Dim Y As Double
Dim Z As Double
Dim minDist As Double
Dim dist As Double
'Dim cluster As Integer
'initialize random number generator
Dim r As New Random(System.DateTime.Now.Millisecond)
Dim goupings As New List(Of List(Of Integer))
Dim oldCentroids As New List(Of Point3d)
Dim newCentroids As New List(Of Point3d)
Dim pathList As New List(Of GH_Path)
Dim newCentroid As Point3d
Dim pt As Point3d
Dim isStillMoving As Boolean
Dim groupingsTree As New DataTree(Of Point3d)
Dim inputpts As New List(Of Point3d)
For t = 0 To P.BranchCount - 1
'this mess dose data matching for inputs P and N
'this if is ture is the input branches are equal
If N.BranchCount >= P.BranchCount Then
'check to see if, in the case the branches are equal, that there is only one item in each branch of N
If N.Branch(t).Count = 1 Then
numGroups = N.Branch(t)(0)
'I don't know what to do if the above statment isn't true so let's exit the program
Else
Print("Error: data missmatch between inputs P and N, Please reformat inputs")
Exit Sub
End If
'here we check if it is a simple list...
Else If N.Branch(0).Count = P.BranchCount
numGroups = N.Branch(0)(t)
'here we check if it is a simple list but is shorter or longer than the tree P
Else If N.Branch(0).Count < P.BranchCount Then
If t <= N.Branch(0).Count - 1 Then
numGroups = N.Branch(0)(t)
Else
numGroups = N.Branch(0)(N.Branch(0).Count - 1)
End If
'if all else fails exit the program!
Else
Print("Error: data missmatch between inputs P and N, Please reformat inputs")
Exit Sub
End If
inputpts = P.Branch(t)
print("--------Accessing Branch " & t & "--------")
'***************** Initialize K-Means Algorithm ****************************
print("--------Initializing " & numGroups & " groupings-------")
For i = 0 To numGroups - 1
'Dim pt As Point3d
'Dim group As New List(Of Integer)
Dim tempPath As New GH_Path
goupings.Add(New List(Of Integer))
newCentroids.add(pt)
oldCentroids.Add(pt)
tempPath.FromString(t & ";" & i)
pathList.Add(tempPath)
print("creating grouping " & i)
Next
'radomly assign points from the input list (intialze the k-mean groups)
print("--------Initializing group assinments (random)-------")
For j = 0 To inputpts.Count - 1
groupNum = RandomNumber(r, numGroups - 1)
goupings(groupNum).Add(j)
print("Adding index value " & goupings(groupNum)(goupings(groupNum).Count - 1) & " to group " & groupNum & ".")
Next
'***************** Do K-Means Algorithm ************************************
print("--------entering main k-means loop--------")
isStillMoving = True
Do While isStillMoving
isStillMoving = False
'find new centroids
print(".....finding new centroids")
For i = 0 To numGroups - 1
oldCentroids(i) = newCentroids(i)
For j = 0 To goupings(i).Count - 1
pt = inputpts(goupings(i)(j))
X = X + pt.X
Y = Y + pt.Y
Z = Z + pt.Z
Next
X = X / (j + 1)
Y = Y / (j + 1)
Z = Z / (j + 1)
newCentroid.X = X
newCentroid.Y = Y
newCentroid.Z = Z
newCentroids(i) = newCentroid
print("new centroid for group " & i & " is " & newCentroids(i).ToString)
goupings(i).Clear
dist = newCentroid.DistanceTo(oldCentroids(i))
'print("dist = " & dist)
If dist > D Then
print("need to do more work")
isStillMoving = True
End If
Next
'create new groupings
print(".....creating new groupings")
For j = 0 To inputpts.Count - 1
pt = inputpts(j)
minDist = pt.DistanceTo(newCentroids(0))
For i = 0 To numGroups - 1
dist = pt.DistanceTo(newCentroids(i))
If minDist >= dist Then
groupNum = i
minDist = dist
'print(minDist)
End If
Next
'Print("Adding index value " & j & " to group " & groupNum & ".")
goupings(groupNum).Add(j)
If isStillMoving = False Then
groupingsTree.Add(pt, pathList(groupNum))
End If
Next
If isStillMoving = False Then
print("***************all groups setled exiting loop********************")
print("***************total loop count is " & k & "************************")
End If
'a hard limit on the number of loops to execute....
If k > 1000 Then
isStillMoving = False
End If
k = k + 1
Loop
goupings.Clear
newCentroids.Clear
oldCentroids.Clear
pathList.Clear
Next
G = groupingsTree
End Sub
'<Custom additional code>
'function taken from http://www.freevbcode.com/ShowCode.Asp?ID=4451 (and slightly modified)
Public Function RandomNumber(ByVal r As Random, ByVal MaxNumber As Integer, Optional ByVal MinNumber As Integer = 0) As Integer
'if passed incorrect arguments, swap them
'can also throw exception or return 0
If MinNumber > MaxNumber Then
Dim t As Integer = MinNumber
MinNumber = MaxNumber
MaxNumber = t
End If
Return r.Next(MinNumber, MaxNumber)
End Function
... So any thoughts?
Tags:
Very interesting work. Thank you for sharing.
I was wondering if this might be something that could work with a feature I was thinking of developing in Kangaroo.
Instead of trying to tile a freeform surface with identical panels (which in general is not possible), I'm looking at whether one could work with a small set of tile shapes (say 6 triangles of different proportions), and have all the panels (starting off all being different) 'snap' their proportions to the closest one in that set.
Maybe this technique could be useful in determining which 6 triangles will best match the existing variations.
Yes, that's pretty much what I was thinking of.
So at each iteration every triangle would try and adjust itself to become a little closer to one of the 7 target triangles, they would all interact and pull on each other, and redistribute slightly across the surface.
Making the new groups (and taking the average of each group as the target for the members of that group) might only need to be something that happens every 100 iterations or so, depending on how heavy it is to calculate.
Does this sound like something where this would be applicable ?
Yea that is how projects like the Carlos Slim Museum have been rationalised. Basically after panelising the surface every panel was unique, so they used k-means clustering to identify families of panels. In the end they had something like 21 panels, although there was a control gap between the panels so they didn't have to touch edge to edge.
The pseudo-code for grouping panels into n groups is:
1. Randomly choose n panels, these will be the representative of each group.
2. For the rest of the panels, assign them to the group where the representative most closely matches the panel (this algorithm can be tricky, but something like difference between points sometimes works)
3. For each group find the most representative panel (the average panel of that group)
4. Go back to 2 and repeat until the representative of the groups is no longer changing.
It is certainly a practical way to make families of panels after the surface has been panelised. I have not seen anyone do it while relaxation is happening, but if there is one person to make it happen, Daniel Piker is it!
Thanks Daniel for the explanation, this is very helpful.
Were you involved in that project ? or is a description of this process published somewhere ?
Hi Daniel,
I don't think it is published, but Alex Pena was involved in the project when he was working for Gehry Technology. The clustering was done in Excel and it took hours to run, but I think if you did it properly (and didn't run it on thousands of panels) it could be quite efficient.
Daniel
Daniel,
I've gotten pretty far with this using Kangaroo as you were proposing! Problem is I don't have an adequate "spatial" description of triangle uniqueness to get effective k means clustering output (see column 2, row 2).
I have been working with various combinations of triangle sides and angle combinations. It's a bit tricky but at least it's pretty easy to see what's going on. Still pondering...
Hey Taz,
Very cool to see! And it looks like you are getting some reasonable results.
Yea the spatial description is a tricky one - k-means is a classic garbage-in, garbage-out algorithm. The description will depend on what you are trying to achieve, but for most manufacturing purposes you need to know the maximum distance a panel deviates from its cluster's prototype panel. Assuming you have access to the k-means algorithm you could measure this distance as:
1. Align centroid of the candidate panel with the centroid of the prototype panel.
2. Rotate the candidate until a vertex on this panel is in line with a vertex on the prototype.
3. Measure the distance between vertexes on the panel - the highest being the distance for this orientation.
4. Rotate the candidate panel into the two other possible positions - the lowest orientation distance being the overall distance the candidate panel deviates from the prototype.
5. Repeat with all the other panels.
Daniel
Hi Daniel D,
First off, thanks for posting your k-means definition. It's been a great help!
I was actually vaguely following your pseudo-code to work out a semi-manual GH/Kangaroo routine that I was hoping would lead to convergence with a discrete number of unique panels. I think, in addition to your advice above, I need to rethink my approach.
This is what I have implemented:
1. Group all triangles into n groups.
2. Derive representative (average) triangle for each group.
3. Repopulate surface with corresponding avg triangle
4. Formfind with Kangaroo to regenerate base mesh
5. Repeat until triangle vertices are within tolerance.
I'll post some action photos of problem areas, but basically when I don't get a good grouping the avg tri doesn't fit very well!
Hello Taz, would you mind to share the definition for this work?
© 2020 Created by Scott Davidson. Powered by