Max Marschall
Hi John,
Can I use the Kohonen Map component WITHOUT losing diversity?
I'm trying to organize a multi-dimensional phase space of a stadium design and create a fitness landscape in which similar variations are more or less close to one another. I tried with 3 parameters and the results looked promising:
However, when I did a test with 4 parameters, it seemed like higher values on one parameter automatically led to higher values on another. It looks like a lot of variation is missing, even though I remapped the values to the domains I wanted for my inputs.
Is what I want to do even possible? And is the SOM component the way to go? Would be great to get some advice!
Cheers,
Max
Feb 18, 2016
John Harding
Hi Max,
It seems a bit strange that it's not representing the domain very well, but I suspect you might need a higher learning rate (winLearn and learn).
See the attached 4-dimensional example, which seems to handle 4 inputs quite well with 0.95 and 0.9x learning rates respectively. Perhaps you can compare this to your example.
Graph:
Output:
Thanks for trying out the component and let me know how you get on.
Using SOMs to visualise the design space for high dimensional models has a lot of potential.
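For anyone following along, the kind of update being described here — a winner rate and a neighbour rate, with everything cooling over time — can be sketched in Python roughly like this. This is an illustrative reimplementation under assumed defaults, not the component's actual source; the parameter names simply echo the component's winLearn and learn inputs:

```python
import numpy as np

def train_som(inputs, rows=8, cols=8, win_learn=0.95, learn=0.9,
              radius=3.0, decay=0.99, iterations=30, seed=0):
    """Minimal SOM sketch: the winning node moves by win_learn, neighbours
    by learn scaled with a Gaussian of their grid distance; the rates and
    the radius decay each iteration ('cooling')."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, inputs.shape[1]))
    # (row, col) coordinates of every node, for grid distances
    grid = np.dstack(np.meshgrid(np.arange(rows), np.arange(cols),
                                 indexing="ij")).astype(float)
    for _ in range(iterations):
        for x in inputs:
            # best-matching unit: node whose weight vector is closest to x
            d = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # Euclidean distance of every node from the BMU on the grid
            gdist = np.linalg.norm(grid - np.array(bmu), axis=2)
            influence = np.exp(-(gdist ** 2) / (2 * radius ** 2))
            rate = np.where(gdist == 0, win_learn, learn)
            weights += (rate * influence)[..., None] * (x - weights)
        win_learn *= decay   # cool the map down
        learn *= decay
        radius *= decay
    return weights
```

Raising win_learn and learn here has the same qualitative effect being suggested above: the map covers the input domain more fully before it cools.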
Best wishes,
John.
Feb 19, 2016
Max Marschall
Hi again,
That definitely cleared things up. I guess I was using the Kohonen Map for the wrong purpose: Imagine taking your original, random map (t=0), and simply reorganizing all of those different little boxes (without changing their appearance) to resemble a distribution looking a bit more like t=100. I was trying to visualize a parameter space in a way that doesn't have a fractal look, the purpose being to better identify the "neighbors" of a variation.
That would probably mean defining as many inputs as there are points in the map. I realize of course the difficulty of such a task, but could you please write something about the limits?
Here is a little test I just did. Defining too many inputs causes the algorithm to fail. If I turn down the learning rates it seems to work better; however, at a certain point it can't seem to include all the inputs.
In any case a great and useful tool, thanks for sharing!
Cheers,
Max
Feb 19, 2016
John Harding
Hi Max,
Setting the size of the map relative to the number of inputs seems to be a bit of a dark art. In most examples you'll see around 5 to 10 times as many map nodes as there are inputs, but this also depends on the learning rates.
In this example a map has been trained with 200 inputs on an 8x8 map. The learning rates appear to be quite low to get this to work. So in theory, the approach you have adopted should be achievable, but you need more control over the learning decay rates.
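That 5–10x rule of thumb can be turned into a quick sizing helper. This is purely the heuristic above with an assumed default of 7 nodes per input, not an official guideline:

```python
import math

def suggest_map_size(n_inputs, nodes_per_input=7):
    """Pick a roughly square SOM grid with ~5-10x as many nodes as
    inputs (a common rule of thumb, not a hard requirement)."""
    side = max(2, round(math.sqrt(n_inputs * nodes_per_input)))
    return side, side
```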
So the issue at the moment lies with how much I make explicit; for instance, the following parameters are not exposed but probably should be:
- Initial Euclidean radius of influence when a winning node is 'fired'
- Decay rate of this radius
- Decay rate of the 'winLearn' multiplier
- Decay rate of the 'learn' multiplier
The decay rates essentially cool the map down to some kind of equilibrium. Obviously these really need to be parameters you can modify... particularly the last two. I'll get onto this now.
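As a rough illustration of what such a cooling schedule looks like — an exponential decay, though the component's internal formula may differ:

```python
import math

def decayed(initial, decay_constant, t):
    """Exponential decay of a learning rate or radius over iterations.
    Illustrative only; the component's internal schedule may differ."""
    return initial * math.exp(-decay_constant * t)

# e.g. a winLearn of 0.95 with decay constant 0.05 cools from 0.95
# at t=0 to roughly 0.08 by t=50.
```

A small decay constant keeps the map plastic for longer, which is what a larger map with many inputs tends to need.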
By the way, a very good tutorial is located here that goes through the process step by step.
John.
Feb 22, 2016
John Harding
Learning exponential decay constants now exposed for tweaking:
Mar 4, 2016
djordje
Wonderful work John!!
Thank you for sharing it.
Mar 4, 2016
Max Marschall
Hi John,
I've been trying to push the number of inputs, so far with limited success...
I'm thinking that in order to achieve that, I might need to control the
- initial neighborhood radius and
- number of iterations (what is the current criterion for convergence?)
Does that make sense? Would it be possible to make those explicit?
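For what it's worth, a typical stopping rule in SOM implementations — a general heuristic, not necessarily what the component actually uses — is to stop once the weights barely move between iterations:

```python
import numpy as np

def has_converged(prev_weights, weights, tol=1e-4):
    """Stop when the mean per-node weight movement in one iteration
    falls below tol. A common heuristic; the component's internal
    criterion may differ."""
    return bool(np.mean(np.linalg.norm(weights - prev_weights, axis=-1)) < tol)
```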
Cheers,
Max
Mar 15, 2016
John Harding
Random seed input now added to help tune your maps better. Please note there was a small bug when using the decay rates which has now been corrected. Thank you Max for bringing it to my attention.
Both example files have been updated to suit this new release (0.1.3). I'll put the source on git and do some proper version control in due course.
Mar 18, 2016