algorithmic modeling for Rhino
I did the simple part of making the point locations editable on the simple backdrop example, but I'm not sure how to make the surface work in any size box rather than the 0 to 10 box the example starts with.
Can anyone help me out on that?
Hey - sorry - I totally missed your post somehow.
This is the key part to get it to work with differently sized ranges of numbers.
The remap components scale the numbers from the 0-10 range to the default range of the T parameter, which is 0-1. That is not strictly necessary for the network to learn, but it makes the training WAY faster.
The process of "scaling" the numbers to conform to the 0-1 range is called normalization, and it's one of the key speed-up methods in machine learning. Of course you have to know up front the minimal and maximal values your data will fit into.
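To show what the remap step does for a box of any size, here is a minimal Python sketch of the same linear remapping, outside of Grasshopper. The names remap, normalize_point, denormalize_point, box_min and box_max are made up for this illustration and are not part of Owl; the only assumption is that you know the bounding box of your points up front.

```python
def remap(value, src_min, src_max, dst_min=0.0, dst_max=1.0):
    """Linearly map value from [src_min, src_max] to [dst_min, dst_max]."""
    if src_max == src_min:
        raise ValueError("source range must have a non-zero width")
    t = (value - src_min) / (src_max - src_min)
    return dst_min + t * (dst_max - dst_min)


def normalize_point(point, box_min, box_max):
    """Scale each coordinate of a point into 0-1 using its own axis range."""
    return [remap(c, lo, hi) for c, lo, hi in zip(point, box_min, box_max)]


def denormalize_point(point01, box_min, box_max):
    """Map a 0-1 point (e.g. a network output) back into the original box."""
    return [remap(c, 0.0, 1.0, lo, hi)
            for c, lo, hi in zip(point01, box_min, box_max)]


# Hypothetical box of 10 x 20 x 5 units instead of the 0-10 cube.
box_min = [0.0, 0.0, 0.0]
box_max = [10.0, 20.0, 5.0]
normalized = normalize_point([5.0, 10.0, 0.0], box_min, box_max)
restored = denormalize_point(normalized, box_min, box_max)
```

The same box limits are used in both directions: once to squeeze the training data into 0-1, and once to stretch the network output back to real coordinates.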
While remapping/normalization of the training input data is not necessary, you have to be aware that the output data has to be in this range. This simply comes from the fact that the neurons in each layer produce their output value through the activation function, and in the case of the sigmoid function that value is always between 0 and 1.
In the picture below you have a plot of the standard sigmoid function (this one is called the logistic function), and you can see how the input X can be of any value, while the output Y is limited to 0-1.
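For reference, the logistic function is y = 1 / (1 + e^(-x)). A small Python sketch (the function name sigmoid is just for this illustration, it is not how Owl or Accord expose it) makes it easy to check that any input lands in the 0-1 range:

```python
import math


def sigmoid(x):
    """Logistic function, split into two branches to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Inputs of any size, positive or negative.
samples = [-1000.0, -10.0, -1.0, 0.0, 1.0, 10.0, 1000.0]
outputs = [sigmoid(x) for x in samples]
```

Every value in outputs stays between 0 and 1, and sigmoid(0.0) is exactly 0.5, so a training target like 7.3 can never be reached by such a neuron.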
So to sum up:
Inputs should be normalized to 0-1, in the example those values initially are in the 0-10 range.
Training outputs HAVE to be in this range simply because the function of a neuron is not able to reach any value outside of that range.
Hi Mateusz, great plugin, can't wait to see how the community will develop it.
I've been playing with the examples you provided and wanted to add an example for MNIST learning.
Unfortunately the BackProp is returning a single value tensor, what am I doing wrong?
Made some adjustments to your file, also added a few comments.
The main problem is that you're trying to teach a network with one output to recognize all of the ten cases (0-9), which is simply not possible. Instead what you have to do is to construct a network with 10 outputs, where each one of them indicates the probability of the input being that digit. The keyword here is "classification", because you want to get a "discrete" output value. With continuous value estimation (for instance, in the case of MNIST that could be an output saying what percentage of pixels is white), the network performs regression. Again - your case is a classification case, and you tried to solve it with a regression strategy :)
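To make the "10 outputs" idea concrete, here is a rough Python sketch of how the training targets and the final answer are usually encoded for this kind of classification. The helper names one_hot and predicted_digit are hypothetical, they are not Owl components, and the example output values are invented for illustration.

```python
def one_hot(digit, num_classes=10):
    """Training target: 1.0 at the index of the digit, 0.0 everywhere else."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in the range 0..num_classes-1")
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]


def predicted_digit(outputs):
    """Read the answer back: index of the output neuron with the highest value."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


target_for_3 = one_hot(3)

# Invented example of what a trained network might return for an image of a "2".
network_output = [0.02, 0.05, 0.81, 0.10, 0.01, 0.03, 0.02, 0.04, 0.07, 0.01]
answer = predicted_digit(network_output)
```

All target values are 0 or 1, so they also fit the 0-1 range that sigmoid outputs are limited to.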
Maybe you've also seen the Tensors with the "Tensor ()" description. That is a bug in the way the Tensors are loaded from the file, not really crucial here, but I made some adjustments to fix it as well.
Last but not least: the Accord framework (that's the backbone of the network/backprop components in Owl) is not really the fastest one out there, and even a "simple" dataset like MNIST will take a couple of minutes to train. Please use Anemone to make your life easier, and keep the network in the loop... that way you can also see how it gets better over time - put the Compute in the loop to see the values changing with each iteration.
Once the setup is good, it will still take some time to adjust the parameters for training. Those are:
This is where you reach the point known as knob tuning, the very essence of machine learning ;)
(It can be really frustrating, but it gets better over time when you develop some kind of intuition, there are no ready made solutions for that part)
I've fixed the "Tensor ()" bug, it will be fine in the next release. Currently it's not actually a problem, just a small inconvenience. Any definition you make now will work the same after the fix.
Thanks for the quick reply Mateusz,
You are right. For some reason I thought I could solve the network with one node at the end, giving a 0 to 1.0 value to act as a classifier.
I've taken on board your suggestions and tried again, this time with Anemone to manage the cycles, and lowered the layers to 10-10, which seems to work fine.
I think the setup is correct, however it is not giving me the expected results. I've compared it to a Crow Backprop (attachment) on the top, which classifies correctly with the same input data @ 2000-3000 cycles in about 5 min.
Is this a difference between Accord and NeuronDotNet, or in the way the backprop component computes?
Hey! Super basic question, but what is the goal of the 6D Kmeans example? It appears to retain a cube shape, and fill it with boxes, but I was hoping to just get a better overview, I am very new to this subject!
I find this video quite good for explanation on how kmeans works.
4/5/6...N D means you can have points with as many coordinates as you need. Instead of RGB you can use the CMYK model (4D points) and achieve the same as in the video, just with 4 coordinates.
As an architecture student, I would love to know how I could possibly use this plugin. Are there any problems within architectural modelling that are better solved with machine learning? Of course, I'm not asking for a full script, just for ideas on how I can use this interesting plugin to my advantage.